Common Conventions and API Elements of scikit-learn
It’s hard to believe that the scikit-learn project started back in 2007 and saw its first public release in 2010. After so many years, the impact this Python library has had on data science and machine learning (ML) is hard to deny. For many of us, scikit-learn is one of the first libraries we hear about when we begin our journey in ML programming and engineering, and that hasn’t changed: it remains one of the most widely used libraries in research, academia, and production applications at scale in the business world.
This chapter will cover the standard conventions and core API elements of scikit-learn, including the design principles behind estimators, transformers, and pipelines, as well as common methods such as fit(), predict(), and transform(). The exercises provided throughout the rest of this book will involve using these conventions to build and evaluate models, all while focusing on understanding the consistent structure of scikit-learn’s API to enhance usability and flexibility in ML projects.
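As a rough preview of these conventions, the sketch below shows a transformer, an estimator, and a pipeline side by side. The dataset (iris) and the specific classes (StandardScaler, LogisticRegression) are assumptions chosen purely for illustration; the recipes that follow may use different ones.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative data only: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Transformer: fit() learns per-feature statistics from the training data,
# and transform() applies them to any data with the same columns.
scaler = StandardScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)

# Estimator acting as a predictor: fit() learns from features and labels,
# and predict() returns labels for new samples.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train_scaled, y_train)
predictions = clf.predict(scaler.transform(X_test))

# Pipeline: chains the same two steps behind a single fit()/predict() pair.
pipe = Pipeline(
    [
        ("scale", StandardScaler()),
        ("model", LogisticRegression(max_iter=1000)),
    ]
)
pipe.fit(X_train, y_train)
pipe_predictions = pipe.predict(X_test)
```

The manual version and the pipeline run the same steps in the same order, so they should yield the same predictions. The pipeline simply removes the need to remember to call transform() on the test data yourself.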
In this chapter, we’re going to cover the following recipes:
- Introduction to scikit-learn’s design philosophy
- Understanding estimators
- Transformers and the transform() method
- Handling custom estimators and transformers
- Pipelines and workflow automation
- Common attributes and methods
- Hyperparameter tuning with search methods
- Working with metadata: Tags and more
- Best practices for API usage