Scaling techniques
In real-world datasets, features often have vastly different scales. For instance, a feature representing age may range from 0 to 100, while another feature representing income could range from 0 to 100,000. Many ML algorithms, such as KNN and gradient descent-based methods (e.g., linear regression), are sensitive to these differences in scale, so scaling helps ensure that no single feature dominates the learning process. This recipe covers the three most commonly used scaling techniques in ML.
The following are the key concepts. Note that these two terms are sometimes used interchangeably, but they are not the same and should not be implemented as such! A short sketch contrasting them follows the list.
- Standardization (Z-score transformation) changes the data to have a mean of 0 and a standard deviation of 1
- Normalization changes the range of the data distribution so values fall between 0 and 1
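A minimal sketch of the difference, assuming scikit-learn's `StandardScaler` and `MinMaxScaler` (the toy DataFrame and its column names are illustrative, not the recipe's dataset):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Hypothetical toy DataFrame with features on very different scales
df = pd.DataFrame({
    "age": [22, 35, 58, 71],
    "income": [18_000, 54_000, 92_000, 120_000],
})

# Standardization (Z-score): each column gets mean 0, standard deviation 1
standardized = pd.DataFrame(
    StandardScaler().fit_transform(df), columns=df.columns
)

# Normalization (min-max): each column is rescaled to the [0, 1] range
normalized = pd.DataFrame(
    MinMaxScaler().fit_transform(df), columns=df.columns
)

print(standardized.round(2))
print(normalized.round(2))
```

Printing both outputs makes the distinction concrete: the standardized columns contain negative and positive values centered on 0, while the normalized columns are bounded between 0 and 1.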
Getting ready
We will use the previously defined iterative_imputed_df DataFrame...