Common attributes and methods
As model complexity grows, it becomes increasingly difficult to inspect and understand a model’s inner workings (especially with artificial neural networks). Fortunately, scikit-learn models share several key attributes and methods that provide insight into what a model has learned from data. For instance, the coef_ and intercept_ attributes, available on linear models, store the learned coefficients and intercept, which help with interpreting model behavior.
Similarly, methods such as score() allow users to evaluate model performance, typically returning a default metric such as accuracy for classifiers or R² for regressors. These common features ensure consistency across different models and simplify model analysis and interpretation:
from sklearn.linear_model import LinearRegression
import numpy as np
# Example data
X = np.array([[1], [2], [3], [4], [5]]) # Feature matrix
y = np.array([1, 2, 3, 3.5, 5]) # Target values
# Create and fit the model
model = LinearRegression()
model.fit(X, y)
# Access coefficients (slope of the linear model)
print("Coefficients:", model.coef_)
# Access y-intercept
print("Intercept:", model.intercept_)
# Use score() method to evaluate the model (R-squared value)
print("Model R-squared:", model.score(X, y))
# Output:
Coefficients: [0.95]
Intercept: 0.04999999999999938
Model R-squared: 0.9809782608695652
Note
R-squared has been criticized as potentially misleading because its value depends on how noisy your data is, so a high or low value does not by itself tell you whether a model is good. When measured on the training data, it also never decreases as more variables are added, even uninformative ones. For this reason, the adjusted R-squared is often used instead: it accounts for the number of variables in your dataset, applying a penalty when many variables are included.
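To make the note concrete, here is a minimal sketch of the adjusted R-squared calculation. It uses the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. The helper name adjusted_r2 is our own for illustration and is not part of scikit-learn, and the extra noise column is made-up data added only to show the effect:

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def adjusted_r2(r2, n_samples, n_features):
    """Hypothetical helper (not a scikit-learn function): penalize R-squared
    for the number of features used by the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Same example data as above
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 3.5, 5])

# Plain R-squared and its adjusted counterpart for the one-feature model
model = LinearRegression().fit(X, y)
r2 = model.score(X, y)
adj_r2 = adjusted_r2(r2, n_samples=X.shape[0], n_features=X.shape[1])
print("R-squared:", r2)
print("Adjusted R-squared:", adj_r2)

# Add a column of pure random noise (an assumption made for illustration)
rng = np.random.default_rng(0)
X_noisy = np.column_stack([X, rng.normal(size=X.shape[0])])

noisy_model = LinearRegression().fit(X_noisy, y)
r2_noisy = noisy_model.score(X_noisy, y)
adj_r2_noisy = adjusted_r2(r2_noisy, X_noisy.shape[0], X_noisy.shape[1])

# Training R-squared does not go down, even though the new column is noise
print("R-squared with noise feature:", r2_noisy)
print("Adjusted R-squared with noise feature:", adj_r2_noisy)
```

For the one-feature model, the adjusted value (about 0.9746) is slightly lower than the plain R-squared (about 0.9810), reflecting the penalty for using a feature with only five samples.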
We will look more closely at these shared attributes and methods across various scikit-learn models throughout this book, with practical examples of how to access and interpret values such as coef_ and how to use methods such as score() to quickly evaluate performance in real-world scenarios.
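As a quick preview of that consistency, the following sketch shows the same attributes and the score() method on a classifier. The tiny dataset is invented for illustration, and LogisticRegression is simply one convenient linear classifier to demonstrate with. Here score() returns accuracy, which we can confirm by comparing it against accuracy_score():

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy binary classification data (invented for illustration)
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

# Create and fit the classifier with the same fit() call as before
clf = LogisticRegression()
clf.fit(X, y)

# Linear classifiers expose coef_ and intercept_ too; for a binary
# problem, coef_ has shape (1, n_features)
print("Coefficient shape:", clf.coef_.shape)
print("Intercept shape:", clf.intercept_.shape)

# For classifiers, score() returns mean accuracy on the given data
acc_from_score = clf.score(X, y)
acc_from_metric = accuracy_score(y, clf.predict(X))
print("score():", acc_from_score)
print("accuracy_score():", acc_from_metric)
```

Both printed accuracy values should match, since score() on a classifier computes the same mean accuracy.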