Commit 384ff19

Merge pull request scikit-learn#2862 from MechCoder/LogCV
[MRG] Logistic Regression CV
2 parents: ebc3f0d + 2374a09

File tree

11 files changed: +1459 / -83 lines


doc/modules/classes.rst

Lines changed: 1 addition & 0 deletions
@@ -639,6 +639,7 @@ From text
    linear_model.LassoLarsIC
    linear_model.LinearRegression
    linear_model.LogisticRegression
+   linear_model.LogisticRegressionCV
    linear_model.MultiTaskLasso
    linear_model.MultiTaskElasticNet
    linear_model.MultiTaskLassoCV

doc/modules/linear_model.rst

Lines changed: 17 additions & 4 deletions
@@ -649,10 +649,10 @@ rather than regression. Logistic regression is also known in the literature as
 logit regression, maximum-entropy classification (MaxEnt)
 or the log-linear classifier. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a `logistic function <http://en.wikipedia.org/wiki/Logistic_function>`_.
 
-The implementation of logistic regression in scikit-learn can be accessed from
-class :class:`LogisticRegression`. This
+The implementation of logistic regression in scikit-learn can be accessed from
+class :class:`LogisticRegression`. This
 implementation can fit a multiclass (one-vs-rest) logistic regression with optional
-L2 or L1 regularization.
+L2 or L1 regularization.
 
 As an optimization problem, binary class L2 penalized logistic regression minimizes
 the following cost function:
@@ -663,7 +663,14 @@ Similarly, L1 regularized logistic regression solves the following optimization
 
 .. math:: \underset{w, c}{min\,} \|w\|_1 + C \sum_{i=1}^n \log(\exp(- y_i (X_i^T w + c)) + 1) .
 
-L1 penalization yields sparse predicting weights.
+The solvers implemented in the class :class:`LogisticRegression`
+are "liblinear" (which is a wrapper around the C++ library,
+LIBLINEAR), "newton-cg" and "lbfgs".
+
+The lbfgs and newton-cg solvers only support L2 penalization and are found
+to converge faster for some high dimensional data. L1 penalization yields
+sparse predicting weights.
+
 For L1 penalization :func:`sklearn.svm.l1_min_c` allows to calculate
 the lower bound for C in order to get a non "null" (all feature weights to
 zero) model.
@@ -685,6 +692,12 @@ which is shipped with scikit-learn.
 thus be used to perform feature selection, as detailed in
 :ref:`l1_feature_selection`.
 
+:class:`LogisticRegressionCV` implements Logistic Regression with
+builtin cross-validation to find out the optimal C parameter. In
+general the "newton-cg" and "lbfgs" solvers are found to be faster
+due to warm-starting. For the multiclass case, One-vs-All is used
+and an optimal C is obtained for each class.
+
 
 Stochastic Gradient Descent - SGD
 =================================

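The :class:`LogisticRegressionCV` behaviour described in the docs above can be sketched as follows. This is an illustrative example, not part of the commit; it assumes a scikit-learn build that includes the classes merged here, and the dataset and parameter values (`Cs=10`, `cv=3`) are arbitrary choices for demonstration.

```python
# Sketch of the API documented in this PR (assumes scikit-learn with
# LogisticRegressionCV available); parameter values are illustrative.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegressionCV

X, y = load_iris(return_X_y=True)

# LogisticRegressionCV searches a grid of Cs values by cross-validation
# and, for multiclass input, obtains an optimal C for each class.
cv_clf = LogisticRegressionCV(Cs=10, cv=3, solver='lbfgs', max_iter=200)
cv_clf.fit(X, y)

# cv_clf.C_ holds the chosen regularization strengths, one per class.
print(cv_clf.C_)
```

The warm-starting mentioned in the docs is what makes newton-cg and lbfgs attractive here: the coefficients fitted for one C value seed the optimization for the next, so sweeping the `Cs` grid costs much less than fitting each value from scratch.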
doc/tutorial/statistical_inference/supervised_learning.rst

Lines changed: 2 additions & 2 deletions
@@ -370,8 +370,8 @@ function or **logistic** function:
     >>> logistic = linear_model.LogisticRegression(C=1e5)
     >>> logistic.fit(iris_X_train, iris_y_train)
     LogisticRegression(C=100000.0, class_weight=None, dual=False,
-          fit_intercept=True, intercept_scaling=1, penalty='l2',
-          random_state=None, tol=0.0001)
+          fit_intercept=True, intercept_scaling=1, max_iter=100,
+          penalty='l2', random_state=None, solver='liblinear', tol=0.0001)
 
 This is known as :class:`LogisticRegression`.
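The updated doctest repr above exposes the two estimator parameters this PR introduces, `solver` and `max_iter`. A minimal sketch of exercising them, assuming a scikit-learn build with the newton-cg and lbfgs solvers merged (the dataset and `max_iter` value are illustrative):

```python
# Compare the three solvers on the same L2-penalized problem; only
# liblinear additionally supports the L1 penalty.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

scores = {}
for solver in ('liblinear', 'newton-cg', 'lbfgs'):
    clf = LogisticRegression(penalty='l2', solver=solver, max_iter=200)
    clf.fit(X, y)
    scores[solver] = clf.score(X, y)

print(scores)
```

All three should reach similar training accuracy here; the docs' claim is about convergence speed on some high-dimensional data, not accuracy.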

doc/whats_new.rst

Lines changed: 6 additions & 0 deletions
@@ -21,6 +21,9 @@ New features
 - Add the :func:`metrics.label_ranking_average_precision_score` metrics. By
   `Arnaud Joly`_.
 
+- Added :class:`linear_model.LogisticRegressionCV`. By
+  `Manoj Kumar`_, `Fabian Pedregosa`_, `Gael Varoquaux`_
+  and `Alexandre Gramfort`_.
 
 Enhancements
 ............
@@ -29,6 +32,9 @@ Enhancements
 - Add support for sample weights in scorer objects. Metrics with sample
   weight support will automatically benefit from it.
 
+- Added ``newton-cg`` and ``lbfgs`` solver support in
+  :class:`linear_model.LogisticRegression`. By `Manoj Kumar`_.
+
 
 Documentation improvements
 ..........................

sklearn/linear_model/__init__.py

Lines changed: 4 additions & 1 deletion
@@ -22,7 +22,8 @@
 from .stochastic_gradient import SGDClassifier, SGDRegressor
 from .ridge import (Ridge, RidgeCV, RidgeClassifier, RidgeClassifierCV,
                     ridge_regression)
-from .logistic import LogisticRegression
+from .logistic import (LogisticRegression, LogisticRegressionCV,
+                       logistic_regression_path)
 from .omp import (orthogonal_mp, orthogonal_mp_gram, OrthogonalMatchingPursuit,
                   OrthogonalMatchingPursuitCV)
 from .passive_aggressive import PassiveAggressiveClassifier
@@ -48,6 +49,7 @@
     'LinearRegression',
     'Log',
     'LogisticRegression',
+    'LogisticRegressionCV',
     'ModifiedHuber',
     'MultiTaskElasticNet',
     'MultiTaskElasticNetCV',
@@ -71,6 +73,7 @@
     'lars_path',
     'lasso_path',
     'lasso_stability_path',
+    'logistic_regression_path',
     'orthogonal_mp',
     'orthogonal_mp_gram',
     'ridge_regression',

0 commit comments