Commit 891839c

MechCoder authored and larsmans committed
DOC: Explain prediction when decision_function is zero
LogisticRegression and liblinear's predictions differ when the decision function is zero. Explain why and what to do in that case. Fixes scikit-learn#3600 (by documenting the won't fix status).
1 parent d9288f0 commit 891839c

4 files changed: +43 -1 lines changed

doc/modules/linear_model.rst

Lines changed: 15 additions & 0 deletions
@@ -701,6 +701,21 @@ so the "multinomial" setting does not learn sparse models.
 
 * :ref:`example_linear_model_plot_logistic_path.py`
 
+.. _liblinear_differences:
+
+.. topic:: Differences from liblinear:
+
+   There might be a difference in the scores obtained between
+   :class:`LogisticRegression` with ``solver=liblinear``
+   or :class:`LinearSVC` and the external liblinear library directly,
+   when ``fit_intercept=False`` and the fit ``coef_`` (or) the data to
+   be predicted are zeroes. This is because for the sample(s) with
+   ``decision_function`` zero, :class:`LogisticRegression` and :class:`LinearSVC`
+   predict the negative class, while liblinear predicts the positive class.
+   Note that a model with ``fit_intercept=False`` and having many samples with
+   ``decision_function`` zero, is likely to be a underfit, bad model and you are
+   advised to set ``fit_intercept=True`` and increase the intercept_scaling.
+
 .. note:: **Feature selection with sparse logistic regression**
 
    A logistic regression with L1 penalty yields sparse models, and can
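
The tie-breaking rule described in the topic above can be sketched in a few lines of plain Python. This is a minimal illustration, not scikit-learn or liblinear code: the helper names (`decision_value`, `predict_tie_negative`, `predict_tie_positive`) are hypothetical, and the second predictor only stands in for the behaviour the documentation attributes to standalone liblinear.

```python
def decision_value(coef, intercept, x):
    """Linear decision function: dot(coef, x) + intercept."""
    return sum(w * v for w, v in zip(coef, x)) + intercept


def predict_tie_negative(score, classes=(0, 1)):
    """Strict comparison: a score of exactly zero maps to classes[0].

    This mirrors the rule the topic describes for LogisticRegression
    and LinearSVC (zero score -> negative class).
    """
    return classes[1] if score > 0 else classes[0]


def predict_tie_positive(score, classes=(0, 1)):
    """Assumed stand-in for the liblinear behaviour described above:
    a score of exactly zero maps to classes[1] (positive class)."""
    return classes[1] if score >= 0 else classes[0]


# With fit_intercept=False the intercept is 0, so an all-zero sample
# always produces a decision value of exactly zero.
coef = [0.5, -1.2, 3.0]
zero_sample = [0.0, 0.0, 0.0]
score = decision_value(coef, 0.0, zero_sample)

print(score, predict_tie_negative(score), predict_tie_positive(score))
```

The two rules agree on every sample whose score is non-zero, which is why the discrepancy only shows up with a zero intercept combined with all-zero coefficients or all-zero inputs.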

sklearn/linear_model/logistic.py

Lines changed: 6 additions & 1 deletion
@@ -826,7 +826,12 @@ class LogisticRegression(BaseEstimator, LinearClassifierMixin,
     to have slightly different results for the same input data. If
     that happens, try with a smaller tol parameter.
 
-    References:
+    Predict output may not match that of standalone liblinear in certain
+    cases. See :ref:`differences from liblinear <liblinear_differences>`
+    in the narrative documentation.
+
+    References
+    ----------
 
     LIBLINEAR -- A Library for Large Linear Classification
     http://www.csie.ntu.edu.tw/~cjlin/liblinear/

sklearn/linear_model/tests/test_logistic.py

Lines changed: 18 additions & 0 deletions
@@ -514,3 +514,21 @@ def test_logistic_regression_multinomial():
     clf_path.fit(X, y)
     assert_array_almost_equal(clf_path.coef_, clf_int.coef_, decimal=3)
     assert_almost_equal(clf_path.intercept_, clf_int.intercept_, decimal=3)
+
+
+def test_liblinear_decision_function_zero():
+    """Test negative prediction when decision_function values are zero.
+
+    Liblinear predicts the positive class when decision_function values
+    are zero. This is a test to verify that we do not do the same.
+    See Issue: https://github.com/scikit-learn/scikit-learn/issues/3600
+    and the PR https://github.com/scikit-learn/scikit-learn/pull/3623
+    """
+    rng = np.random.RandomState(0)
+    X, y = make_classification(n_samples=5, n_features=5)
+    clf = LogisticRegression(fit_intercept=False)
+    clf.fit(X, y)
+
+    # Dummy data such that the decision function becomes zero.
+    X = np.zeros((5, 5))
+    assert_array_equal(clf.predict(X), np.zeros(5))
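
The added test creates `rng` without using it, and `make_classification` is called without a seed, so the training data differs between runs. A reproducible variant might look like the sketch below. The function name is hypothetical, `random_state=0` and `flip_y=0` are choices made here rather than part of the commit, and the assertion helper comes from `numpy.testing` instead of the scikit-learn test utilities.

```python
import numpy as np
from numpy.testing import assert_array_equal
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def check_zero_decision_function_predicts_negative():
    # Seeded data so the check is deterministic; flip_y=0 disables the
    # random label noise that make_classification applies by default.
    X, y = make_classification(n_samples=5, n_features=5,
                               flip_y=0, random_state=0)
    clf = LogisticRegression(fit_intercept=False)
    clf.fit(X, y)

    # All-zero inputs and no intercept give a decision value of zero.
    X_zero = np.zeros((5, 5))
    assert_array_equal(clf.decision_function(X_zero), np.zeros(5))

    # A zero decision value must map to the first (negative) class.
    assert_array_equal(clf.predict(X_zero), np.full(5, clf.classes_[0]))
    return clf


clf = check_zero_decision_function_predicts_negative()
```

Checking `decision_function` explicitly before `predict` makes the precondition of the test visible, so a failure points at either the zero score or the tie-breaking rule.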

sklearn/svm/classes.py

Lines changed: 4 additions & 0 deletions
@@ -109,6 +109,10 @@ class frequencies.
     The underlying implementation (liblinear) uses a sparse internal
     representation for the data that will incur a memory copy.
 
+    Predict output may not match that of standalone liblinear in certain
+    cases. See :ref:`differences from liblinear <liblinear_differences>`
+    in the narrative documentation.
+
     **References:**
     `LIBLINEAR: A Library for Large Linear Classification
     <http://www.csie.ntu.edu.tw/~cjlin/liblinear/>`__
