
Commit 567a956

Merge branch 'pr/4714' + pep8 fixes
Conflicts:
	doc/modules/model_persistence.rst
	doc/modules/pipeline.rst
	doc/modules/svm.rst
	doc/tutorial/basic/tutorial.rst
	doc/tutorial/statistical_inference/supervised_learning.rst
	sklearn/svm/classes.py

2 parents: 3f2f225 + 6d4bcda

13 files changed: +280 −137 lines


doc/modules/model_persistence.rst

Lines changed: 4 additions & 3 deletions
@@ -22,9 +22,10 @@ persistence model, namely `pickle <http://docs.python.org/library/pickle.html>`_
  >>> iris = datasets.load_iris()
  >>> X, y = iris.data, iris.target
  >>> clf.fit(X, y) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  >>> import pickle
  >>> s = pickle.dumps(clf)

doc/modules/pipeline.rst

Lines changed: 6 additions & 6 deletions
@@ -42,9 +42,9 @@ is an estimator object::
  >>> clf # doctest: +NORMALIZE_WHITESPACE
  Pipeline(steps=[('reduce_dim', PCA(copy=True, n_components=None,
    whiten=False)), ('svm', SVC(C=1.0, cache_size=200, class_weight=None,
-   coef0=0.0, degree=3, gamma='auto', kernel='rbf', max_iter=-1,
-   probability=False, random_state=None, shrinking=True, tol=0.001,
-   verbose=False))])
+   coef0=0.0, decision_function_shape=None, degree=3, gamma='auto',
+   kernel='rbf', max_iter=-1, probability=False, random_state=None,
+   shrinking=True, tol=0.001, verbose=False))])

  The utility function :func:`make_pipeline` is a shorthand
  for constructing pipelines;

@@ -76,9 +76,9 @@ Parameters of the estimators in the pipeline can be accessed using the
  >>> clf.set_params(svm__C=10) # doctest: +NORMALIZE_WHITESPACE
  Pipeline(steps=[('reduce_dim', PCA(copy=True, n_components=None,
    whiten=False)), ('svm', SVC(C=10, cache_size=200, class_weight=None,
-   coef0=0.0, degree=3, gamma='auto', kernel='rbf', max_iter=-1,
-   probability=False, random_state=None, shrinking=True, tol=0.001,
-   verbose=False))])
+   coef0=0.0, decision_function_shape=None, degree=3, gamma='auto',
+   kernel='rbf', max_iter=-1, probability=False, random_state=None,
+   shrinking=True, tol=0.001, verbose=False))])

  This is particularly important for doing grid searches::

doc/modules/svm.rst

Lines changed: 22 additions & 11 deletions
@@ -76,9 +76,10 @@ n_features]`` holding the training samples, and an array y of class labels
  >>> y = [0, 1]
  >>> clf = svm.SVC()
  >>> clf.fit(X, y) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  After being fitted, the model can then be used to predict new values::

@@ -109,18 +110,27 @@ Multi-class classification
  :class:`SVC` and :class:`NuSVC` implement the "one-against-one"
  approach (Knerr et al., 1990) for multi-class classification. If
  ``n_class`` is the number of classes, then ``n_class * (n_class - 1) / 2``
- classifiers are constructed and each one trains data from two classes::
+ classifiers are constructed and each one trains data from two classes.
+ To provide a consistent interface with other classifiers, the
+ ``decision_function_shape`` option allows aggregating the results of the
+ "one-against-one" classifiers into a decision function of shape ``(n_samples,
+ n_classes)``::

  >>> X = [[0], [1], [2], [3]]
  >>> Y = [0, 1, 2, 3]
- >>> clf = svm.SVC()
+ >>> clf = svm.SVC(decision_function_shape='ovo')
  >>> clf.fit(X, Y) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape='ovo', degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)
  >>> dec = clf.decision_function([[1]])
  >>> dec.shape[1] # 4 classes: 4*3/2 = 6
  6
+ >>> clf.decision_function_shape = "ovr"
+ >>> dec = clf.decision_function([[1]])
+ >>> dec.shape[1] # 4 classes
+ 4

  On the other hand, :class:`LinearSVC` implements "one-vs-the-rest"
  multi-class strategy, thus training n_class models. If there are only

@@ -503,9 +513,10 @@ test vectors must be provided.
  >>> # linear kernel computation
  >>> gram = np.dot(X, X.T)
  >>> clf.fit(gram, y) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='precomputed', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto',
+   kernel='precomputed', max_iter=-1, probability=False,
+   random_state=None, shrinking=True, tol=0.001, verbose=False)
  >>> # predict on training examples
  >>> clf.predict(gram)
  array([0, 1])
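
The doctest above predicts on the training gram matrix itself. As a minimal sketch (not part of this diff) of how a held-out sample would be scored with ``kernel='precomputed'``: the gram matrix passed to ``predict`` must be computed between the test and the *training* samples. ``X_test`` and the other names are illustrative.

    import numpy as np
    from sklearn import svm

    X_train = np.array([[0., 0.], [1., 1.]])
    y_train = [0, 1]
    X_test = np.array([[2., 2.]])  # held-out sample (illustrative)

    clf = svm.SVC(kernel='precomputed')
    # Training: gram matrix between training samples, shape (n_train, n_train)
    clf.fit(np.dot(X_train, X_train.T), y_train)
    # Prediction: kernel between test and training samples, shape (n_test, n_train)
    print(clf.predict(np.dot(X_test, X_train.T)))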

doc/tutorial/basic/tutorial.rst

Lines changed: 28 additions & 22 deletions
@@ -176,9 +176,10 @@ which produces a new array that contains all but
  the last entry of ``digits.data``::

  >>> clf.fit(digits.data[:-1], digits.target[:-1]) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=100.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma=0.001, kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=100.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma=0.001, kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  Now you can predict new values; in particular, we can ask the
  classifier what the digit of our last image in the ``digits`` dataset is,

@@ -214,9 +215,10 @@ persistence model, namely `pickle <http://docs.python.org/library/pickle.html>`_
  >>> iris = datasets.load_iris()
  >>> X, y = iris.data, iris.target
  >>> clf.fit(X, y) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  >>> import pickle
  >>> s = pickle.dumps(clf)

@@ -286,18 +288,20 @@ maintained::
  >>> from sklearn.svm import SVC
  >>> iris = datasets.load_iris()
  >>> clf = SVC()
- >>> clf.fit(iris.data, iris.target)
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ >>> clf.fit(iris.data, iris.target) # doctest: +NORMALIZE_WHITESPACE
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  >>> list(clf.predict(iris.data[:3]))
  [0, 0, 0]

- >>> clf.fit(iris.data, iris.target_names[iris.target])
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ >>> clf.fit(iris.data, iris.target_names[iris.target]) # doctest: +NORMALIZE_WHITESPACE
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)

  >>> list(clf.predict(iris.data[:3])) # doctest: +NORMALIZE_WHITESPACE
  ['setosa', 'setosa', 'setosa']

@@ -323,17 +327,19 @@ more than once will overwrite what was learned by any previous ``fit()``::
  >>> X_test = rng.rand(5, 10)

  >>> clf = SVC()
- >>> clf.set_params(kernel='linear').fit(X, y)
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='linear', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ >>> clf.set_params(kernel='linear').fit(X, y) # doctest: +NORMALIZE_WHITESPACE
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='linear',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)
  >>> clf.predict(X_test)
  array([1, 0, 1, 1, 0])

- >>> clf.set_params(kernel='rbf').fit(X, y)
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='rbf', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ >>> clf.set_params(kernel='rbf').fit(X, y) # doctest: +NORMALIZE_WHITESPACE
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)
  >>> clf.predict(X_test)
  array([0, 0, 0, 1, 0])

doc/tutorial/statistical_inference/supervised_learning.rst

Lines changed: 4 additions & 3 deletions
@@ -453,9 +453,10 @@ classification --:class:`SVC` (Support Vector Classification).
  >>> from sklearn import svm
  >>> svc = svm.SVC(kernel='linear')
  >>> svc.fit(iris_X_train, iris_y_train) # doctest: +NORMALIZE_WHITESPACE
- SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3,
-   gamma='auto', kernel='linear', max_iter=-1, probability=False,
-   random_state=None, shrinking=True, tol=0.001, verbose=False)
+ SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
+   decision_function_shape=None, degree=3, gamma='auto', kernel='linear',
+   max_iter=-1, probability=False, random_state=None, shrinking=True,
+   tol=0.001, verbose=False)


  .. warning:: **Normalizing data**

doc/whats_new.rst

Lines changed: 5 additions & 0 deletions
@@ -83,6 +83,11 @@ API changes summary
    for retrieving the leaf indices samples are predicted as. By
    `Daniel Galvez`_ and `Gilles Louppe`_.

+ - :class:`svm.SVC` and :class:`svm.NuSVC` now have a ``decision_function_shape``
+   parameter to make their decision function of shape ``(n_samples, n_classes)``
+   by setting ``decision_function_shape='ovr'``. This will be the default behavior
+   starting in 0.19. By `Andreas Müller`_.
+
  .. _changes_0_1_16:

  0.16.1
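
A quick sketch of the behavior this entry describes, assuming an install that includes this commit (the toy data is illustrative):

    import numpy as np
    from sklearn import svm

    X = np.array([[0.], [1.], [2.], [3.]])
    y = np.array([0, 1, 2, 3])

    ovo = svm.SVC(decision_function_shape='ovo').fit(X, y)
    ovr = svm.SVC(decision_function_shape='ovr').fit(X, y)

    print(ovo.decision_function([[1.]]).shape)  # (1, 6): n_class*(n_class-1)/2 pairs
    print(ovr.decision_function([[1.]]).shape)  # (1, 4): one column per class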

sklearn/base.py

Lines changed: 5 additions & 1 deletion
@@ -11,7 +11,11 @@
  from .externals import six


- ###############################################################################
+ class ChangedBehaviorWarning(UserWarning):
+     pass
+
+
+ ##############################################################################
  def clone(estimator, safe=True):
      """Constructs a new estimator with the same parameters.
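
With the warning class now living in ``sklearn.base``, a minimal sketch of how such a category is typically raised and filtered; the function and message below are hypothetical, not part of this diff:

    import warnings
    from sklearn.base import ChangedBehaviorWarning

    def fit_with_new_default(decision_function_shape=None):
        # Hypothetical example: warn when a default is about to change
        if decision_function_shape is None:
            warnings.warn("decision_function_shape will default to 'ovr' in 0.19.",
                          ChangedBehaviorWarning)

    # Like any warning category, it can be silenced or escalated:
    warnings.simplefilter("ignore", ChangedBehaviorWarning)
    fit_with_new_default()  # no warning emitted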

sklearn/grid_search.py

Lines changed: 5 additions & 8 deletions
@@ -20,7 +20,7 @@
  import numpy as np

  from .base import BaseEstimator, is_classifier, clone
- from .base import MetaEstimatorMixin
+ from .base import MetaEstimatorMixin, ChangedBehaviorWarning
  from .cross_validation import _check_cv as check_cv
  from .cross_validation import _fit_and_score
  from .externals.joblib import Parallel, delayed

@@ -308,10 +308,6 @@ def __repr__(self):
               self.parameters)


- class ChangedBehaviorWarning(UserWarning):
-     pass
-
-
  class BaseSearchCV(six.with_metaclass(ABCMeta, BaseEstimator,
                                        MetaEstimatorMixin)):
      """Base class for hyper parameter search with cross-validation."""

@@ -648,9 +644,10 @@ class GridSearchCV(BaseSearchCV):
      ... # doctest: +NORMALIZE_WHITESPACE +ELLIPSIS
  GridSearchCV(cv=None, error_score=...,
      estimator=SVC(C=1.0, cache_size=..., class_weight=..., coef0=...,
-       degree=..., gamma=..., kernel='rbf', max_iter=-1,
-       probability=False, random_state=None, shrinking=True,
-       tol=..., verbose=False),
+       decision_function_shape=None, degree=..., gamma=...,
+       kernel='rbf', max_iter=-1, probability=False,
+       random_state=None, shrinking=True, tol=...,
+       verbose=False),
      fit_params={}, iid=..., n_jobs=1,
      param_grid=..., pre_dispatch=..., refit=...,
      scoring=..., verbose=...)
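
Because ``grid_search.py`` re-imports the class from ``.base`` (first hunk above), code that imported the warning from its old location should keep resolving to the same object. A sketch of that assumption, not a tested guarantee:

    from sklearn.base import ChangedBehaviorWarning
    from sklearn.grid_search import ChangedBehaviorWarning as LegacyCBW

    assert LegacyCBW is ChangedBehaviorWarning  # one class, two import paths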

sklearn/multiclass.py

Lines changed: 52 additions & 30 deletions
@@ -556,36 +556,58 @@ def decision_function(self, X):
      """
      check_is_fitted(self, 'estimators_')

-     n_samples = X.shape[0]
-     n_classes = self.classes_.shape[0]
-     votes = np.zeros((n_samples, n_classes))
-     sum_of_confidences = np.zeros((n_samples, n_classes))
-
-     k = 0
-     for i in range(n_classes):
-         for j in range(i + 1, n_classes):
-             pred = self.estimators_[k].predict(X)
-             confidence_levels_ij = _predict_binary(self.estimators_[k], X)
-             sum_of_confidences[:, i] -= confidence_levels_ij
-             sum_of_confidences[:, j] += confidence_levels_ij
-             votes[pred == 0, i] += 1
-             votes[pred == 1, j] += 1
-             k += 1
-
-     max_confidences = sum_of_confidences.max()
-     min_confidences = sum_of_confidences.min()
-
-     if max_confidences == min_confidences:
-         return votes
-
-     # Scale the sum_of_confidences to (-0.5, 0.5) and add it with votes.
-     # The motivation is to use confidence levels as a way to break ties in
-     # the votes without switching any decision made based on a difference
-     # of 1 vote.
-     eps = np.finfo(sum_of_confidences.dtype).eps
-     max_abs_confidence = max(abs(max_confidences), abs(min_confidences))
-     scale = (0.5 - eps) / max_abs_confidence
-     return votes + sum_of_confidences * scale
+     predictions = np.vstack([est.predict(X) for est in self.estimators_]).T
+     confidences = np.vstack([_predict_binary(est, X) for est in self.estimators_]).T
+     return _ovr_decision_function(predictions, confidences,
+                                   len(self.classes_))
+
+
+ def _ovr_decision_function(predictions, confidences, n_classes):
+     """Compute a continuous, tie-breaking ovr decision function.
+
+     It is important to include a continuous value, not only votes,
+     to make computing AUC or calibration meaningful.
+
+     Parameters
+     ----------
+     predictions : array-like, shape (n_samples, n_classifiers)
+         Predicted classes for each binary classifier.
+
+     confidences : array-like, shape (n_samples, n_classifiers)
+         Decision functions or predicted probabilities for positive class
+         for each binary classifier.
+
+     n_classes : int
+         Number of classes. n_classifiers must be
+         ``n_classes * (n_classes - 1) / 2``.
+     """
+     n_samples = predictions.shape[0]
+     votes = np.zeros((n_samples, n_classes))
+     sum_of_confidences = np.zeros((n_samples, n_classes))
+
+     k = 0
+     for i in range(n_classes):
+         for j in range(i + 1, n_classes):
+             sum_of_confidences[:, i] -= confidences[:, k]
+             sum_of_confidences[:, j] += confidences[:, k]
+             votes[predictions[:, k] == 0, i] += 1
+             votes[predictions[:, k] == 1, j] += 1
+             k += 1
+
+     max_confidences = sum_of_confidences.max()
+     min_confidences = sum_of_confidences.min()
+
+     if max_confidences == min_confidences:
+         return votes
+
+     # Scale the sum_of_confidences to (-0.5, 0.5) and add it with votes.
+     # The motivation is to use confidence levels as a way to break ties in
+     # the votes without switching any decision made based on a difference
+     # of 1 vote.
+     eps = np.finfo(sum_of_confidences.dtype).eps
+     max_abs_confidence = max(abs(max_confidences), abs(min_confidences))
+     scale = (0.5 - eps) / max_abs_confidence
+     return votes + sum_of_confidences * scale


  @deprecated("fit_ecoc is deprecated and will be removed in 0.18."
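
A toy walk-through of the tie-breaking logic in the new helper (a private function, so this is a sketch against internals that may change): with 3 classes there are 3*2/2 = 3 pairwise classifiers, ordered (0, 1), (0, 2), (1, 2). Votes decide the ranking; the confidences, scaled into (-0.5, 0.5), only break ties.

    import numpy as np
    from sklearn.multiclass import _ovr_decision_function

    # One sample; each pairwise classifier predicts its second class (label 1).
    predictions = np.array([[1, 1, 1]])
    # Positive confidences lean toward the second class of each pair.
    confidences = np.array([[0.2, 0.4, 0.1]])

    dec = _ovr_decision_function(predictions, confidences, n_classes=3)
    print(dec.shape)           # (1, 3)
    print(dec.argmax(axis=1))  # [2]: class 2 collects 2 votes plus confidence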
