Commit c10c886

Sentient07 authored and ogrisel committed
Deprecate randomized_l1 (scikit-learn#9031)
1 parent 896f9d9 commit c10c886

7 files changed, with 55 additions and 242 deletions.
doc/modules/feature_selection.rst

Lines changed: 0 additions & 61 deletions
@@ -227,67 +227,6 @@ alpha parameter, the fewer features selected.
 Processing Magazine [120] July 2007
 http://dsp.rice.edu/sites/dsp.rice.edu/files/cs/baraniukCSlecture07.pdf

-.. _randomized_l1:
-
-Randomized sparse models
--------------------------
-
-.. currentmodule:: sklearn.linear_model
-
-In terms of feature selection, there are some well-known limitations of
-L1-penalized models for regression and classification. For example, it is
-known that the Lasso will tend to select an individual variable out of a group
-of highly correlated features. Furthermore, even when the correlation between
-features is not too high, the conditions under which L1-penalized methods
-consistently select "good" features can be restrictive in general.
-
-To mitigate this problem, it is possible to use randomization techniques such
-as those presented in [B2009]_ and [M2010]_. The latter technique, known as
-stability selection, is implemented in the module :mod:`sklearn.linear_model`.
-In the stability selection method, a subsample of the data is fit to a
-L1-penalized model where the penalty of a random subset of coefficients has
-been scaled. Specifically, given a subsample of the data
-:math:`(x_i, y_i), i \in I`, where :math:`I \subset \{1, 2, \ldots, n\}` is a
-random subset of the data of size :math:`n_I`, the following modified Lasso
-fit is obtained:
-
-.. math:: \hat{w_I} = \mathrm{arg}\min_{w} \frac{1}{2n_I} \sum_{i \in I} (y_i - x_i^T w)^2 + \alpha \sum_{j=1}^p \frac{ \vert w_j \vert}{s_j},
-
-where :math:`s_j \in \{s, 1\}` are independent trials of a fair Bernoulli
-random variable, and :math:`0<s<1` is the scaling factor. By repeating this
-procedure across different random subsamples and Bernoulli trials, one can
-count the fraction of times the randomized procedure selected each feature,
-and used these fractions as scores for feature selection.
-
-:class:`RandomizedLasso` implements this strategy for regression
-settings, using the Lasso, while :class:`RandomizedLogisticRegression` uses the
-logistic regression and is suitable for classification tasks. To get a full
-path of stability scores you can use :func:`lasso_stability_path`.
-
-.. figure:: ../auto_examples/linear_model/images/sphx_glr_plot_sparse_recovery_003.png
-   :target: ../auto_examples/linear_model/plot_sparse_recovery.html
-   :align: center
-   :scale: 60
-
-Note that for randomized sparse models to be more powerful than standard
-F statistics at detecting non-zero features, the ground truth model
-should be sparse, in other words, there should be only a small fraction
-of features non zero.
-
-.. topic:: Examples:
-
-   * :ref:`sphx_glr_auto_examples_linear_model_plot_sparse_recovery.py`: An example
-     comparing different feature selection approaches and discussing in
-     which situation each approach is to be favored.
-
-.. topic:: References:
-
-   .. [B2009] F. Bach, "Model-Consistent Sparse Estimation through the
-      Bootstrap." https://hal.inria.fr/hal-00354771/
-
-   .. [M2010] N. Meinshausen, P. Buhlmann, "Stability selection",
-      Journal of the Royal Statistical Society, 72 (2010)
-      http://arxiv.org/pdf/0809.2932.pdf

 Tree-based feature selection
 ----------------------------
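
The documentation removed above describes stability selection in words and one formula. As a reading aid, here is a rough pure-Python sketch of that procedure. It is not the code in sklearn/linear_model/randomized_l1.py: the helper names (soft_threshold, weighted_lasso, stability_scores), the coordinate-descent solver, the toy data and the default values are all assumptions made for illustration.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t (the proximal step for an L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, alpha, penalty_weights, n_iter=50):
    """Coordinate descent for
    1/(2n) * sum_i (y_i - x_i^T w)^2 + alpha * sum_j penalty_weights[j] * |w_j|.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw because w starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_wj = soft_threshold(rho, alpha * penalty_weights[j]) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def stability_scores(X, y, alpha, scaling=0.5, sample_fraction=0.75,
                     n_resampling=30, seed=0):
    """Fraction of randomized fits in which each feature gets a non-zero weight."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_sub = max(1, int(sample_fraction * n))
    counts = [0] * p
    for _ in range(n_resampling):
        subset = rng.sample(range(n), n_sub)  # random subsample I
        # s_j is `scaling` or 1 with equal probability; the penalty on w_j is alpha / s_j.
        weights = [1.0 / (scaling if rng.random() < 0.5 else 1.0) for _ in range(p)]
        w = weighted_lasso([X[i] for i in subset], [y[i] for i in subset],
                           alpha, weights)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / n_resampling for c in counts]


# Toy data (an assumption): only feature 0 drives y; the other four are noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(60)]
y = [5.0 * row[0] + data_rng.gauss(0.0, 0.1) for row in X]
scores = stability_scores(X, y, alpha=0.1)
print(scores)
```

Each score is the fraction of randomized fits that kept the feature, which is the quantity the removed text proposes as a feature-selection score. The solver is deliberately naive; real code would use an optimized Lasso implementation.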

doc/modules/linear_model.rst

Lines changed: 0 additions & 5 deletions
@@ -205,11 +205,6 @@ computes the coefficients along the full path of possible values.
 thus be used to perform feature selection, as detailed in
 :ref:`l1_feature_selection`.

-.. note:: **Randomized sparsity**
-
-   For feature selection or sparse recovery, it may be interesting to
-   use :ref:`randomized_l1`.
-

 Setting regularization parameter
 --------------------------------

doc/whats_new.rst

Lines changed: 4 additions & 0 deletions
@@ -575,6 +575,7 @@ API changes summary
     - ``utils.sparsetools.connected_components``
     - ``utils.stats.rankdata``
     - ``neighbors.approximate.LSHForest``
+    - ``linear_model.randomized_l1``

 - Deprecate the ``y`` parameter in `transform` and `inverse_transform`.
   The method should not accept ``y`` parameter, as it's used at the prediction time.
@@ -1306,6 +1307,9 @@ Model evaluation and meta-estimators
   the parameter ``n_labels`` is renamed to ``n_groups``.
   :issue:`6660` by `Raghav RV`_.

+- The :mod:`sklearn.linear_model.randomized_l1` is deprecated.
+  :issue: `8995` by :user:`Ramana.S <sentient07>`.
+
 Code Contributors
 -----------------
 Aditya Joshi, Alejandro, Alexander Fabisch, Alexander Loginov, Alexander

examples/linear_model/plot_sparse_recovery.py

Lines changed: 0 additions & 173 deletions
This file was deleted.

sklearn/linear_model/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -30,8 +30,10 @@
 from .passive_aggressive import PassiveAggressiveClassifier
 from .passive_aggressive import PassiveAggressiveRegressor
 from .perceptron import Perceptron
+
 from .randomized_l1 import (RandomizedLasso, RandomizedLogisticRegression,
                             lasso_stability_path)
+
 from .ransac import RANSACRegressor
 from .theil_sen import TheilSenRegressor

sklearn/linear_model/randomized_l1.py

Lines changed: 12 additions & 2 deletions
@@ -6,9 +6,10 @@
 # Author: Gael Varoquaux, Alexandre Gramfort
 #
 # License: BSD 3 clause
+
+import warnings
 import itertools
 from abc import ABCMeta, abstractmethod
-import warnings

 import numpy as np
 from scipy.sparse import issparse
@@ -20,7 +21,8 @@
 from ..externals import six
 from ..externals.joblib import Memory, Parallel, delayed
 from ..feature_selection.base import SelectorMixin
-from ..utils import (as_float_array, check_random_state, check_X_y, safe_mask)
+from ..utils import (as_float_array, check_random_state, check_X_y, safe_mask,
+                     deprecated)
 from ..utils.validation import check_is_fitted
 from .least_angle import lars_path, LassoLarsIC
 from .logistic import LogisticRegression
@@ -58,6 +60,8 @@ def _resample_model(estimator_func, X, y, scaling=.5, n_resampling=200,
     return scores_


+@deprecated("The class BaseRandomizedLinearModel is deprecated in 0.19"
+            " and will be removed in 0.21.")
 class BaseRandomizedLinearModel(six.with_metaclass(ABCMeta, BaseEstimator,
                                                    SelectorMixin)):
     """Base class to implement randomized linear models for feature selection
@@ -178,6 +182,8 @@ def _randomized_lasso(X, y, weights, mask, alpha=1., verbose=False,
     return scores


+@deprecated("The class RandomizedLasso is deprecated in 0.19"
+            " and will be removed in 0.21.")
 class RandomizedLasso(BaseRandomizedLinearModel):
     """Randomized Lasso.

@@ -388,6 +394,8 @@ def _randomized_logistic(X, y, weights, mask, C=1., verbose=False,
     return scores


+@deprecated("The class RandomizedLogisticRegression is deprecated in 0.19"
+            " and will be removed in 0.21.")
 class RandomizedLogisticRegression(BaseRandomizedLinearModel):
     """Randomized Logistic Regression

@@ -573,6 +581,8 @@ def _lasso_stability_path(X, y, mask, weights, eps):
     return alphas, coefs


+@deprecated("The function lasso_stability_path is deprecated in 0.19"
+            " and will be removed in 0.21.")
 def lasso_stability_path(X, y, scaling=0.5, random_state=None,
                          n_resampling=200, n_grid=100,
                          sample_fraction=0.75,
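
The @deprecated decorators added in this file make each class and function emit a warning when used. The sketch below illustrates that general pattern using only the standard library. It is not scikit-learn's `sklearn.utils.deprecated` implementation: the decorator name deprecated_sketch and the two decorated objects (OldSelector, old_path) are hypothetical, and the choice of DeprecationWarning as the category is an assumption.

```python
import functools
import warnings


def deprecated_sketch(message):
    """Hypothetical stand-in for a deprecation decorator (not sklearn's own)."""
    def decorate(obj):
        if isinstance(obj, type):
            # For a class, wrap __init__ so that instantiation warns.
            original_init = obj.__init__

            @functools.wraps(original_init)
            def wrapped_init(self, *args, **kwargs):
                warnings.warn(message, category=DeprecationWarning, stacklevel=2)
                original_init(self, *args, **kwargs)

            obj.__init__ = wrapped_init
            return obj

        # For a plain function, warn on every call and then delegate.
        @functools.wraps(obj)
        def wrapped(*args, **kwargs):
            warnings.warn(message, category=DeprecationWarning, stacklevel=2)
            return obj(*args, **kwargs)

        return wrapped
    return decorate


@deprecated_sketch("The class OldSelector is deprecated in 0.19"
                   " and will be removed in 0.21.")
class OldSelector:
    def __init__(self, scaling=0.5):
        self.scaling = scaling


@deprecated_sketch("The function old_path is deprecated in 0.19"
                   " and will be removed in 0.21.")
def old_path(n_grid=100):
    return list(range(n_grid))


# Record the warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    selector = OldSelector(scaling=0.3)
    grid = old_path(n_grid=3)

for item in caught:
    print(item.category.__name__, item.message)
```

Wrapping `__init__` rather than replacing the class keeps `isinstance` checks and subclassing intact, which matters here because RandomizedLasso and RandomizedLogisticRegression both inherit from the deprecated base class.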
