-
Notifications
You must be signed in to change notification settings - Fork 37
[READY] Fit intercept inside cd_solver #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…into intercept
skglm/solvers/group_bcd_solver.py
Outdated
intercept_old = w[-1] | ||
w[-1] -= datafit.intercept_update_step(y, Xw) | ||
Xw += (w[-1] - intercept_old) | ||
|
||
w_acc, Xw_acc, is_extrapolated = accelerator.extrapolate(w, Xw) | ||
|
||
if is_extrapolated: # avoid computing p_obj for un-extrapolated w, Xw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably below, the objective should be computed with w[:n_features]. I wonder why it does not fail, maybe it's not tested
skglm/tests/test_group.py
Outdated
@@ -132,6 +132,44 @@ def test_vs_celer_grouplasso(n_groups, n_features, shuffle): | |||
np.testing.assert_allclose(model.coef_, w, atol=1e-5) | |||
|
|||
|
|||
@pytest.mark.parametrize("n_groups, n_features, shuffle", | |||
[[15, 50, False]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if a single set of parameters is used, we can get rid of parametrize and hardcode these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a first round of comments, my main complain is the test_intercept.py
which I think could / should be merged with test_estimators.py
skglm/tests/test_estimators.py
Outdated
if estimator_name == "GeneralizedLinearEstimator": | ||
pytest.skip() | ||
estimator_sk = dict_estimators_sk[estimator_name] | ||
estimator_ours = dict_estimators_ours[estimator_name] | ||
# TODO This seems a bit unusal, maybe to discuss |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@QB3 this has potentially harmful side effects: it sets the intercept to True on an object defined outside the function, so if we first call the function with fit_intercept=True, we set to model.fit_intercept
to True, then the second time the function is called, the model still has clf.fit_intercept
set to True.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be avoided by cloning the estimator when entering the test instead (sklearn.base.clone)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! For the documentation, i don't feel able to write it for the intercept fitting right now as I don't fully understand what's going on. @Badr-MOUFAD do you want to give it a try? a few lines would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a second pass. I think we are almost done.
Some minor comments.
skglm/tests/test_group.py
Outdated
fit_intercept=True, tol=1e-12) | ||
model.fit(X, y) | ||
|
||
np.testing.assert_allclose(model.coef_, w[:X.shape[1]], atol=1e-5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree!
we can test w[-1] against intercept_ when fit_intercept=True?
@@ -268,19 +285,27 @@ def cd_solver( | |||
for epoch in range(max_epochs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mathurinm, I don't know whether it's worth it to check intercept optimality when solving subproblems. WDYT?
Second part of #19 split (first part is #54)