-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Change default EvaluationMetric for LightGbm trainers to conform to d… #3859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…efault metric in standalone LightGbm
@@ -162,7 +162,7 @@ public enum EvaluateMetricType | |||
[Argument(ArgumentType.AtMostOnce, | |||
HelpText = "Evaluation metrics.", | |||
ShortName = "em")] | |||
public EvaluateMetricType EvaluationMetric = EvaluateMetricType.Logloss; | |||
public EvaluateMetricType EvaluationMetric = EvaluateMetricType.Default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default [](start = 76, length = 7)
Isn't this a breaking change?
cc @eerhardt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the other option is to have an inconsistent user experience. I talked to @ebarsoumMS about this. Let's discuss and reach a conclusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not an "API breaking change". I think it falls into the scenarios that @TomFinley listed here #3602 (comment).
However even many years later sometimes we still have somewhat troublesome defaults running around
Here, if there is a better default value, I think it is acceptable to change the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this doesn't actually change training behavior, nor the metrics calculated by ML.NET evaluators. Just changes the metric that LightGbm calculates internally.
In ML.NET, when we do the following (e.g. for binary classification)
var transformedTestData = model.Transform(testData);
var metrics = mlContext.BinaryClassification.Evaluate(transformedTestData);
the evaluator computes all relevant metrics for binary classification regardless of what is specified by LightGbm's EvaluationMetric
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may control LightGBM's early stopping, but otherwise I think this is a NOOP change. ML.NET doesn't relay the stdout from LightGBM to the user, and ML.NET uses its own evaluators for computing the final metrics.
Users could benefit from ML.NET relaying this info back to the user. This would allow a GUI to show the learning curves in real time (or as text output from a CLI):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…orm to default metric in standalone LightGbm (dotnet#3859)" This reverts commit 3a35a82.
…efault metric in standalone LightGbm (dotnet#3859)
…efault metric in standalone LightGbm
Fixes #3822
In ML.NET, the default
EvaluationMetric
for LightGbm is set toEvaluateMetricType.Error
for multiclass,EvaluationMetricType.LogLoss
for binary, and so on. This leads to inconsistent behavior from the user's perspective: If a user specifiedEvaluationMetric = EvaluateMetricType.Default
, the parameter passed to LightGbm would be the empty string "", which is the LightGbm default and means that the metric is selected based on the objective. However, if they do not specifyEvaluationMetric
at all, the parameter passed to LightGbm would be Error for multiclass, LogLoss for binary, and so on.We should make the experience for LightGbm in ML.NET consistent with what a user of standalone LightGbm experiences, and not expect them to dig through LightGbm docs and ML.NET docs to find this out.
This PR makes the user experience consistent with standalone LightGbm by by changing the default
EvaluationMetric
in ML.NET toEvaluationMetricType.Default
.LightGbm metric parameters docs