Add Ranking AutoML Sample #852

Conversation
@justinormont Feel free to let me know if anything is missing or should be changed with this sample.
ConsoleHelper.PrintRankingMetrics(bestRun.TrainerName, metrics);

// STEP 6: Save/persist the trained model to a .ZIP file
If you're up for it, you could demonstrate thrice training.
The general idea is to progressively refit the best-found pipeline on merged data from all of the datasets, in a three-step process:
step | fitting data | scoring data | notes |
---|---|---|---|
1 | train | valid | Done within AutoML to find the best pipeline |
2 | train+valid | test | Gives final metrics which estimate the model's perf in production |
3 | train+valid+test | N/A | Gives final model to launch to production |
I have an example created for another bug (full example: https://dotnetfiddle.net/nWpCkP):
// Re-fit best pipeline on train and validation data, to produce
// a model that is trained on as much data as is available while
// still having test data for the final estimate of how well the
// model will do in production.
Console.WriteLine("\n===== Refitting on train+valid and evaluating model's rsquared with test data =====");
var TrainPlusValidationDataView = textLoader.Load(new MultiFileSource(TrainDataPath, ValidationDataPath));
var refitModel1 = experimentResult.BestRun.Estimator.Fit(TrainPlusValidationDataView);
IDataView predictionsRefitOnTrainPlusValidation = refitModel1.Transform(TestDataView);
var metricsRefitOnTrainPlusValidation = mlContext.Regression.Evaluate(predictionsRefitOnTrainPlusValidation, labelColumnName: "Label", scoreColumnName: "Score");
Console.WriteLine("|" + $"{"-",-4} {experimentResult.BestRun.TrainerName,-35} {metricsRefitOnTrainPlusValidation?.RSquared ?? double.NaN,8:F4} {metricsRefitOnTrainPlusValidation?.MeanAbsoluteError ?? double.NaN,13:F2} {metricsRefitOnTrainPlusValidation?.MeanSquaredError ?? double.NaN,12:F2} {metricsRefitOnTrainPlusValidation?.RootMeanSquaredError ?? double.NaN,8:F2} {"-",9}".PadRight(112) + "|");
// Re-fit best pipeline on train, validation, and test data, to
// produce a model that is trained on as much data as is available.
// This is the final model that can be deployed to production.
Console.WriteLine("\n===== Refitting on train+valid+test to get the final model to launch to production =====");
var TrainPlusValidationPlusTestDataView = textLoader.Load(new MultiFileSource(TrainDataPath, ValidationDataPath, TestDataPath));
var refitModel2 = experimentResult.BestRun.Estimator.Fit(TrainPlusValidationPlusTestDataView);
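For this PR's ranking sample, step 2 would presumably swap the regression evaluator for the ranking one. A minimal sketch, assuming the same textLoader, path variables, and the sample's Label/GroupId column names:

// Step 2 of thrice training, adapted to ranking (variable names assumed from the sample).
var trainPlusValidationData = textLoader.Load(new MultiFileSource(TrainDataPath, ValidationDataPath));
var refitRankingModel = experimentResult.BestRun.Estimator.Fit(trainPlusValidationData);
IDataView rankingPredictions = refitRankingModel.Transform(TestDataView);
var rankingMetrics = mlContext.Ranking.Evaluate(rankingPredictions, labelColumnName: "Label", rowGroupColumnName: "GroupId");
// NormalizedDiscountedCumulativeGains is indexed from truncation level 1, so [2] is NDCG@3.
Console.WriteLine($"NDCG@3: {rankingMetrics.NormalizedDiscountedCumulativeGains[2]:F4}");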
This is an interesting concept! I'll definitely give this a go.
Thanks for letting me know about this! I think a video on this would be good to make. 😄
This would make a great video. This process is most beneficial when the training set is small, or the dataset is split by time.
Ranking datasets are often split by time, with the older data in the training split, newer data in the validation split, and the newest in the test split. There are two main gains from using time splits: (1) removing leakage -- see "time leakage" in https://en.wikipedia.org/wiki/Leakage_(machine_learning) -- and (2) the most valuable data is the most up-to-date data, since user trends shift over time.
The newest data is the most representative of the live production traffic the model will see, hence it's a better measure of the model's performance once launched in production (which is why we use it for the final metrics), and refitting on it produces a model that performs better on production traffic.
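For illustration, a minimal sketch of a chronological split, assuming the loaded data view (allData, hypothetical here) has a numeric "Timestamp" column, which the sample's dataset does not actually have; the cutoff values are illustrative:

// Hypothetical time-based split: oldest rows train, newer rows validate, newest rows test.
double validCutoff = 20190101; // assumed yyyyMMdd-encoded timestamps
double testCutoff = 20190601;
IDataView trainData = mlContext.Data.FilterRowsByColumn(allData, "Timestamp", upperBound: validCutoff);
IDataView validData = mlContext.Data.FilterRowsByColumn(allData, "Timestamp", lowerBound: validCutoff, upperBound: testCutoff);
IDataView testData = mlContext.Data.FilterRowsByColumn(allData, "Timestamp", lowerBound: testCutoff);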
LGTM; thanks so much for adding a sample.
Two minor/optional items:
- Matching the sort order, so the sample chooses the same best model as the AutoML code (#852 (comment)); see the sketch below
- Checking on the runtime (#852 (review))
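On the first item, a minimal sketch of selecting the top run by NDCG@10 from the completed runs, assuming the RunDetails/ValidationMetrics members of the AutoML ranking experiment result; whether this matches AutoML's internal ordering is exactly what the linked comment asks to verify:

// Pick the best run by NDCG@10, skipping failed runs that have no metrics.
var bestByNdcg = experimentResult.RunDetails
    .Where(r => r.ValidationMetrics != null)
    .OrderByDescending(r => r.ValidationMetrics.NormalizedDiscountedCumulativeGains[9])
    .First();
Console.WriteLine($"Best trainer by NDCG@10: {bestByNdcg.TrainerName}");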
The fix for the issue to add truncation to the AutoML API may be merged soon, so we may be able to wait for that and update this accordingly.
@justinormont The latest version got pushed with the DcgTruncation change, so I updated to use it. Just let me know if I missed anything.
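For reference, a sketch of requesting a deeper truncation level at evaluation time; RankingEvaluatorOptions and its DcgTruncationLevel property are assumed here to be the core ML.NET counterpart of the DcgTruncation property mentioned above:

// Evaluate NDCG/DCG up to position 10 instead of the default truncation level.
var evaluatorOptions = new RankingEvaluatorOptions { DcgTruncationLevel = 10 };
var truncatedMetrics = mlContext.Ranking.Evaluate(predictions, evaluatorOptions, labelColumnName: "Label", rowGroupColumnName: "GroupId");
for (int i = 0; i < truncatedMetrics.NormalizedDiscountedCumulativeGains.Count; i++)
    Console.WriteLine($"NDCG@{i + 1}: {truncatedMetrics.NormalizedDiscountedCumulativeGains[i]:F4}");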
LGTM.
Minor items: #852 (comment) & #852 (comment).
// The scores are used to determine the ranking, where a higher score indicates a higher ranking versus another candidate result.
foreach (var prediction in firstGroupPredictions)
{
    Console.WriteLine($"GroupId: {prediction.GroupId}, Score: {prediction.Score}");
}
This prints:
=============== Testing prediction engine ===============
#########################################################
=============== Loaded Model OK ===============
GroupId: 94629043, Score: 10.5748415
GroupId: 94629043, Score: 8.747048
GroupId: 94629043, Score: 8.339484
GroupId: 94629043, Score: 8.295864
GroupId: 94629043, Score: 7.7126718
GroupId: 94629043, Score: 7.094361
GroupId: 94629043, Score: 6.403935
GroupId: 94629043, Score: 5.6126056
GroupId: 94629043, Score: 5.4343157
GroupId: 94629043, Score: 5.341767
GroupId: 94629043, Score: 3.4955919
GroupId: 94629043, Score: 3.0068336
GroupId: 94629043, Score: 2.925596
...
Is there more information we can print? Currently the user has no way to compare the ranked results against the original input data.
Could we print the correct/input Label in a column? The user could then verify the input Label value is decreasing in the list.
Is there a way to do a late merge, without adding `Label` to the `RankingPrediction` class?
Do you think if we add the `Label` to the output class, some users will misconstrue it as the predicted label (`Score` is the actual prediction)? Seeing no mispredictions, users could assume their model is perfect, and launch it to production only to find the `Label` field is then empty.
Even without the added label output, this PR is very good. I'm OK merging as-is, or with adding the correct/input label.
My only thought is to do something like this:

var predictionsPreview = _predictions.Preview();
for (int i = 0; i < firstGroupPredictions.Count; i++)
{
    // Pull the original Label value for this row out of the preview.
    var labelValue = predictionsPreview.RowView[i].Values.First(kv => kv.Key == "Label").Value;
    Console.WriteLine($"GroupId: {firstGroupPredictions[i].GroupId}, Score: {firstGroupPredictions[i].Score}, Label: {labelValue}");
}

Use the predictions from the test set and get the label from the `Preview` of it. I'm not sure if it's the best thing to do or not, though. If this isn't good, perhaps we can get this in and do another update once we find a better solution.
How does this look:
// Label values from the test dataset (not the predicted scores/labels)
IEnumerator<float> labelEnumerator = mlContext.Data.CreateEnumerable<RankingData>(testDataView, true).Select(a => a.Label).GetEnumerator();
foreach (var prediction in firstGroupPredictions)
{
labelEnumerator.MoveNext();
Console.WriteLine($"GroupId: {prediction.GroupId}, Score: {prediction.Score}, Correct Label: {labelEnumerator.Current}");
}
It would require passing in `testDataView`. Technically, the same enumerator can be made from `firstGroupPredictions`, though reading from the test dataset reinforces that the printed label is not part of the prediction, but instead a given label.
That works great! Pushed the latest changes with this in it. Thanks!
LGTM. Good to merge?
I think so! Thank you for helping me with this sample!
@jwood803: I checked in various fixes. Most fixes were addressing my own review feedback. I also modified the thrice training to use the correct scoring dataset, and to return the refit model. The only remaining feedback is #852 (review).
* Initial add of project
* Update ranking sample
* Get sample working
* Updates based on feedback
* Add refitting on validation and test data sets
* Update console headers
* Iteration print improvements
* Correct validationData
* Printing NDCG@1,3,10 & DCG@10
* Printing NDCG@1,3,10 & DCG@10
* Add readme
* Update based on feedback
* Use new DcgTruncation property
* Update to latest AutoML package
* Review feedback
* Wording for 1st refit step
* Update to include original label in output

Co-authored-by: Justin Ormont <[email protected]>
Add Ranking sample for AutoML.
Update for this issue