-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Issue 3234: use model schema type instead of class definition schema #5228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -217,6 +217,7 @@ public class ModelInput | |||
public string[] CategoricalFeatures; | |||
public float[] NumericalFeatures; | |||
#pragma warning restore SA1401 | |||
public float Label; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Label [](start = 25, length = 5)
add extra column Label as the model modelwithoptionalcolumntransform's schema contains an extra Label column
Codecov Report
@@ Coverage Diff @@
## master #5228 +/- ##
==========================================
- Coverage 73.47% 73.45% -0.02%
==========================================
Files 1010 1010
Lines 187988 187984 -4
Branches 20262 20261 -1
==========================================
- Hits 138118 138092 -26
- Misses 44385 44404 +19
- Partials 5485 5488 +3
|
|
||
//Always use column type from model as this type can be more specific. | ||
//This can be corner case: | ||
//For example, we can load an model whose schema contains Vector<Single, 38> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be a check or at least logging for this corner case? Maybe we should put in a check for arrays of size 0, which is what float[]
would be. Also is there a check for the size of provided arrays matching the size of the corresponding Vector
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test LoadModelWithOptionalColumnTransform is to test this case.
for length check, it is applied at below:
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.Data/Scorers/SchemaBindablePredictorWrapper.cs#L129-L147
In reply to: 439585282 [](ancestors = 439585282)
@@ -230,15 +231,17 @@ public class ModelOutput | |||
[Fact] | |||
public void LoadModelWithOptionalColumnTransform() | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fix invalidate the old test?
It appears that the the previous test was testing this using SchemaDefinition and to reproduce this issue you are using the one DataViewSchema.
Does it make sense to test both code paths not replace the previous one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the old test is still valid. I'm ok with test both code path not just replace as I don't find a test is test the previous case directly.
This test was originally created at below PR:
#3700
From the conversation you can see the original intention was to use the modified way to write test(load model with schema output then use the schema to create prediction engine), but with the type bug this test was written in the way with extra schema definition.
I will made the change to test both ways of create prediction engine, thanks
In reply to: 440394844 [](ancestors = 440394844)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
address issue: #3234