AutoML API model incompatible with AutoML CLI? #694

Closed
famschopman opened this issue Jul 20, 2019 · 11 comments · Fixed by dotnet/machinelearning#5227
Labels: Mark for close (Issues pending final confirmation before closing); Reported by: Customer

Comments

@famschopman

When I train and save a model with the ModelBuilder from Visual Studio and inspect the model that is loaded for prediction, I get a TrainSchema with all columns, and the last column is the Features column.

When I train and save a model with the AutoML API, I get an exception because it tries to find a SamplingKeyColumn. On inspection, AutoML has saved the model with an additional column just before the Features column. This differs from the ModelBuilder behavior.

So I have to extend my model with the following lines to make sure I can run a prediction (I have 34 columns in my dataset).

        [ColumnName("SamplingKeyColumn"), LoadColumn(35)]
        public float SamplingKeyColumn { get; set; }

Is this expected behavior? It feels very inconsistent and quite confusing. I couldn't find any documentation about it either.

@justinormont

justinormont commented Jul 20, 2019

Sounds like an issue w/ our CodeGen in the CLI. If a SamplingKeyColumn is provided to the API, we should be creating one in the output data class.

/cc @srsaggam, @JakeRadMSFT

@famschopman
Author

famschopman commented Jul 21, 2019

Ok, good to hear. For the time being I will have to use two different model definitions.

1: One to load the IDataView, without a SamplingKeyColumn
2: One to load the model from disk, with a SamplingKeyColumn

If you add the SamplingKeyColumn to the initial model definition, AutoML tries to mitigate that and you end up with yet another one (drumroll), so don't even think about it 🤐

Could not find input column 'temp_SamplingKeyColumn_000'
Parameter name: inputSchema
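
The two-definition workaround described in this comment can be sketched roughly as follows. This is only an illustration: the class names `TrainingInput` and `SavedModelInput` and the `Label`/`Feature1` columns are hypothetical stand-ins for the 34 real dataset columns, and the `LoadColumn(35)` index is copied from the snippet earlier in the thread rather than verified.

```csharp
using Microsoft.ML.Data;

// Hypothetical class used to load the training IDataView.
// It deliberately has no SamplingKeyColumn.
public class TrainingInput
{
    [ColumnName("Label"), LoadColumn(0)]
    public float Label { get; set; }

    [ColumnName("Feature1"), LoadColumn(1)]
    public float Feature1 { get; set; }

    // ... the remaining dataset columns would go here ...
}

// Hypothetical class used only when loading the saved model from disk.
// Same columns as above, plus the extra column the saved schema expects.
public class SavedModelInput
{
    [ColumnName("Label"), LoadColumn(0)]
    public float Label { get; set; }

    [ColumnName("Feature1"), LoadColumn(1)]
    public float Feature1 { get; set; }

    // ... the remaining dataset columns would go here ...

    // Index taken from the snippet earlier in this thread (assumption).
    [ColumnName("SamplingKeyColumn"), LoadColumn(35)]
    public float SamplingKeyColumn { get; set; }
}
```

Keeping the two classes separate avoids feeding a class that already has a SamplingKeyColumn into training, which is the case that produced the `temp_SamplingKeyColumn_000` error quoted above.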

@acrigney

acrigney commented Aug 2, 2019

This really needs to be fixed ASAP please.

gvashishtha transferred this issue from dotnet/machinelearning Apr 21, 2020
@LittleLittleCloud
Contributor

Seems like a bug in AutoML.Net.
@justinormont has this one been fixed or not?

@justinormont

@justinormont has this one been fixed or not?

I don't think it's been fixed.

@JakeRadMSFT
Contributor

@LittleLittleCloud Can you look into this?

@antoniovs1029
Member

antoniovs1029 commented Jun 26, 2020

So I believe this is actually an issue in ML.NET's mlContext.Data.TrainTestSplit() and mlContext.Data.CrossValidationSplit(), as explained here:
dotnet/machinelearning#5256 (comment)

But I'm unsure if the OP of this issue here ( @famschopman ) used these methods? Can you still provide a full repro of your issue? It would be very helpful, @famschopman !! 😄

My suspicion is that when the user says "When I train and save a model with the AutoML API", it means that they used code similar to that in dotnet/machinelearning#5256 (comment), where TrainTestSplit() or CrossValidationSplit() was actually called before calling AutoML. It is also not clear from this issue description exactly where and when the exception was thrown. In the other issue I've linked, it was while creating a PredictionEngine, whereas no PredictionEngine is mentioned here, and it would be interesting to know this to find other possible places where this issue might surface.
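
One way to check this suspicion is to compare the schema of the data before and after the split. The sketch below is an assumption-laden illustration, not a repro of the original report: the `Row` class and its two columns are made up, and whether an extra column such as SamplingKeyColumn shows up will depend on the ML.NET version in use.

```csharp
using System;
using System.Linq;
using Microsoft.ML;

// Hypothetical two-column row type standing in for the real dataset.
public class Row
{
    public float Label { get; set; }
    public float Feature1 { get; set; }
}

public static class SplitSchemaCheck
{
    // Returns the column names of a data view's schema, in order.
    public static string[] ColumnNames(IDataView view) =>
        view.Schema.Select(column => column.Name).ToArray();

    public static void Main()
    {
        var mlContext = new MLContext(seed: 0);

        // Small in-memory dataset so the example needs no files.
        var rows = Enumerable.Range(0, 20)
            .Select(i => new Row { Label = i % 2, Feature1 = i });
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Split without passing a samplingKeyColumnName.
        var split = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);

        // Any column listed for TrainSet/TestSet but not for the original
        // data was introduced by the split.
        Console.WriteLine("Original: " + string.Join(", ", ColumnNames(data)));
        Console.WriteLine("TrainSet: " + string.Join(", ", ColumnNames(split.TrainSet)));
        Console.WriteLine("TestSet:  " + string.Join(", ", ColumnNames(split.TestSet)));
    }
}
```

If the split output lists a column that the original data does not, a model trained on `split.TrainSet` would be expected to carry that column in its input schema as well.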

@beccamc
Contributor

beccamc commented Jun 8, 2022

@LittleLittleCloud Is this fixed?

@famschopman
Author

I am not sure if I still have the code lying around to help you with a repro, but if needed just ping me and I will try to find it. I will be picking up ML.Net again soon. I have to finish some other software first ... 😘

luisquintanilla added the "Mark for close" (Issues pending final confirmation before closing) label Oct 5, 2022
@luisquintanilla
Contributor

luisquintanilla commented Oct 5, 2022

@LittleLittleCloud since this references the old AutoML API and presumably the older version of the ML.NET CLI, I think we're okay to close unless it continues to be an issue.

@LittleLittleCloud
Contributor

Closing since this issue relates to the old AutoML API. Feel free to reopen it if you have questions.

8 participants