Are ML .Net models deterministic? #5497

Closed
rebecca-burwei opened this issue Nov 18, 2020 · 3 comments
Labels
question Further information is requested

Comments

@rebecca-burwei

Some models are inherently stochastic; others are deterministic. Are ML.NET models deterministic? In other words, given the same input, will an ML.NET model always return the same output/prediction? If so, to how many decimal places is the prediction deterministic?

@frank-dong-ms-zz
Contributor

@rebecca-burwei Yes, I believe ML.NET models are deterministic if you use them properly; we have a bunch of tests verifying various model outputs. As for the decimal places, I believe that depends on the input data and on model settings such as how many rounds of training are performed, since floating-point error accumulates over the calculation.
@justinormont cc Justin, in case he has more insight on this question.
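
To illustrate the point about floating-point error accumulating, here is a minimal, language-neutral sketch in Python (not ML.NET code). The helper `accumulate` is a hypothetical name introduced only for this example. It shows that summing the same numbers in a different order can change the last digits of the result, while repeating the same order reproduces the result exactly. That this is the mechanism behind any particular ML.NET difference is an assumption, not something the thread confirms.

```python
import math

def accumulate(values):
    """Sum floats strictly left to right, as a simple sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

terms = [0.1, 0.2, 0.3]

# Same numbers, two different accumulation orders.
forward = accumulate(terms)
backward = accumulate(list(reversed(terms)))

# The two orders disagree in the last bits of the result...
print(forward == backward)
# ...but agree to many decimal places.
print(math.isclose(forward, backward, rel_tol=1e-12))
# Repeating the exact same order is bit-for-bit reproducible.
print(accumulate(terms) == forward)
```

The practical takeaway is to compare predictions with a tolerance rather than with exact equality whenever the order of arithmetic is not guaranteed to be the same between runs.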

@justinormont
Contributor

Model prediction

If you have a trained model, it is almost always deterministic. One example where it's not is if it includes the CountTargetEncoder. See: #4514 (review)

Model training

Some parts of ML.NET can be deterministic. In general practice I wouldn't assume model training is fully deterministic.

Setting a seed in the MLContext and disabling multi-threading gets you close.
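
As a rough illustration of why a fixed seed helps, here is a small Python sketch (again not ML.NET code; `shuffled_indices` is a hypothetical helper). It assumes a training step that shuffles row indices with a pseudo-random generator: when the generator is built from the same seed, the shuffle order is the same on every run.

```python
import random

def shuffled_indices(n, seed):
    """Return the indices 0..n-1 in an order decided by a seeded generator."""
    rng = random.Random(seed)  # private generator, independent of global state
    indices = list(range(n))
    rng.shuffle(indices)
    return indices

# Two runs with the same seed visit the rows in the same order.
first = shuffled_indices(10, seed=42)
second = shuffled_indices(10, seed=42)
print(first == second)
```

A seed only pins down the random choices. If several threads combine their partial results in a timing-dependent order, the arithmetic can still differ between runs, which is consistent with the advice above to also disable multi-threading.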

Many components also have their own seed values to set. There's a bit of a usability bug in ML.NET: non-hashing seeds should fall back to the global seed, but the code to do so wasn't added in all components. See: #4752 (comment)

If you're using the AutoML APIs, there's a bit of a butterfly effect in model sweeping, because small model differences get amplified. See: #4986 (comment)

@justinormont justinormont added the question Further information is requested label Nov 19, 2020
@frank-dong-ms-zz
Contributor

Closing this issue as an answer has already been provided. @rebecca-burwei, feel free to reopen if you have any follow-up questions. Thanks.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 17, 2022