AutoML Regression Experiment fails after 67 iterations #4906
Comments
Hi @francescomazzurco, please send along a .csv example with which we can reproduce this issue.
Hi @mstfbl, I'm now creating a small working example along with the .csv, but I am encountering difficulties in reproducing the issue. I'll dig into it and give you updates by the end of the day.
Ok, I found the problem. I could reproduce the exception on only one of our computers, so I finally realised that the issue is related to culture, even when data is loaded from memory and there is no parsing. In the project I attached, data is parsed and loaded using the invariant culture. Then, a non-English culture is set just before running the experiment.

var mlContext = new MLContext();
List<Model> models = ReadCsv(@"data\data.csv");
var dataView = BuildDataView(mlContext, models);
var experimentSettings = new RegressionExperimentSettings
{
    MaxExperimentTimeInSeconds = 600,
    CacheDirectory = new DirectoryInfo(@".\cache"),
};
var experiment = mlContext.Auto().CreateRegressionExperiment(experimentSettings);
// Data has already been parsed using invariant culture
CultureInfo.DefaultThreadCurrentCulture = CultureInfo.CreateSpecificCulture("it-IT");
var bestRun = experiment.Execute(dataView).BestRun;

The exception is thrown after the 67th iteration. I've seen other issues related to culture; I'm not sure whether they report the same problem, but if so feel free to close this issue. Thanks
@francescomazzurco: This should be fixed in the next release (v1.5.0-preview2). A fix was added in January to use the invariant culture when sweeping parameter values (#4635). You can test against the nightly NuGet feed by adding
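To illustrate the class of bug being discussed (a number formatted under one culture and read back under another), here is a minimal sketch using only the .NET base class library. It is not the actual ML.NET sweeper code; the class name `CultureRoundTripDemo`, the helper `Format`, and the value `0.5` are hypothetical and chosen only for illustration. It also assumes the runtime has culture data for it-IT available (i.e. it is not running in globalization-invariant mode).

```csharp
using System;
using System.Globalization;

public static class CultureRoundTripDemo
{
    // Formats a value with an explicit culture. Code that calls ToString()
    // without a culture implicitly uses the current thread culture instead.
    public static string Format(double value, CultureInfo culture) =>
        value.ToString("G", culture);

    // Tries to read the text back as a double using the invariant culture.
    public static bool TryReadInvariant(string text, out double value) =>
        double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);

    public static void Main()
    {
        var italian = CultureInfo.GetCultureInfo("it-IT");
        double sweptValue = 0.5; // hypothetical parameter value

        // it-IT uses a decimal comma, the invariant culture a decimal point.
        string asItalian = Format(sweptValue, italian);
        string asInvariant = Format(sweptValue, CultureInfo.InvariantCulture);
        Console.WriteLine($"it-IT: {asItalian}, invariant: {asInvariant}");

        // The Italian text does not round-trip through an invariant parse,
        // while the invariant text does.
        Console.WriteLine($"it-IT text parsed: {TryReadInvariant(asItalian, out _)}");
        Console.WriteLine($"invariant text parsed: {TryReadInvariant(asInvariant, out _)}");
    }
}
```

Passing `CultureInfo.InvariantCulture` explicitly on both the formatting and the parsing side is the usual way to make such a round trip independent of the machine's regional settings.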
Hi @justinormont, thanks for your reply. I published the working example here: https://github.com/francescomazzurco/TestML
@LittleLittleCloud: Do you have time to investigate?
I will take a look
Hi, I am Diego S., from Italy.
Quick update: I just tested against v.0.17.1 and the bug is still there. Same behavior: the 68th iteration hangs forever and never completes.
@francescomazzurco: I believe this is fixed now. It will be available in the next release. Or you can run against the nightly build, as outlined above.
@justinormont I just tested against v.0.17.3-29420-1 from October 20th, but the bug is still there. I see there are newer builds, but I am not able to install them as NuGet cannot find package MlNetMklDepsCode
@francescomazzurco: You'll need a nightly build or release after 2020-10-30 as the fix went in then. @harishsk: Any guess why the nightly won't install for @francescomazzurco?
@justinormont @francescomazzurco As part of moving into arcade, we've published some nugets that have a bug, where it requires the Also, there had been some problems with publishing nugets from master (which are the ones required by @francescomazzurco), and so I believe there hasn't been any nuget published correctly from master since October 20th. So I don't think there's any public nuget including the change made on October 30 that Justin is referring to. This problem was on the Azure DevOps side, and should be fixed now. So I'll run a manual build to publish nugets from the master branch, and hopefully it will work. I'll update this thread with info about that. Thanks.
There are some problems with our nuget publishing pipeline. Working on that now, I'll update this thread once the nuget is published.
The nugets have just been published to the public feed.
I was able to successfully install the most recent build from today (0.17.3-29602-5), which indeed solves the bug. Feel free to close the issue. Thanks for the support
Hi,
When running a Regression Experiment, AutoML systematically fails after 67 iterations, raising the exception "All instances skipped due to missing features". By looking at other issues, I got the idea that the SmacSweeper could be the cause. This is also suggested by the stack trace:
However, unlike in the other issues, I'm running a console application and loading data from a database with no missing values, and I hopefully have the right NuGet dependencies:
I understand that the problem might be caused by some of the third-party libraries ML depends on, but isn't it at least possible to ignore the exception thrown by a single trainer without compromising the whole regression experiment? I would like to be able to access the BestRun object and choose the best out of the first 67 experiments without having to look back at the CacheDirectory. If necessary, I can generate a csv with all the data used for training.
Thanks
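One possible way to keep the best of the runs completed so far, sketched under assumptions: the helper below is hypothetical (it is not part of ML.NET) and depends only on the standard `IProgress<T>` interface. It remembers the highest-scoring item reported to it, so the result is still available if a later step throws. The name `BestSoFar` and the scoring delegate are illustrative choices.

```csharp
using System;

// Hypothetical helper, not an ML.NET type: keeps the best item reported
// through IProgress<T>, so it survives a failure in a later iteration.
public sealed class BestSoFar<TRun> : IProgress<TRun>
{
    private readonly Func<TRun, double> _score;
    private readonly object _gate = new object();
    private double _bestScore = double.NegativeInfinity;

    public BestSoFar(Func<TRun, double> score) =>
        _score = score ?? throw new ArgumentNullException(nameof(score));

    // Highest-scoring run seen so far (default value if none qualified).
    public TRun Best { get; private set; }

    // Number of runs reported, including ones with no usable score.
    public int Reported { get; private set; }

    public void Report(TRun run)
    {
        double score = _score(run);
        lock (_gate)
        {
            Reported++;
            // NaN marks a run without usable metrics; it never becomes Best.
            if (!double.IsNaN(score) && score > _bestScore)
            {
                _bestScore = score;
                Best = run;
            }
        }
    }
}
```

With AutoML, an instance such as `new BestSoFar<RunDetail<RegressionMetrics>>(r => r.ValidationMetrics?.RSquared ?? double.NaN)` could be passed as the `progressHandler` argument of `experiment.Execute`, with the call wrapped in a `try`/`catch`. Whether every completed run is reported before the failure described here is an assumption, not something confirmed in this thread.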