The input string was not formatted correctly #5320

Closed
vffuunnyy opened this issue Jul 22, 2020 · 8 comments · Fixed by #5163
Labels: bug (Something isn't working), P2 (Priority of the issue for triage purpose: Needs to be fixed at some point.)

Comments

@vffuunnyy

Model Builder Error

Visual Studio 2019, latest version
ML.NET, latest version
Windows 8.1 x64
Input data: https://yadi.sk/d/dOH4eg9ojMV4gA
Training stops after this log line:
|63 OlsRegression 0,0000 5,55 428,48 20,70 55,0 63 |

Stack trace:

   at System.Number.ParseSingle(String value, NumberStyles options, NumberFormatInfo numfmt)
   at Microsoft.ML.AutoML.SweeperProbabilityUtils.ParameterSetAsFloatArray(IValueGenerator[] sweepParams, ParameterSet ps, Boolean expandCategoricals)
   at Microsoft.ML.AutoML.SmacSweeper.FitModel(IEnumerable`1 previousRuns)
   at Microsoft.ML.AutoML.SmacSweeper.ProposeSweeps(Int32 maxSweeps, IEnumerable`1 previousRuns)
   at Microsoft.ML.AutoML.PipelineSuggester.SampleHyperparameters(MLContext context, SuggestedTrainer trainer, IEnumerable`1 history, Boolean isMaximizingMetric)
   at Microsoft.ML.AutoML.PipelineSuggester.GetNextInferredPipeline(MLContext context, IEnumerable`1 history, DatasetColumnInfo[] columns, TaskKind task, Boolean isMaximizingMetric, CacheBeforeTrainer cacheBeforeTrainer, IEnumerable`1 trainerWhitelist)
   at Microsoft.ML.AutoML.Experiment`2.Execute()
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(ColumnInformation columnInfo, DatasetColumnInfo[] columns, IEstimator`1 preFeaturizer, IProgress`1 progressHandler, IRunner`1 runner)
   at Microsoft.ML.AutoML.ExperimentBase`2.Execute(IDataView trainData, ColumnInformation columnInformation, IEstimator`1 preFeaturizer, IProgress`1 progressHandler)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<>c__DisplayClass21_0.<ExecuteAsync>b__5() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 81
   at System.Threading.Tasks.Task`1.InnerInvoke()
   at System.Threading.Tasks.Task.Execute()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLService.Experiments.AutoMLExperiment`3.<ExecuteAsync>d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/Experiments/AutoMLExperiment.cs:line 108
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.ML.ModelBuilder.AutoMLEngine.<StartTrainingAsync>d__30.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 147
mstfbl self-assigned this on Jul 22, 2020
mstfbl added the "Awaiting User Input" and "need info" labels on Jul 22, 2020
@mstfbl
Contributor

mstfbl commented Jul 22, 2020

Hi @vffuunnyy, I've looked at the input data you linked. It is just a list of doubles/floats separated by semicolons (sample data below):

1;2;3;4;5;6;7;8;9;10;11;12;13;14;15;16;17;18;19;20;21;22;23;24;25;26;27;28;29;30;31;32;33;34;35;36;37;38;39;40;41;42;43;44;45;46;47;48;49;50;51;52;53;54;55;56;57;58;59;60;61;62;63;64;65;66;67;68;69;70;next
91.71;1.05;3.45;5.44;1.87;264.9;1.21;1.02;1.48;5.57;1.81;2.71;1.24;2.11;1.07;1.31;3.44;1.37;43.02;3.21;10.33;2.28;6.63;2.14;1.55;1.36;1.06;7.55;2.02;1.5;3.82;1.02;2.59;1.07;2.04;1.75;1.8;2.24;1.16;1.03;8.32;5.94;4.75;1.04;3.49;24.42;1;1.2;1.91;99;1.94;2.5;7.9;1.07;1.36;1.83;3.91;2.45;1.6;2.31;3.34;3.03;7.52;1.22;2.48;1.13;1.59;1.2;2.2;2.58;4.75

Can you please share your reproduction of this issue? You can zip your solution and data, and attach it in your comment in this issue. This way, we'll be able to reproduce the issue on our end and investigate. Thanks!

@vffuunnyy
Author

I'm only using ML.NET Model Builder, so there is no code 🤔
The error is thrown after roughly 900 seconds of training.
Model Builder output: https://pastebin.com/AKHsDaV6

@justinormont
Contributor

My assumption is an internationalization issue.

This should have been fixed in https://github.com/dotnet/machinelearning/pull/4635/files, though perhaps there are additional .ToString() calls used for sweeping parameters which need to be passed CultureInfo.InvariantCulture.

We may want to add a CI leg which tests in Russian (or another culture with the 0,123 number format).

@vffuunnyy: To mitigate, you could set your computer to a culture using the 0.123 number format.
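
To illustrate the suspected failure mode, here is a minimal sketch using only the .NET base class library. It is not the ML.NET sweeper code: `CultureRoundTrip` and `TryRoundTrip` are hypothetical names, and the sketch assumes ru-RU culture data is available on the machine (i.e. the process is not running in invariant-globalization mode). It formats a float with one culture and parses the resulting text with another.

```csharp
using System;
using System.Globalization;

public static class CultureRoundTrip
{
    // Hypothetical helper (not part of ML.NET): format a value with one
    // culture, then try to parse the resulting text with another culture.
    public static bool TryRoundTrip(
        float value, CultureInfo formatCulture, CultureInfo parseCulture, out float parsed)
    {
        string text = value.ToString(formatCulture);
        // NumberStyles.Float allows a decimal point but no group separators,
        // so a mismatched separator is rejected instead of silently ignored.
        return float.TryParse(text, NumberStyles.Float, parseCulture, out parsed);
    }

    public static void Main()
    {
        var ru = new CultureInfo("ru-RU");       // uses ',' as the decimal separator
        var inv = CultureInfo.InvariantCulture;  // uses '.' as the decimal separator

        // Mismatched cultures: "0.123" is not a valid number under ru-RU,
        // so this round trip is expected to fail.
        Console.WriteLine(TryRoundTrip(0.123f, inv, ru, out _));

        // Matching cultures on both sides: the round trip is expected to succeed.
        Console.WriteLine(TryRoundTrip(0.123f, inv, inv, out _));
    }
}
```

With `float.Parse` instead of `float.TryParse`, the mismatched case would throw a `FormatException`, which is consistent with the `System.Number.ParseSingle` frame at the top of the reported stack trace. Which direction the mismatch actually runs inside the sweeper is not established in this thread.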

@vffuunnyy
Author

So, what do I need to do with https://github.com/dotnet/machinelearning/pull/4635/files? 🤯

@justinormont
Contributor

justinormont commented Jul 23, 2020

@vffuunnyy: I was mentioning that it looks like a bug in ML.NET in how it converts numbers to strings in our sweeping code. We have seen a similar bug before, which #4635 was meant to fix. You're welcome to file a PR to contribute a fix to ML.NET.

Before a fix is in, you can try setting your machine's locale to perhaps en-US (or another locale with periods as the separator) -- https://www.isunshare.com/windows-10/change-system-locale-in-windows-10.html


Related:

/cc @LittleLittleCloud, @JakeRadMSFT

mstfbl removed the "Awaiting User Input" label on Jul 24, 2020
@vffuunnyy
Author

I tried that, but it didn't fix anything. I get the same error.

@mstfbl
Contributor

mstfbl commented Jul 25, 2020

@vffuunnyy Can you please provide us with the actual dataset with which you're getting this error? The one you've linked above is not an actual dataset from what I could tell.
@justinormont Could this be an issue specific to the Model Builder? If so, we should transfer the issue to the appropriate repository.

mstfbl added the "Awaiting User Input" label on Jul 25, 2020
@justinormont
Contributor

@mstfbl: No need to transfer. The error is in how the AutoML.NET code uses the hyperparameter values.

I made a short repro: https://dotnetfiddle.net/nWpCkP

Setting CultureInfo.DefaultThreadCurrentCulture = new CultureInfo("ru-RU") causes a failure in the sweeper around iteration 44. The exact error message depends on the type of the hyperparameter.

Setting CultureInfo.DefaultThreadCurrentCulture = new CultureInfo("en-US") works correctly (though .NET Fiddle can time out at 10 sec; if so, re-run).

The DefaultThreadCurrentCulture approach can also be used to write a unit test for internationalization, though adding a full CI leg, with Windows/Linux/macOS set to ru-RU, would test all components in ML.NET.
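
As a rough sketch of what such a culture-switching test could look like, the snippet below runs a piece of code under a given culture and restores the previous settings afterwards. It does not call into ML.NET or reproduce the linked fiddle: `CultureScope` and `RunWithCulture` are hypothetical names, the number round trip stands in for the real sweeper call, and ru-RU culture data is assumed to be available. It sets `CultureInfo.CurrentCulture` in addition to `DefaultThreadCurrentCulture` so the calling thread is covered as well.

```csharp
using System;
using System.Globalization;

public static class CultureScope
{
    // Hypothetical test helper (not part of ML.NET): run `body` with the given
    // culture as both the current and the default thread culture, then restore.
    public static T RunWithCulture<T>(string cultureName, Func<T> body)
    {
        CultureInfo previousDefault = CultureInfo.DefaultThreadCurrentCulture;
        CultureInfo previousCurrent = CultureInfo.CurrentCulture;
        var culture = new CultureInfo(cultureName);
        try
        {
            CultureInfo.DefaultThreadCurrentCulture = culture;
            CultureInfo.CurrentCulture = culture;
            return body();
        }
        finally
        {
            CultureInfo.DefaultThreadCurrentCulture = previousDefault;
            CultureInfo.CurrentCulture = previousCurrent;
        }
    }

    // Culture-sensitive: ToString() with no arguments uses the current culture,
    // while the parse side expects the invariant format.
    public static bool NaiveRoundTrip()
    {
        string text = (0.5f).ToString();
        return float.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out _);
    }

    // Culture-safe: both sides use CultureInfo.InvariantCulture explicitly.
    public static bool InvariantRoundTrip()
    {
        string text = (0.5f).ToString(CultureInfo.InvariantCulture);
        return float.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out _);
    }

    public static void Main()
    {
        // Under ru-RU the naive variant is expected to fail and the
        // invariant variant is expected to succeed.
        Console.WriteLine(RunWithCulture("ru-RU", NaiveRoundTrip));
        Console.WriteLine(RunWithCulture("ru-RU", InvariantRoundTrip));
    }
}
```

In a real test, the body passed to `RunWithCulture` would be the experiment or sweeper call under test, with the assertion being that it completes without a `FormatException`.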

--

As a side bug, LightGBM fails on .NET Fiddle: System.DllNotFoundException: Unable to load shared library 'lib_lightgbm' or one of its dependencies.

mstfbl added the "bug" and "P2" labels and removed the "Awaiting User Input" and "need info" labels on Sep 8, 2020
Lynx1820 self-assigned this on Oct 23, 2020
Lynx1820 removed their assignment on Jan 15, 2021
ghost locked the conversation as resolved and limited it to collaborators on Mar 18, 2022