Closed
Description
System information
- OS version/distro: .NET Framework 4.6.2, Windows 10
- .NET Version (eg., dotnet --info): ML.NET NuGet package 0.10.1
Issue
- What did you do? The input data has a double column containing double.MinValue, double.MaxValue, and 100 random numbers from 0 to 10000. I call:
var normalizeColumns = numericalFeatures.Select(
f => new NormalizingEstimator.MeanVarColumn(f.Name, fixZero: false, useCdf: false)).ToArray();
var normalizingEstimator = this.context.Transforms.Normalize(normalizeColumns);
In transformedData.Preview, I see that this feature is NaN in every row. I use the SDCA trainer.
- What happened?
pipeline.Fit(transformedData) fails and throws an exception saying "train with 0 instances".
- What did you expect?
- Shouldn't ML.NET handle double.MinValue and double.MaxValue in NormalizingEstimator?
- ML.NET should throw a more meaningful exception. "train with 0 instances" for a feature that is all NaN is misleading: I do have 100+ rows for this feature.
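To illustrate how a mean/variance normalizer can turn a whole column into NaN, here is a minimal, self-contained sketch. It is not ML.NET's implementation: the class NaNNormalizationSketch and its Normalize and IsFinite helpers are hypothetical, and the streaming (Welford-style) update is only one assumed way of computing the statistics. It shows that double.MaxValue - double.MinValue overflows to +Infinity, that a later Infinity + (-Infinity) produces NaN, and that the NaN mean then propagates to every row. If a trainer skips rows with non-finite features, that would leave it with zero usable instances, which may be what the "train with 0 instances" message is reporting.

```csharp
using System;
using System.Linq;

public static class NaNNormalizationSketch
{
    // Streaming (Welford-style) mean / standard-deviation normalization.
    // Illustrative assumption only: this is NOT taken from the ML.NET source.
    public static double[] Normalize(double[] values)
    {
        long n = 0;
        double mean = 0.0, m2 = 0.0;
        foreach (double x in values)
        {
            n++;
            double delta = x - mean;   // MaxValue - MinValue overflows to +Infinity
            mean += delta / n;         // a later Infinity + (-Infinity) yields NaN
            m2 += delta * (x - mean);  // accumulates the sum of squared deviations
        }
        double std = Math.Sqrt(m2 / n);
        // Once mean or std is NaN, every normalized value is NaN as well.
        return values.Select(x => (x - mean) / std).ToArray();
    }

    // True when the value is neither NaN nor +/-Infinity.
    // Written by hand because double.IsFinite may be unavailable on older frameworks.
    public static bool IsFinite(double x) => !double.IsNaN(x) && !double.IsInfinity(x);

    public static void Main()
    {
        // Same shape as the reported data: the two extremes plus ordinary values.
        double[] column = { double.MinValue, double.MaxValue, 12.0, 345.0, 6789.0 };
        double[] normalized = Normalize(column);
        int usable = normalized.Count(IsFinite);
        Console.WriteLine($"usable rows after normalization: {usable} of {column.Length}");
    }
}
```

Counting finite values with a check like IsFinite on the normalized column before calling Fit would at least surface the real problem earlier than the trainer's exception does.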