Closed
Labels
need info: This issue needs more info before triage
Description
System information
- OS version/distro: .NET Framework 4.6.2, Windows 10
- .NET Version (eg., dotnet --info): ML.NET NuGet package 0.10.1
Issue
- What did you do? The input data has a double column containing (double.min, double.max, <100 random numbers from 0 to 10000>). I call:
var normalizeColumns = numericalFeatures
    .Select(f => new NormalizingEstimator.MeanVarColumn(f.Name, fixZero: false, useCdf: false))
    .ToArray();
var normalizingEstimator = this.context.Transforms.Normalize(normalizeColumns);
In transformedData.Preview, this feature is NaN for every row. I use the SDCA trainer.
- What happened?
pipeline.Fit(transformedData) fails and throws an exception saying "train with 0 instances".
- What did you expect?
- Should ML.NET handle double.min and double.max in NormalizingEstimator?
- ML.NET should throw a more meaningful exception. "train with 0 instances" is misleading when a feature is all NaN: I do have 100+ rows for this feature.
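For anyone trying to reproduce this without ML.NET, here is a minimal sketch of one way an all-NaN column could arise. It assumes that double.min and double.max above mean double.MinValue and double.MaxValue, and that the statistics are accumulated with a running (Welford-style) mean and variance. Both are assumptions on my part. WelfordSketch and its methods are hypothetical names, not ML.NET APIs, and I have not confirmed that ML.NET's mean-variance normalizer works this way.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; this is NOT ML.NET code.
public static class WelfordSketch
{
    // Running mean and population variance using Welford's update.
    public static void MeanAndVariance(IEnumerable<double> values,
                                       out double mean, out double variance)
    {
        long n = 0;
        double m = 0.0;   // running mean
        double m2 = 0.0;  // running sum of squared deviations

        foreach (double x in values)
        {
            n++;
            // For x = double.MaxValue and m = double.MinValue this difference
            // overflows to +Infinity (double arithmetic never throws).
            double delta = x - m;
            m += delta / n;
            // Once m is infinite, later updates compute Infinity - Infinity,
            // which is NaN, and NaN propagates through every later step.
            m2 += delta * (x - m);
        }

        mean = n == 0 ? double.NaN : m;
        variance = n == 0 ? double.NaN : m2 / n;
    }

    // Mean-variance scaling without zero fixing: (x - mean) / stddev.
    public static double Normalize(double x, double mean, double variance)
    {
        return (x - mean) / Math.Sqrt(variance);
    }
}
```

Under IEEE 754 arithmetic, passing double.MinValue, double.MaxValue, 1.0, 2.0 to MeanAndVariance yields NaN for both the mean and the variance, so Normalize returns NaN for every row, including the ordinary values. If the trainer then skips rows with NaN features, that would be consistent with the "train with 0 instances" message, although that link is a guess.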
Source code / logs