OnlineGradientDescent crash #4363

Closed
daholste opened this issue Oct 22, 2019 · 4 comments
Labels
P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.

Comments

@daholste
Contributor

daholste commented Oct 22, 2019

System information

  • OS version/distro: Windows 10
  • .NET Version (eg., dotnet --info): .NET Core 2.2

Issue

  • What did you do?
    I ran the script:
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

var mlContext = new MLContext();
var textLoader = mlContext.Data.CreateTextLoader(new TextLoader.Options()
{
	Columns = new TextLoader.Column[]
	{
		new TextLoader.Column("Label", DataKind.Single, 0),
		new TextLoader.Column("0", DataKind.String, 1),
		new TextLoader.Column("1", DataKind.String, 2),
		new TextLoader.Column("2", DataKind.Single, 3),
		new TextLoader.Column("3", DataKind.Single, 4),
		new TextLoader.Column("4", DataKind.Single, 5),
		new TextLoader.Column("5", DataKind.Single, 5),
	},
	HasHeader = true,
	Separators = new char[] {','}
});
var dataView = textLoader.Load(@"dataset.csv");
var pipeline = mlContext.Transforms.Categorical.OneHotHashEncoding("0")
	.Append(mlContext.Transforms.Categorical.OneHotEncoding("1", "3"))
	.Append(mlContext.Transforms.Concatenate("Features", "0", "1", "2", "3", "4", "5"))
	.Append(mlContext.Transforms.NormalizeMinMax("Features"))
	.Append(mlContext.Regression.Trainers.OnlineGradientDescent(new OnlineGradientDescentTrainer.Options()
	{
		LearningRate = 1.0f,
		DecreaseLearningRate = false,
	}));
pipeline.Fit(dataView);

(I can provide the data on request.)

  • What happened?
    I get the exception:
System.InvalidOperationException
  HResult=0x80131509
  Message=The weights/bias contain invalid values (NaN or Infinite). Potential causes: high learning rates, no normalization, high initial weights, etc.
  Source=Microsoft.ML.StandardTrainers
  StackTrace:
   at Microsoft.ML.Trainers.OnlineLinearTrainer`2.TrainModelCore(TrainContext context)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.TrainTransformer(IDataView trainSet, IDataView validationSet, IPredictor initPredictor)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
   at OnlineGradientDescentCrash.Program.Main(String[] args) in C:\Users\daholste\source\repos\OnlineGradientDescentCrash\OnlineGradientDescentCrash\Program.cs:line 38
  • What did you expect?
    Successful training
@gvashishtha
Contributor

Does this still happen if you set DecreaseLearningRate to true? Any reason you want a constant learning rate?

@justinormont
Contributor

justinormont commented Nov 4, 2019

> Any reason you want a constant learning rate?

Constant learning rates work well in many cases. In AutoML, we sweep over both true and false for DecreaseLearningRate.

My top guess is that, for this dataset, the combination of LearningRate = 1.0f and DecreaseLearningRate = false does not work well without L2 regularization. The defaults are LearningRate = 0.1f and DecreaseLearningRate = true.
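
As a rough sketch of that suggestion, the trainer options could be moved back to the defaults quoted above, with some L2 regularization added. The L2Regularization value here is a hypothetical starting point, not a tuned or recommended setting, and whether it avoids the exception on this dataset is untested:

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

var mlContext = new MLContext();

// Sketch only: the defaults mentioned above plus an assumed L2 penalty.
var options = new OnlineGradientDescentTrainer.Options
{
    LearningRate = 0.1f,          // default, instead of 1.0f
    DecreaseLearningRate = true,  // default, instead of false
    L2Regularization = 0.01f,     // hypothetical value, chosen for illustration
};

var trainer = mlContext.Regression.Trainers.OnlineGradientDescent(options);
```

The resulting trainer would replace the OnlineGradientDescent(...) call at the end of the pipeline in the original script.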

I'd recommend building a meta-model to predict failures given a position in the hyperparameter space. Then AutoML can avoid areas that generally throw an error.

It would be nice to attach a debugger and see what the trainer is doing. I'm guessing the model weights are oscillating and growing towards +/- Infinity, though it's possible there's a bug in the weight update code that leads to the infinity. Stepping through the trainer's weight update code will tell us.
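
To illustrate the oscillation guess, here is a toy, self-contained sketch. It is not ML.NET's update code: ConstantRateDemo, Train, and the inputs are hypothetical, and it runs plain gradient descent on a single example with squared loss. Each update multiplies the prediction error by (1 - learningRate * x * x), so once learningRate * x * x exceeds 2 the error flips sign and grows on every step until the weight overflows.

```csharp
using System;

public static class ConstantRateDemo
{
    // Gradient descent with a constant learning rate on the one-example
    // loss 0.5 * (w * x - y)^2. Returns the weight after `steps` updates.
    public static float Train(float x, float y, float learningRate, int steps)
    {
        float w = 0f;
        for (int i = 0; i < steps; i++)
        {
            float gradient = (w * x - y) * x; // derivative of the loss w.r.t. w
            w -= learningRate * gradient;
        }
        return w;
    }

    private static bool IsInvalid(float v) => float.IsNaN(v) || float.IsInfinity(v);

    public static void Main()
    {
        // Hypothetical example: feature x = 2, label y = 1 (best weight is 0.5).
        float stable = Train(2f, 1f, 0.1f, 1000);   // error factor 1 - 0.4 = 0.6
        float unstable = Train(2f, 1f, 1.0f, 1000); // error factor 1 - 4 = -3

        Console.WriteLine($"lr=0.1: w={stable}, invalid={IsInvalid(stable)}");
        Console.WriteLine($"lr=1.0: w={unstable}, invalid={IsInvalid(unstable)}");
    }
}
```

In this toy setting the smaller rate settles near 0.5, while the rate of 1.0 ends in a non-finite weight. With a feature vector, the squared norm of the features plays the role of x * x, so many active one-hot columns can push an example past the threshold even after min-max normalization. Whether that is what happens inside the trainer here would still need the debugger session described above.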

@yaeldekel yaeldekel added the P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away. label Jan 9, 2020
@mstfbl
Contributor

mstfbl commented Jan 27, 2020

Hi @daholste, has your question been addressed by @justinormont's comments? If so, please feel free to close the issue.

@mstfbl
Contributor

mstfbl commented Mar 19, 2020

Closing this issue, as it seems to be resolved.

@mstfbl mstfbl closed this as completed Mar 19, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
5 participants