When training a model for a recommender system that uses MatrixFactorization, the Docker container crashes on an Ubuntu server without any message or exception.
The only trace is this line in the kernel log:
kernel: [12922.080806] traps: dotnet[30957] trap invalid opcode ip:7f07d81b5efc sp:7ffdc5965110 error:0 in libMatrixFactorizationNative.so[7f07d81a5000+2a000]
Locally, both the recommender system and the model training work.
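For anyone trying to locate the faulting instruction, the kernel line above already contains what is needed: the instruction pointer (`ip:`) and the mapping of the native library (`[base+size]`). The following is a minimal sketch that subtracts the two to get an offset into `libMatrixFactorizationNative.so`. The class name `TrapLine` and its `Parse` method are hypothetical helpers, not part of ML.NET, and the sketch assumes the mapping shown in the log starts at the library's load address, which is not guaranteed.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of ML.NET.
public static class TrapLine
{
    // Matches "ip:<hex>" and "in <module>[<base>+<size>]" in a kernel "traps:" line.
    private static readonly Regex Pattern = new Regex(
        @"ip:(?<ip>[0-9a-f]+).*\sin\s(?<module>\S+)\[(?<base>[0-9a-f]+)\+(?<size>[0-9a-f]+)\]",
        RegexOptions.IgnoreCase);

    // Returns the module name and the offset of the faulting instruction
    // relative to the start of the mapping reported by the kernel.
    public static (string Module, ulong Offset) Parse(string line)
    {
        Match m = Pattern.Match(line);
        if (!m.Success)
            throw new FormatException("Not a recognizable kernel trap line.");

        ulong ip = ulong.Parse(m.Groups["ip"].Value,
            NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        ulong mapBase = ulong.Parse(m.Groups["base"].Value,
            NumberStyles.HexNumber, CultureInfo.InvariantCulture);

        return (m.Groups["module"].Value, ip - mapBase);
    }

    public static void Main()
    {
        const string line =
            "traps: dotnet[30957] trap invalid opcode ip:7f07d81b5efc " +
            "sp:7ffdc5965110 error:0 in " +
            "libMatrixFactorizationNative.so[7f07d81a5000+2a000]";

        var (module, offset) = Parse(line);
        Console.WriteLine($"{module} + 0x{offset:x}");
        // prints "libMatrixFactorizationNative.so + 0x10efc"
    }
}
```

The resulting offset can then be compared against a disassembly of the library (for example from `objdump -d`) to see which instruction the CPU rejected.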
System information
OS version/distro:
Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-101-generic x86_64)
.NET Version (eg., dotnet --info):
Host (useful for support):
Version: 3.1.4
Commit: 0c2e69caa6
.NET Core SDKs installed:
No SDKs were found.
.NET Core runtimes installed:
Microsoft.AspNetCore.App 3.1.4 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 3.1.4 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
Issue
What did you do?
Starting training of a new model using matrix factorization
What happened?
The application and the Docker Container crash without further message or exception.
What did you expect?
Training of a new model or at least an exception
Source code / logs
Implementation:
Log.Information("Extracting train data...");
var trainingData = GetDataView(trainData);
var options = new MatrixFactorizationTrainer.Options
{
    MatrixColumnIndexColumnName = UserIdEncoding,
    MatrixRowIndexColumnName = MusicIdEncoding,
    LabelColumnName = "Label",
    NumberOfIterations = 20,
    ApproximationRank = 100,
    //Quiet = false
};
Log.Information("Setting Matrix Factorization");
var trainingPipeline = trainingData.Transformer.Append(
    MLContext.Recommendation().Trainers.MatrixFactorization(options));
Log.Information("Starting training...");
ITransformer trainedModel = trainingPipeline.Fit(trainingData.DataView);
Log.Information("Saving model...");
MLContext.Model.Save(trainedModel, trainingData.DataView.Schema, ModelPath);
Log.Information("Extracting test data...");
var testingData = GetDataView(testData);
Log.Information("Starting model testing...");
var testingTransform = trainedModel.Transform(testingData.DataView);
Log.Information("Evaluating model");
return MLContext.Recommendation().Evaluate(testingTransform);
Container Logs:
[13:45:25 Information] Preparing prediction Model
[13:45:25 Information] Starting Model Training...
[13:45:25 Information] Extracting train data...
[13:45:25 Information] Setting Matrix Factorization
[13:45:25 Information] Starting training...
Warning: insufficient blocks may slow down the trainingprocess (4*nr_threads^2+1 blocks is suggested)
Warning: insufficient blocks may slow down the trainingprocess (4*nr_threads^2+1 blocks is suggested)
--> Application crash
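Since the crash is an invalid-opcode trap inside native code and only happens on the server, one thing worth comparing between the two machines is which instruction-set extensions each CPU exposes. A common cause of this kind of trap is a native library built with instructions the host CPU lacks, though that is an assumption here and nothing in the logs confirms it. The sketch below uses the `System.Runtime.Intrinsics.X86` classes available since .NET Core 3.0; the class name `CpuFeatures` is hypothetical, and the values reflect what the runtime reports, which runtime configuration can influence.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.Intrinsics.X86;

// Hypothetical diagnostic helper, not part of ML.NET.
public static class CpuFeatures
{
    // Reports which x86 instruction-set extensions the .NET runtime
    // considers usable on the current machine.
    public static IReadOnlyDictionary<string, bool> Detect() =>
        new Dictionary<string, bool>
        {
            ["SSE2"] = Sse2.IsSupported,
            ["SSE3"] = Sse3.IsSupported,
            ["SSSE3"] = Ssse3.IsSupported,
            ["SSE4.1"] = Sse41.IsSupported,
            ["SSE4.2"] = Sse42.IsSupported,
            ["AVX"] = Avx.IsSupported,
            ["AVX2"] = Avx2.IsSupported,
            ["FMA"] = Fma.IsSupported,
        };

    public static void Main()
    {
        // Run this once locally and once inside the container on the server,
        // then compare the two outputs.
        foreach (KeyValuePair<string, bool> feature in Detect())
            Console.WriteLine($"{feature.Key}: {feature.Value}");
    }
}
```

If the server reports `False` for an extension that the local machine supports, that difference is a candidate explanation to check against the faulting instruction.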
I made an Ubuntu Docker model and installed our latest ML.NET build from our current master branch. I then ran the code snippet you provided with our training and test datasets for testing MF, and I did not get the warning you listed. As @antoniovs1029 said, this issue has been fixed in PR #5071 and will be available in our upcoming v1.5 release!
As such, I'm closing this issue. Please feel free to reopen if you are receiving further errors on our v1.5 release. Thanks.