Description
System Information:
- OS & Version: Windows 11
- ML.NET Version: latest
- .NET Version: .NET 6.0
Describe the bug
I noticed an exception during local testing. The test failed with the error "DownloadFailed with exception One or more errors occurred. (Object synchronization method was called from an unsynchronized block of code.)"
This happens because we are using a Mutex inside an async method.
Mutexes have thread affinity. They must be released on the same thread that acquired them:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.mutex.releasemutex?view=net-8.0
However, an async method that uses ConfigureAwait(false) will not necessarily resume on the same thread after an await, so ReleaseMutex can end up being called on a thread that does not own the mutex.
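To illustrate the thread-affinity problem, here is a minimal sketch. It is not the ML.NET code: the class and method names are made up for this example, and Task.Delay stands in for the download. The first method acquires a Mutex and releases it after an await. The second uses SemaphoreSlim, which has no thread affinity, as one possible alternative.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical names throughout; this is an illustration, not the ML.NET implementation.
public static class MutexAffinitySketch
{
    // Problematic pattern: WaitOne runs on the calling thread, but after the
    // await the continuation may run on a different thread-pool thread.
    public static async Task<string> ReleaseMutexAfterAwaitAsync()
    {
        using var mutex = new Mutex(initiallyOwned: false);
        mutex.WaitOne();
        int acquiredOn = Environment.CurrentManagedThreadId;

        // Stand-in for the download.
        await Task.Delay(100).ConfigureAwait(false);

        int resumedOn = Environment.CurrentManagedThreadId;
        try
        {
            // Throws ApplicationException if the current thread does not own the mutex.
            mutex.ReleaseMutex();
            return $"released (acquired on thread {acquiredOn}, resumed on thread {resumedOn})";
        }
        catch (ApplicationException ex)
        {
            return $"failed (acquired on thread {acquiredOn}, resumed on thread {resumedOn}): {ex.Message}";
        }
    }

    // One possible alternative: SemaphoreSlim can be released from any thread.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);

    public static async Task<string> ReleaseSemaphoreAfterAwaitAsync()
    {
        await Gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Stand-in for the download.
            await Task.Delay(100).ConfigureAwait(false);
            return "completed";
        }
        finally
        {
            // Safe regardless of which thread the continuation runs on.
            Gate.Release();
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await ReleaseMutexAfterAwaitAsync());
        Console.WriteLine(await ReleaseSemaphoreAfterAwaitAsync());
    }
}
```

Whether the first method fails depends on which thread the continuation lands on, which would explain why the test failure is intermittent. Note that SemaphoreSlim only synchronizes within a single process. If the Mutex is there to coordinate downloads across processes, a different fix would be needed, for example acquiring and releasing the mutex on one dedicated thread.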
To Reproduce
Steps to reproduce the behavior:
- Delete local copies of ML.NET resources (e.g. from %TEMP%\mlnet)
- Run dotnet test on Microsoft.ML.TorchSharp.Tests
- Observe the failure. If it does not occur, repeat from step 1.
Expected behavior
Tests run to completion.
Screenshots, Code, Sample Projects
System.InvalidOperationException : Error downloading resource from 'https://aka.ms/mlnet-resources/models/pretrained_Roberta_encoder.tsm': DownloadFailed with exception One or more errors occurred. (Object synchronization method was called from an unsynchronized block of code.)\\nDownloadFailed with exception One or more errors occurred. (A task was canceled.)\\nDownloadFailed with exception One or more errors occurred. (The wait completed due to an abandoned mutex.)\\nDownloadFailed with exception One or more errors occurred. (A task was canceled.)\\nDownloadFailed with exception One or more errors occurred. (The wait completed due to an abandoned mutex.)\\n\nmodel file could not be downloaded!
at Microsoft.ML.TorchSharp.Roberta.QATrainer.Trainer.GetModelPath() in C:\src\dotnet\machinelearning\src\Microsoft.ML.TorchSharp\Roberta\QATrainer.cs:line 260