Skip to content

PredictionEnginePool.GetPredictionEngine is not thread safe #4981

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
strikene opened this issue Mar 30, 2020 · 5 comments · Fixed by #4992
Closed

PredictionEnginePool.GetPredictionEngine is not thread safe #4981

strikene opened this issue Mar 30, 2020 · 5 comments · Fixed by #4992
Assignees
Labels
bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.

Comments

@strikene
Copy link

strikene commented Mar 30, 2020

System information

  • OS version/distro:Windows 10 Server 2019
  • .NET Version (eg., dotnet --info): core 3.0

Issue

  • What did you do?
    Invoked PredictionEnginePool.Predict("MyModelName", example) from multiple threads.
  • What happened?
    Collection was modified; enumeration operation may not execute.
  • What did you expect?
    Method is thread safe

Source code / logs

Package
Microsoft.Azure.Functions.Extensions Version="1.0.0"
Microsoft.Extensions.MLVersion="1.5.0-preview2" or 1.40
Microsoft.ML Version="1.5.0-preview2" or 1.40
Microsoft.NET.Sdk.Functions Version="3.0.3"

in startup

builder.Services.AddPredictionEnginePool<Input, PredictedEngineOutput>().FromFile("ModelName", "ModelPath", true);

in function
var re = _PredictionEnginePool.Predict("ModelName", new Input() {String = str });

Exception

   at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
   at System.Linq.Enumerable.Any[TSource](IEnumerable`1 source, Func`2 predicate)
   at Microsoft.Extensions.ML.PredictionEnginePoolPolicy`2.Return(PredictionEngine`2 obj)
   at Microsoft.Extensions.ObjectPool.DefaultObjectPool`1.Return(T obj)
   at Microsoft.Extensions.ML.PredictionEnginePool`2.ReturnPredictionEngine(String modelName, PredictionEngine`2 engine)
   at Microsoft.Extensions.ML.PredictionEnginePoolExtensions.Predict[TData,TPrediction](PredictionEnginePool`2 predictionEnginePool, String modelName, TData example)
   at BLDM_F.F.Predict(HttpRequest req, ILogger log) in C:\tfs\BLDM_F\Function1.cs:line 148

Please paste or attach the code or logs or traces that would be helpful to diagnose the issue you are reporting.

@antoniovs1029 antoniovs1029 self-assigned this Mar 30, 2020
@antoniovs1029 antoniovs1029 added bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away. labels Mar 30, 2020
@antoniovs1029
Copy link
Member

Thanks for reporting this. Can you please provide your complete code and dataset so that we can easily reproduce your issue? Thanks!

@strikene
Copy link
Author

strikene commented Mar 30, 2020

Thanks for reporting this. Can you please provide your complete code and dataset so that we can easily reproduce your issue? Thanks!

The attachment contains the source file and the model file. This problem occurs randomly and may take a long time to see. Thank you for your help
f.zip

@antoniovs1029
Copy link
Member

Thanks. I am taking a look. Also, what kind of exception was thrown? It seems you only included the stacktrace on your description of the issue, but not the exception itself.

@strikene
Copy link
Author

strikene commented Apr 1, 2020

Thanks. I am taking a look. Also, what kind of exception was thrown? It seems you only included the stacktrace on your description of the issue, but not the exception itself.

Exception:Collection was modified; enumeration operation may not execute.
KDY@ZH%TFPJ{S1527}R~2MR

If I delete the following code, the exception will not appear, and I'm not sure about the correlation.

  #region get tag in model

        var pr = _PredictionEnginePool.Predict("m", new Input() { DMString = "ss" });
        var mo = _PredictionEnginePool.GetModel("m");
        var OutputSchema = _PredictionEnginePool.GetPredictionEngine("m").OutputSchema;
        var column = OutputSchema.GetColumnOrNull("Score");
        var slotNames = new VBuffer<ReadOnlyMemory<char>>();
        column.Value.GetSlotNames(ref slotNames);
        var alltag = slotNames.DenseValues().ToArray();
        TaginModel = new string[alltag.Count()];
        for (int i = 0; i < TaginModel.Length; i++)
        {
            TaginModel[i] = alltag[i].ToString();
        }
        #endregion

@antoniovs1029
Copy link
Member

It seems the way List<WeakReference> _references on PredictionEnginePoolPolicy is being used is not thread-safe, so this is a bug.

@eerhardt Is looking into this now to fix it.

eerhardt added a commit to eerhardt/machinelearning that referenced this issue Apr 1, 2020
Accesses to this list are not thread-safe because we can be enumerating it on one thread (in Return) while another thread adds to it (in Create).

The list is unnecessary because the PredictionEngine will always be "rooted". Either it was being held in memory by someone who got the PredictionEngine from the pool, or it is being held by the ObjectPool itself. So it won't ever be GC'd, and the WeakReference is not doing anything.

Also, inherit from PooledObjectPolicy instead of IPooledObjectPolicy, so it can take the "fastPolicy" path in DefaultObjectPool. (See dotnet/extensions#318).

Fix dotnet#4981
eerhardt added a commit that referenced this issue Apr 2, 2020
Accesses to this list are not thread-safe because we can be enumerating it on one thread (in Return) while another thread adds to it (in Create).

The list is unnecessary because the PredictionEngine will always be "rooted". Either it was being held in memory by someone who got the PredictionEngine from the pool, or it is being held by the ObjectPool itself. So it won't ever be GC'd, and the WeakReference is not doing anything.

Also, inherit from PooledObjectPolicy instead of IPooledObjectPolicy, so it can take the "fastPolicy" path in DefaultObjectPool. (See dotnet/extensions#318).

Fix #4981
@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants