CUDA GoodFeaturesToTrackDetector is not ThreadSafe ? #18051
Comments
I've just looked inside the function and found:
void sortCorners_gpu(PtrStepSzf eig, float2* corners, int count, cudaStream_t stream)
{
    bindTexture(&eigTex, eig);
    thrust::device_ptr<float2> ptr(corners);
#if THRUST_VERSION >= 100802
    if (stream)
        thrust::sort(thrust::cuda::par(ThrustAllocator::getAllocator()).on(stream), ptr, ptr + count, EigGreater());
    else
        thrust::sort(thrust::cuda::par(ThrustAllocator::getAllocator()), ptr, ptr + count, EigGreater());
#else
    thrust::sort(ptr, ptr + count, EigGreater());
#endif
}
It seems something goes wrong when CUDA Thrust is used from multiple CPU threads. When I searched for "cuda thrust merge_sort failed to synchronize", I found some other discussions:
It seems this algorithm uses a texture reference, which is obsolete and does not support multi-threaded programming. Texture objects came up in 2013 and superseded texture references (link), and texture references are now deprecated in CUDA 11. There have been other cases similar to this issue, and they were solved by removing texture references and adopting texture objects. I believe we can apply the same solution here.
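The failure mode described in this comment can be modeled in plain C++. This is only an analogy, not CUDA code: `g_bound`, `bindShared`, `readShared`, `Handle`, and `readHandle` are hypothetical names standing in for a file-scope texture reference and a per-call texture object. The sketch replays one possible interleaving of two callers to show why a single shared binding slot is fragile, while a handle passed as an argument is not.

```cpp
#include <cstddef>

// Shared-binding style: one file-scope slot that every caller rebinds,
// loosely analogous to a texture reference such as eigTex.
static const float* g_bound = nullptr;

inline void bindShared(const float* data) { g_bound = data; }
inline float readShared(std::size_t i) { return g_bound[i]; }

// Per-call style: the handle travels with the call, loosely analogous to
// a texture object passed to a kernel as an argument.
struct Handle {
    const float* data;
};

inline float readHandle(Handle h, std::size_t i) { return h.data[i]; }

// Replays the interleaving "caller A binds, caller B binds, A reads".
// A ends up reading B's data because both share the same slot.
inline float sharedReadAfterInterleave(const float* a, const float* b) {
    bindShared(a);         // caller A binds its buffer
    bindShared(b);         // caller B overwrites the binding
    return readShared(0);  // caller A now reads from B's buffer
}
```

With the per-call handle there is no shared slot to overwrite, so each caller always reads the buffer it passed in.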
Thanks @nglee
Is it resolved now?
Hi @nkwangyh, are your changes in a pull request?
@shubhamcodez @areche Sorry for the delay. Yes, the bug was resolved, but my fix was a little dirty and didn't cover all image types, so I didn't create a pull request. I will try to submit the changes before this weekend.
@asmorkalov @areche I've submitted a pull request for GoodFeaturesToTrackDetector here. The code has been verified in my local environment. Hope it helps.
I encountered a similar issue while calling cv::cuda::resize() to upscale a GpuMat from multiple threads.
System information (version)
Detailed description
I came across the same problem as in the link by AlexBn:
https://answers.opencv.org/question/227794/cuda-goodfeaturestotrackdetector-is-not-threadsafe/
While using the OpenCV CUDA GoodFeaturesToTrackDetector in a parallel loop, I noticed that I systematically get the exception "merge_sort: failed to synchronize", even though I run it on different cuda::GpuMats, in separate cuda::Streams, and with separate Algorithm instances.
Steps to reproduce
After many loop iterations I get an exception with this call stack:
Must I conclude that the OpenCV CUDA GoodFeaturesToTrackDetector is not thread-safe despite the use of Streams?