-
-
Notifications
You must be signed in to change notification settings - Fork 56.2k
cuda::flip - Use in-place npp function for inplace arguments #17863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3900414
to
bfefe79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a CUDA-only issue for this particular function.
It is related to many CPU and OpenCL implementations (almost zero tests for this mode).
So some robust generic approach should be applied.
Possible fixes degrade performance, so we want to forbid such usages (to keep OpenCV library and applications optimized).
We don't want to trying fixing that and adding corresponding tests.
see also #13570 (it is closed without raising a general issue)
bfefe79
to
bb34398
Compare
@alalek Consulting this issue comment, should we raise an exception instead? |
bb34398
to
58dcd52
Compare
58dcd52
to
9411cd6
Compare
Instead of copying the source matrix, this code calls in-place npp function if given in-place arguments. |
Looks good to me. Passed test on Ubuntu 18.04 with CUDA 10.2 and GPU 1080Ti. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@nglee The same issue is reproducible with master branch too, but the code for master is moved to opencv_contrib repo. Could you propose yet another PR to contrib repository too? |
@asmorkalov |
@asmorkalov Once in the past, I was pointed from @alalek that this is not necessary, porting one PR from opencv to opencv_contrib.
And alalek also did the same recently Was there any special reason for this PR to be ported to opencv_contrib ? |
It's good idea to have CUDA implementations aligned between branches, especially if the patch could be ported as is without serious modifications. |
@asmorkalov , my point is not "why", but "who". |
General approach is to merge 3.4 to master weekly or so. Alalek does it and you usually hould not worry about it. Cuda changes cannot be just merged to contrib with git and some manual work is required. I usually ask the PR author to do porting as soon as author is in context can can do it faster. |
Resolves #17840
#17863 (comment)
cv::cuda::GpuMat::create
does not allocate a new GpuMat if required size and type equals existing size and type.opencv/modules/core/src/cuda/gpu_mat.cu
Lines 160 to 161 in bf8136e
This PR implementscv::cuda::detachOutput
to allocate a new GpuMat in such cases. Unfortunately,_dst
is const so assignment is not possible. Instead, this implementation clonessrc
.Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.