
Conversation

killeent (Contributor) commented on May 23, 2017

Previously, THPPointer did not have its constructors marked as explicit. As a result, something like:

THTensor *someIntermediateResult = THTensor_(new)(LIBRARY_STATE_NOARGS);
// do some things to someIntermediateResult
THTensorPtr wrapped = someIntermediateResult; // calls implicit constructor

The fact that the THTensorPtr takes "ownership" of the THTensor in this case is obscured, which could lead to double frees.
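
For illustration, a minimal hypothetical sketch of the hazard (OwningPtr here is a simplified stand-in, not the actual THPPointer definition):

#include <cstdlib>

// Hypothetical simplified owning wrapper, standing in for THPPointer<T>.
template <typename T>
struct OwningPtr {
  /* implicit */ OwningPtr(T *p): ptr(p) {}   // silently takes ownership of p
  // explicit OwningPtr(T *p): ptr(p) {}      // with 'explicit', the assignment below stops compiling
  ~OwningPtr() { std::free(ptr); }
  T *ptr;
};

int main() {
  int *raw = static_cast<int *>(std::malloc(sizeof(int)));
  OwningPtr<int> wrapped = raw;  // implicit conversion: the ownership transfer is easy to miss
  // std::free(raw);             // ...so a later manual free like this would be a double free
  return 0;
}

Marking the converting constructor explicit forces the caller to write OwningPtr<int> wrapped(raw); instead, which makes the transfer visible at the call site.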

This diff marks all the THPPointer constructors explicit, preventing this kind of implicit construction from a raw pointer. I followed it with the following codemod:

codemod -m -d torch/ --extensions h,hpp,cpp,cwrap \
    '(TH[a-zA-Z_]+Ptr [a-zA-Z_]+) = ([^;]*);'  \
    '\1(\2);'

This replaces all such assignments with calls to the explicit constructor. There were a couple of other places where I needed to fix this, e.g. in the THPPlugin.py allocation template.
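
Applied to the snippet above, for example, the codemod rewrites

THTensorPtr wrapped = someIntermediateResult;

into

THTensorPtr wrapped(someIntermediateResult);

so the ownership-taking constructor is invoked explicitly at the point of wrapping.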

Test Plan: Run Tests locally


- THLongStoragePtr THPUtils_unpackSize(PyObject *arg) {
+ THLongStorage* THPUtils_unpackSize(PyObject *arg) {


- THPPointer(): ptr(nullptr) {};
- THPPointer(T *ptr): ptr(ptr) {};
- THPPointer(THPPointer &&p) { free(); ptr = p.ptr; p.ptr = nullptr; };
+ explicit THPPointer(): ptr(nullptr) {};


killeent (Contributor, Author) commented:

@colesbury I think it was due to the fact that the move constructor had been marked explicit. By removing the keyword, I no longer need to pass the Tensor as a raw pointer return type in order for it to compile.
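
A rough sketch of the failure mode (hypothetical minimal types, not the actual PyTorch sources): an explicit move constructor cannot participate in the copy-initialization that happens when a function returns the pointer object by value, so the function is forced to return a raw T* instead.

// Minimal reproduction of the problem with an explicit move constructor.
template <typename T>
struct Ptr {
  explicit Ptr(T *p): ptr_(p) {}
  explicit Ptr(Ptr &&other): ptr_(other.ptr_) { other.ptr_ = nullptr; }  // move ctor marked explicit
  ~Ptr() { delete ptr_; }
  T *ptr_;
};

Ptr<int> make() {
  Ptr<int> result(new int(42));
  return result;  // error: the return value is copy-initialized here, and explicit constructors are not candidates
}

Dropping explicit from the move constructor (while keeping it on the converting constructors) lets return result; compile, so a function like THPUtils_unpackSize can return the smart pointer rather than a raw pointer.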

soumith merged commit 05bc877 into pytorch:master on May 25, 2017
jjsjann123 added a commit to jjsjann123/pytorch that referenced this pull request May 3, 2022
update CI tests with lintrunner to ensure format stuff are running in CI
jagadish-amd pushed a commit to jagadish-amd/pytorch that referenced this pull request Oct 24, 2024
Improve performance for smaller shapes that use block radix sort by
decreasing the item_per_thread to 8.
This will increase the thread block size leading to higher occupancy.

Co-author: @amd-sushetty

---------

Co-authored-by: Pruthvi Madugundu <[email protected]>
jagadish-amd pushed a commit to jagadish-amd/pytorch that referenced this pull request Jan 14, 2025
(cherry picked from commit 1024f36)
pytorch-bot bot pushed a commit that referenced this pull request Feb 24, 2025
(cherry picked from commit 1024f36)
pytorch-bot bot pushed a commit that referenced this pull request Feb 24, 2025
(cherry picked from commit 1024f36)
jagadish-amd pushed a commit to jagadish-amd/pytorch that referenced this pull request May 8, 2025
(cherry picked from commit 1024f36)
(cherry picked from commit 08c0749)