Skip to content

Conversation

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Aug 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161315

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit ba4c3f7 with merge base 05eeb29 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…asing"

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.

[ghstack-poisoned]
@pytorchmergebot
Copy link
Collaborator

Starting merge as part of PR stack under #161317

]
schema_info = SchemaInfo(args=arg_schemas, outs=out_schemas)
schema_info = SchemaInfo(
args=arg_schemas, outs=out_schemas, int_tags=[int(x) for x in func.tags]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be reasonable to move this into a SchemaInfo.__post_init__, since this is just a perf optimization that can be inferred from the other args? Or are you worried about post_init itself being slow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you worried about post_init itself being slow?

Not specifically, more that I had never heard of __post_init__ before working on this stack.

I would rather execute fewer Python functions than more of them, though, and unless I'm wrong about SchemaInfo being confined to this file I'd rather not compromise performance for this one-callsite class.

…asing"

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.

[ghstack-poisoned]
…asing"

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.

[ghstack-poisoned]
@pytorchmergebot
Copy link
Collaborator

Starting merge as part of PR stack under #161432

1 similar comment
@pytorchmergebot
Copy link
Collaborator

Starting merge as part of PR stack under #161432

pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
…61317)

This assertion was expensive because of is_traceable_wrapper_subclass. Finding a cheap check to run first that's likely to let us skip the rest seems to improve things significantly.
Pull Request resolved: #161317
Approved by: https://github.com/ezyang, https://github.com/XilunWu, https://github.com/bdhirsh
ghstack dependencies: #161301, #161292, #161304, #161308, #161315
pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
Not a huge cost, but free win is free.
Pull Request resolved: #161328
Approved by: https://github.com/Skylion007
ghstack dependencies: #161301, #161292, #161304, #161308, #161315, #161317
pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
`auto` forces a copy. Confirmed this did something noticable with perf.
Pull Request resolved: #161329
Approved by: https://github.com/zpcore, https://github.com/fduwjj, https://github.com/Skylion007, https://github.com/bdhirsh
ghstack dependencies: #161301, #161292, #161304, #161308, #161315, #161317, #161328
pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
If we want them interned, we should intern at callsites. (The numpy reference has bit rotted; see numpy/numpy@b222eb6#diff-6bdb6105198083838f51c57b55b3a49472ed23043bb40018f1ea41138e687163)

Profiling a simple torchdispatch benchmark with perf before/after seems to show that time spent copying std::strings and interning Python strings is gone, though there is some noise and the improvement is very small.
Pull Request resolved: #161432
Approved by: https://github.com/ezyang
ghstack dependencies: #161301, #161292, #161304, #161308, #161315, #161317, #161328, #161329
pytorchmergebot pushed a commit that referenced this pull request Aug 31, 2025
)

symbools are not identical with Py_True or PyFalse, so we can do those cheap checks first and at least get plain old bools to go fast.
Pull Request resolved: #161455
Approved by: https://github.com/Skylion007
ghstack dependencies: #161301, #161292, #161304, #161308, #161315, #161317, #161328, #161329, #161432
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…rch#161315)

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.
Pull Request resolved: pytorch#161315
Approved by: https://github.com/Skylion007, https://github.com/bdhirsh
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…torch#161317)

This assertion was expensive because of is_traceable_wrapper_subclass. Finding a cheap check to run first that's likely to let us skip the rest seems to improve things significantly.
Pull Request resolved: pytorch#161317
Approved by: https://github.com/ezyang, https://github.com/XilunWu, https://github.com/bdhirsh
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…h#161432)

If we want them interned, we should intern at callsites. (The numpy reference has bit rotted; see numpy/numpy@b222eb6#diff-6bdb6105198083838f51c57b55b3a49472ed23043bb40018f1ea41138e687163)

Profiling a simple torchdispatch benchmark with perf before/after seems to show that time spent copying std::strings and interning Python strings is gone, though there is some noise and the improvement is very small.
Pull Request resolved: pytorch#161432
Approved by: https://github.com/ezyang
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315, pytorch#161317, pytorch#161328, pytorch#161329
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…rch#161455)

symbools are not identical with Py_True or PyFalse, so we can do those cheap checks first and at least get plain old bools to go fast.
Pull Request resolved: pytorch#161455
Approved by: https://github.com/Skylion007
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315, pytorch#161317, pytorch#161328, pytorch#161329, pytorch#161432
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…rch#161315)

Scanning a list of pybind enums with `in` is slow. See NOTE in code for full explanation.

This is a significant optimization; will be updating the torchdispatch/return_and_correct_aliasing portion of this stack with benchmark and results soonish.
Pull Request resolved: pytorch#161315
Approved by: https://github.com/Skylion007, https://github.com/bdhirsh
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…torch#161317)

This assertion was expensive because of is_traceable_wrapper_subclass. Finding a cheap check to run first that's likely to let us skip the rest seems to improve things significantly.
Pull Request resolved: pytorch#161317
Approved by: https://github.com/ezyang, https://github.com/XilunWu, https://github.com/bdhirsh
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…h#161432)

If we want them interned, we should intern at callsites. (The numpy reference has bit rotted; see numpy/numpy@b222eb6#diff-6bdb6105198083838f51c57b55b3a49472ed23043bb40018f1ea41138e687163)

Profiling a simple torchdispatch benchmark with perf before/after seems to show that time spent copying std::strings and interning Python strings is gone, though there is some noise and the improvement is very small.
Pull Request resolved: pytorch#161432
Approved by: https://github.com/ezyang
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315, pytorch#161317, pytorch#161328, pytorch#161329
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…rch#161455)

symbools are not identical with Py_True or PyFalse, so we can do those cheap checks first and at least get plain old bools to go fast.
Pull Request resolved: pytorch#161455
Approved by: https://github.com/Skylion007
ghstack dependencies: pytorch#161301, pytorch#161292, pytorch#161304, pytorch#161308, pytorch#161315, pytorch#161317, pytorch#161328, pytorch#161329, pytorch#161432
@github-actions github-actions bot deleted the gh/swolchok/800/head branch September 30, 2025 02:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants