
Accumulate tokens in generate mode #1534


Merged: 2 commits merged into main from manuel/batch-callbacks on Apr 23, 2025

Conversation

manuelcandales (Contributor) commented:

We accumulate tokens in generate mode before calling the callback on them. This avoids synchronizing the GPU and CPU on each token, improving performance on the MPS backend.
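A minimal sketch of the batching pattern (function and parameter names here are hypothetical, not the torchchat API): tokens are buffered on the device, and the callback only runs once per batch of `accumulate_tokens`, so any host-side work that forces a GPU/CPU sync happens per batch instead of per token.

```python
def generate_with_batched_callback(step_fn, callback, num_tokens, accumulate_tokens=8):
    """Accumulate generated tokens and invoke the callback in batches.

    step_fn() returns the next token as a device tensor; calling the
    callback per token (which typically reads the value on the host)
    would force a GPU/CPU sync on every decode step.
    """
    buffer = []
    for _ in range(num_tokens):
        buffer.append(step_fn())           # token stays on device; no sync yet
        if len(buffer) >= accumulate_tokens:
            for tok in buffer:             # host-side work once per batch
                callback(tok)
            buffer.clear()
    for tok in buffer:                     # flush any remaining tokens
        callback(tok)
```

Larger values of `accumulate_tokens` amortize the sync cost further, at the price of the callback (e.g. streaming output) lagging further behind generation.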

pytorch-bot bot commented Apr 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1534

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit facb0b7 with merge base 359db61:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Apr 22, 2025
@@ -294,6 +295,7 @@ def from_args(cls, args):
         sequential_prefill=sequential_prefill,
         max_autotune=args.max_autotune,
         is_torchtune_model=args.model and args.model.endswith("tune"),
+        accumulate_tokens=getattr(args, "accumulate_tokens", 8),
Review comment from a Contributor on the added line:

Unrelated to this PR: I should fix this so we don't have duplicate defaults
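For context on the duplicate-defaults note: the `getattr` fallback of `8` repeats whatever default the CLI flag declares, and the two literals can drift apart. A hedged sketch of one way to consolidate them (the constant name and flag wiring are illustrative, not torchchat's actual code):

```python
import argparse

ACCUMULATE_TOKENS_DEFAULT = 8  # single source of truth for the default

parser = argparse.ArgumentParser()
parser.add_argument(
    "--accumulate-tokens",
    type=int,
    default=ACCUMULATE_TOKENS_DEFAULT,  # same constant reused below
)

def accumulate_tokens_from(args):
    # Reuse the shared constant so the fallback cannot drift from the flag.
    return getattr(args, "accumulate_tokens", ACCUMULATE_TOKENS_DEFAULT)
```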

manuelcandales merged commit 5f8f35d into main on Apr 23, 2025
72 checks passed
manuelcandales deleted the manuel/batch-callbacks branch on April 23, 2025 at 00:48
Labels
CLA Signed