[Bugfix] Remove contiguous output req for context parallel MLA #25414
Conversation
Signed-off-by: Michael Goin <[email protected]>
Code Review
This pull request fixes a failing distributed test by removing an assertion that checks for tensor contiguity. While this makes the test pass, it might introduce a silent correctness or performance issue in the subsequent reduce_scatter operation, which often relies on contiguous tensors. I've suggested ensuring the tensor is contiguous instead of just removing the check, which is a safer approach.
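A minimal sketch of the safer alternative described above, not the actual vLLM code: it assumes a hypothetical helper that feeds the attention output into a reduce_scatter collective, and the function name, shapes, and process-group setup are illustrative assumptions only.

```python
import torch
import torch.distributed as dist

def reduce_scatter_output(output: torch.Tensor, world_size: int) -> torch.Tensor:
    # Instead of asserting contiguity (or silently dropping the check),
    # make the tensor contiguous so the collective sees a dense buffer.
    # .contiguous() is a no-op when the tensor is already contiguous.
    output = output.contiguous()

    # Scatter equal chunks of the first dimension across ranks.
    result = torch.empty(
        (output.shape[0] // world_size, *output.shape[1:]),
        dtype=output.dtype,
        device=output.device,
    )
    dist.reduce_scatter_tensor(result, output)
    return result
```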
Purpose
"Distributed Tests (B200)" has been failing due to this assert on the output being contiguous. This doesn't seem to be required and if I remove the assert the test is green
https://buildkite.com/vllm/ci/builds/31718/steps/canvas?sid=01996a6e-b6da-4752-9ec9-dfb3968b9a2e
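For context, a small illustration of how an output tensor can fail a contiguity check while still holding correct values; the `assert output.is_contiguous()` shown here is a guess at the removed check, and the shapes are made up.

```python
import torch

# A sliced or transposed view of the attention output is a common way to
# end up with a non-contiguous tensor even though its values are fine.
full = torch.randn(4, 8, 16)
output = full.transpose(0, 1)  # view with permuted strides

print(output.is_contiguous())  # False -- this is what would trip the assert
# The values themselves are unaffected; only the memory layout differs.
print(torch.equal(output.contiguous(), output))  # True
```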
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.