Skip to content

Conversation

aws-satyajith
Copy link
Contributor

@aws-satyajith aws-satyajith commented May 16, 2025

Add multi-LoRA support for Neuron.

  • Add a test using single and multiple LoRAs

RFC: #15970

Copy link

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

Copy link

mergify bot commented May 19, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aws-satyajith.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Copy link

mergify bot commented May 22, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aws-satyajith.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 22, 2025
@mergify mergify bot removed the needs-rebase label May 22, 2025
@aws-satyajith aws-satyajith changed the title Add multi-LoRA support for Neuron. [Neuron] Add multi-LoRA support for Neuron. May 22, 2025
Copy link
Contributor

@liangfu liangfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for contributing @aws-satyajith . the changes look good to me. it would be better if we could support dynamic lora loading, so that we don't need to pass lora-module in the argument.

@mrinalks
Copy link
Contributor

@WoosukKwon @simon-mo could we get a quick review on this PR so we merge before V0 Freeze in 0.9.0? 🙏

Copy link

mergify bot commented May 27, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @aws-satyajith.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label May 27, 2025
max_lora_rank: int = LoRAConfig.max_lora_rank
fully_sharded_loras: bool = LoRAConfig.fully_sharded_loras
max_cpu_loras: Optional[int] = LoRAConfig.max_cpu_loras
lora_modules: Optional[LoRAModulePath] = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use override_neuron_config to include lora modules, instead of adding an additional argument lora_modules?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Since the regular usage diverges from dynamic loading anyway, I think moving lora_modules to override neuron config is doable. I'll make the change and publish a new revision

Copy link
Collaborator

@jeejeelee jeejeelee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that most of the changes in this PR are within Neuron, after addressing the comments above, overall LGTM.

Signed-off-by: Satyajith Chilappagari <[email protected]>
@mrinalks
Copy link
Contributor

@jeejeelee @simon-mo seems like all checks are complete - we know that entrypoints has been flaky recently.
Ready to merge?

@jeejeelee jeejeelee added the ready ONLY add when PR is ready to merge/full CI is needed label May 29, 2025
@jeejeelee jeejeelee merged commit 972eddf into vllm-project:main May 29, 2025
72 checks passed
@GhostCCCatHenry
Copy link

GhostCCCatHenry commented May 29, 2025 via email

amitm02 pushed a commit to amitm02/vllm that referenced this pull request Jun 1, 2025
@aws-satyajith aws-satyajith deleted the neuron_up_mlora branch June 12, 2025 20:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants