Conversation

@hmellor (Member) commented Aug 27, 2025

Fix the Transformers backend's torch.compile support by disabling compilation when models that use dynamic rope scaling are loaded.
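
For context, a minimal sketch of the kind of check involved, assuming the Hugging Face `rope_scaling` config conventions; the helper name `uses_dynamic_rope` is illustrative, not the actual vLLM implementation:

```python
# Hedged sketch: detect dynamic rope scaling in an HF config so the
# backend can fall back to eager mode. Dynamic rope adjusts its scaling
# with sequence length, which would defeat torch.compile's assumptions.
from transformers import PretrainedConfig


def uses_dynamic_rope(config: PretrainedConfig) -> bool:
    rope_scaling = getattr(config, "rope_scaling", None)
    if rope_scaling is None:
        return False
    # HF configs store the scaling type under "rope_type" (newer
    # configs) or "type" (older configs).
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    return rope_type == "dynamic"
```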

@mergify mergify bot added documentation Improvements or additions to documentation new-model Requests to new models labels Aug 27, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds support for the InternVLForConditionalGeneration architecture and disables torch.compile for models with dynamic rope scaling. The changes for the new model support are correct. However, I've identified a critical bug in the implementation for disabling torch.compile where a potential AttributeError can occur. Please see my comment for the suggested fix.

"3.5-qwen3moe": "OpenGVLab/InternVL3_5-30B-A3B", # noqa: E501
"3.5-gptoss": "OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview"}, # noqa: E501
trust_remote_code=True),
"InternVLForConditionalGeneration": _HfExamplesInfo("OpenGVLab/InternVL3-1B-hf"), # noqa: E501
Member

I plan to enable native support through the existing InternS1ForConditionalGeneration (with video input support) in #23742, because their implementations are exactly the same.

Member Author

Could we add video support to the Transformers backend instead 👀

Member Author

Long term, if it's not necessary to reimplement a model in vLLM, we should avoid doing so where possible.

Member

> Could we add video support to the Transformers backend instead 👀

I think this needs extra refactoring on the Transformers side first, because IIRC we previously only added a processor helper function for images.

> Long term, if it's not necessary to reimplement a model in vLLM, we should avoid doing so where possible.

Yes, but since InternS1ForConditionalGeneration is already here, we can simply reuse it with video support, with no need to disable torch.compile:

"InternVLForConditionalGeneration": ("interns1", "InternS1ForConditionalGeneration")

Member Author

> Yes, but since InternS1ForConditionalGeneration is already here, we can simply reuse it with video support

For now, I will remove the InternVL stuff from my PR.

> no need to disable torch.compile

InternVLChatModel and InternS1ForConditionalGeneration do not support torch.compile in vLLM; neither of them is decorated with @support_torch_compile.

@Isotr0py (Member) commented Aug 27, 2025

> InternVLChatModel and InternS1ForConditionalGeneration do not support torch.compile in vLLM; neither of them is decorated with @support_torch_compile.

No, for VLMs we only compile the text backbone (there is no ViT compilation yet); @support_torch_compile only takes effect at the language-model level. And the Qwen2 and Qwen3MoE backbones they use (without MRoPE) already support @support_torch_compile.

(There are no VLM implementations wrapped by @support_torch_compile, because we only add it to the backbone.)
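
To illustrate the compile boundary being described, here is a toy sketch that uses plain torch.compile in place of vLLM's actual @support_torch_compile machinery (the class names are made up): only the language-model backbone is compiled, while the vision tower runs eagerly.

```python
import torch
import torch.nn as nn


class TextBackbone(nn.Module):
    """Stand-in for a text backbone such as a Qwen2 language model."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)

    def forward(self, hidden_states):
        return self.proj(hidden_states)


class ToyVLM(nn.Module):
    """Stand-in for a VLM wrapper; the wrapper itself is never compiled."""

    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Identity()  # runs eagerly
        # The compile boundary sits at the backbone, mirroring where
        # @support_torch_compile is applied in vLLM.
        self.language_model = torch.compile(TextBackbone())

    def forward(self, pixel_values, hidden_states):
        _ = self.vision_tower(pixel_values)        # eager path
        return self.language_model(hidden_states)  # compiled path


model = ToyVLM()
out = model(torch.randn(1, 4), torch.randn(2, 8))
```

This matches the point above that the decorator takes effect at the language-model level: the VLM wrapper stays eager while its backbone submodule is compiled.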

Member Author

Oh, I wasn't aware that's how it worked.

TransformersForMultimodalLM is wrapped with @support_torch_compile, and it seems to work fine. There is no separate backbone to decorate in TransformersForMultimodalLM, since we use inheritance to add MM support to TransformersForCausalLM. If we removed @support_torch_compile from TransformersForMultimodalLM, would the @support_torch_compile on TransformersForCausalLM do anything?

@DarkLight1337 (Member) commented Aug 27, 2025

Yes, torch.compile will be limited to the modules that have the decorator (and their submodules).

Member Author

TransformersForMultimodalLM is a subclass of TransformersForCausalLM, not a submodule.

Do you know how @support_torch_compile interacts with inheritance?

Member

I think torch.compile on the base class is also applied to subclasses. Maybe @youkaichao can confirm this.
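
That matches how a class decorator behaves in plain Python: the decorator mutates the base class, and subclasses inherit the already-wrapped methods. A toy check of that behavior, with a generic `compile_forward` decorator standing in for @support_torch_compile:

```python
import torch
import torch.nn as nn


def compile_forward(cls):
    # Minimal stand-in for a class decorator like @support_torch_compile:
    # replace the class's forward with a compiled version at definition time.
    cls.forward = torch.compile(cls.forward)
    return cls


@compile_forward
class Base(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

    def forward(self, x):
        return self.proj(x)


class Child(Base):  # no decorator of its own
    pass


# Child inherits Base.forward, which is already wrapped, so the
# subclass also goes through the compiled path.
print(Child()(torch.randn(1, 4)))
```

(vLLM's real decorator does more than wrap forward, so this only demonstrates the Python inheritance behavior, not the full integration.)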

@hmellor hmellor changed the title from "Add support for InternVLForConditionalGeneration and disable torch.compile for dynamic rope models" to "Disable torch.compile for dynamic rope models in Transformers backend" Aug 27, 2025
@hmellor hmellor enabled auto-merge (squash) August 27, 2025 12:22
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 27, 2025
@hmellor hmellor merged commit 0585a9e into vllm-project:main Aug 27, 2025
40 checks passed
@hmellor hmellor deleted the internvl-hf branch August 27, 2025 19:03
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Sep 3, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025