[Misc] Clean up MM profiling warnings #25222
Conversation
Signed-off-by: Roger Wang <[email protected]>
Code Review
This pull request cleans up multimodal profiling warnings that are no longer relevant with the deprecation of the V0 scheduler. The changes remove warnings related to sequence length limitations that are now handled by the V1 scheduler's chunked prefill mechanism. The modifications are straightforward and align with the goal of removing obsolete code. I have reviewed the changes and found no high or critical issues.
```python
if total_mm_tokens > seq_len:
    logger.warning_once(
        "The sequence length (%d) is smaller than the pre-defined"
        " worst-case total number of multimodal tokens (%d). "
        "This may cause certain multi-modal inputs to fail during "
        "inference. To avoid this, you should increase "
        "`max_model_len` or reduce `mm_counts`.",
        seq_len,
        total_mm_tokens,
    )
```
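For context, `warning_once` emits a given warning only the first time it is triggered, so this check would not spam the log during repeated profiling runs. A minimal sketch of that deduplication idea (this is an illustration, not vLLM's actual logger implementation):

```python
import logging

# Hypothetical "warn once" helper: deduplicate by (message, args) key.
# This is a sketch of the concept, not vLLM's real logger.warning_once.
_seen_warnings: set = set()
logger = logging.getLogger("mm_profiling_sketch")

def warning_once(msg: str, *args) -> None:
    """Emit a warning only the first time this (msg, args) pair is seen."""
    key = (msg, args)
    if key in _seen_warnings:
        return
    _seen_warnings.add(key)
    logger.warning(msg, *args)
```

Calling `warning_once` twice with the same message and arguments logs a single record.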
This warning will now show for the QwenVL model series by default, since we modified the profiling logic in #24312.
Since we already return the following error message (without crashing the server) if the user actually sends a request longer than the context window, I think this warning is no longer necessary and would be rather confusing in V1.
openai.BadRequestError: Error code: 400 - {'error': {'message': 'The decoder prompt (length 131072) is longer than the maximum model length of 128000. Make sure that `max_model_len` is no smaller than the number of text tokens plus multimodal tokens. For image inputs, the number of image tokens depends on the number of images, and possibly their aspect ratios as well.', 'type': 'BadRequestError', 'param': None, 'code': 400}}
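The 400 above reflects a server-side budget check: a request is rejected when its text tokens plus multimodal tokens exceed `max_model_len`. A minimal sketch of that check (function name and message are illustrative, not vLLM's actual code):

```python
# Hypothetical sketch of the context-window validation described above.
# Names here are illustrative; vLLM's real check lives server-side.
def check_fits_context(num_text_tokens: int,
                       num_mm_tokens: int,
                       max_model_len: int) -> None:
    """Raise ValueError if text + multimodal tokens exceed the model length."""
    total = num_text_tokens + num_mm_tokens
    if total > max_model_len:
        raise ValueError(
            f"The decoder prompt (length {total}) is longer than the "
            f"maximum model length of {max_model_len}. Make sure that "
            f"`max_model_len` is no smaller than the number of text "
            f"tokens plus multimodal tokens."
        )
```

Because this validation returns a clean per-request error rather than crashing the server, the startup-time warning adds little value.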
Signed-off-by: charlifu <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
Purpose
As we're deprecating V0, some of the multimodal profiling warnings can be removed.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
Update `supported_models.md` and `examples` for a new model.