vllm-project / vllm Public

Fork 10.6k
Star 60.4k

Pull requests: vllm-project/vllm

Labels 70 Milestones 3

1,168 Open 13,927 Closed

Pull requests list

[Misc] Add VLLM_DISTRIBUTED_INIT_METHOD_OVERRIDE env var

#27162 opened Oct 19, 2025 by WoosukKwon

output type conversion fix

#27159 opened Oct 19, 2025 by jianyuh • Draft

[BugFix] Fix lazy imports involving outlines_core

Labels: ready, structured-output, v1

#27158 opened Oct 19, 2025 by 22quinn

5 tasks

[Feature] Pydantic validation for speculative.py

#27156 opened Oct 18, 2025 by Navya1707

Add auto max model len for available memory with --max-model-len -1

Labels: codex, documentation

#27155 opened Oct 18, 2025 by mgoin

[Chore] Separate out hashing utilities from vllm.utils

Labels: kv-connector, ready

#27151 opened Oct 18, 2025 by dongbo910220

[MM Encoder]: Refactor mm encoder attention interface and support attention mask

#27147 opened Oct 18, 2025 by Isotr0py • Draft

1 of 5 tasks

[torch.compile] Enable silu_mul_fp8_quant fusion without custom ops enabled

#27146 opened Oct 18, 2025 by ZJY0516

5 tasks

[Model][3/N] Improve all pooling task | Support chunked prefill with ALL pooling

Labels: frontend, v1

#27145 opened Oct 18, 2025 by noooop

5 tasks

[Bugfix] fixes the decoding metadata of dense mla's fp8 kvcache.

Labels: ci/build, v1

#27144 opened Oct 18, 2025 by sighingnow

[Core] Remove V0 executors

Labels: frontend, tpu

#27142 opened Oct 18, 2025 by njhill • Draft

[Core] CuteDSL MoE with Nvfp4 DeepEP dispatch

Labels: ci/build, v1

#27141 opened Oct 18, 2025 by wenscarl • Draft

5 tasks

[NIXL] use Host buffer to support TP_ratio > 1 for XPU

Labels: kv-connector

#27140 opened Oct 18, 2025 by xuechendi

5 tasks

[Fix][Spec Decode] Fix llama4 draft loading with different quantization

Labels: llama, speculative-decoding

#27136 opened Oct 18, 2025 by linzebing

3 of 5 tasks

feat: enable FlashInfer FP8 Blockscale on SM90

#27134 opened Oct 18, 2025 by djmmoss • Draft

1 of 3 tasks

[Bugfix] Fix incorrect kv cache metrics in grafana.json

Labels: documentation

#27133 opened Oct 17, 2025 by fangpings

5 tasks

Early exit for MoE LoRA kernels

Labels: ci/build, deepseek, gpt-oss, needs-rebase, qwen

#27131 opened Oct 17, 2025 by gnovack • Draft

5 tasks

[BugFix] bugfix for Flash Attention MLA with full cuda graph IMA following pr-25490

Labels: ready

#27128 opened Oct 17, 2025 by Daisy-Ma-coder

[compile] Enable sequence parallelism matching w/o custom ops enabled

Labels: torch.compile

#27126 opened Oct 17, 2025 by angelayi

vllm==v0.12.0/torch==2.9.0 compilation improvements

make flash_attn ViT upgrade opt-in

Labels: ci/build, ci-failure, qwen, rocm

#27124 opened Oct 17, 2025 by bradleyhd

[Kernels] Swap quant method

#27123 opened Oct 17, 2025 by bnellnm

[Bugfix] Fix allocation & free logic of SingleWriterShmRingBuffer

#27117 opened Oct 17, 2025 by imkero

5 tasks

[CI/Build] Add eval config for Qwen3-235B-A22B-Thinking-2507-FP8 and Qwen3-8B

Labels: ci/build, qwen

#27113 opened Oct 17, 2025 by hl475 • Draft

5 tasks

Add missing opentelemetry dependency to base docker image

Labels: ci/build

#27109 opened Oct 17, 2025 by Aymendje

3 of 5 tasks

[CI] Fix mypy for vllm/v1/core and vllm/v1/engine

Labels: ready

#27108 opened Oct 17, 2025 by yewentao256
