Pull requests: vllm-project/vllm

Pull requests list

output type conversion fix
#27159 opened Oct 19, 2025 by jianyuh Draft
[BugFix] Fix lazy imports involving outlines_core
Labels: ready, structured-output, v1
#27158 opened Oct 19, 2025 by 22quinn
5 tasks
[Feature] Pydantic validation for speculative.py
#27156 opened Oct 18, 2025 by Navya1707
Add auto max model len for available memory with --max-model-len -1
Labels: codex, documentation, v1
#27155 opened Oct 18, 2025 by mgoin
[Chore] Separate out hashing utilities from vllm.utils
Labels: kv-connector, ready, v1
#27151 opened Oct 18, 2025 by dongbo910220
[Core] Remove V0 executors
Labels: frontend, tpu, v1
#27142 opened Oct 18, 2025 by njhill Draft
[Fix][Spec Decode] Fix llama4 draft loading with different quantization
Labels: llama, speculative-decoding
#27136 opened Oct 18, 2025 by linzebing
3 of 5 tasks
feat: enable FlashInfer FP8 Blockscale on SM90
#27134 opened Oct 18, 2025 by djmmoss Draft
1 of 3 tasks
[Bugfix] Fix incorrect kv cache metrics in grafana.json
Labels: documentation
#27133 opened Oct 17, 2025 by fangpings
5 tasks
Early exit for MoE LoRA kernels
Labels: ci/build, deepseek, gpt-oss, needs-rebase, qwen
#27131 opened Oct 17, 2025 by gnovack Draft
5 tasks
[BugFix] bugfix for Flash Attention MLA with full cuda graph IMA following pr-25490
Labels: ready, v1
#27128 opened Oct 17, 2025 by Daisy-Ma-coder
make flash_attn ViT upgrade opt-in
Labels: ci/build, ci-failure, qwen, rocm
#27124 opened Oct 17, 2025 by bradleyhd
[Kernels] Swap quant method
#27123 opened Oct 17, 2025 by bnellnm
[Bugfix] Fix allocation & free logic of SingleWriterShmRingBuffer
#27117 opened Oct 17, 2025 by imkero
5 tasks
Add missing opentelemetry dependency to base docker image
Labels: ci/build
#27109 opened Oct 17, 2025 by Aymendje
3 of 5 tasks
[CI] Fix mypy for vllm/v1/core and vllm/v1/engine
Labels: ready, v1
#27108 opened Oct 17, 2025 by yewentao256