Pull requests: vllm-project/vllm
Pull requests list
Notice for deprecation of AutoAWQ
documentation
Improvements or additions to documentation
#26820
opened Oct 14, 2025 by
HDCharles
4 tasks done
AI Fix for: [Feature]: Add process_weights_after_loading to AttentionImpl
frontend
#26819
opened Oct 14, 2025 by
shanaya-Gupta
[Kernel][MoE] Add MoE tunings for GLM 4.6-FP8 and GLM 4.5 Air on NVidia B200
#26818
opened Oct 14, 2025 by
zklapow
3 of 5 tasks
[CI Failure] Fix tests with missing TinyLlama-1.1B-Chat-v1.0-FP8-e2e
llama
Related to Llama models
ready
ONLY add when PR is ready to merge/full CI is needed
#26816
opened Oct 14, 2025 by
mgoin
5 tasks
[Bugfix] Fix qwen3-omni audio truncation issue
qwen
Related to Qwen models
#26815
opened Oct 14, 2025 by
Isotr0py
1 of 5 tasks
[P/D] KV Load Failure Recovery/Abort Configuration
frontend
kv-connector
v1
#26813
opened Oct 14, 2025 by
wseaton
[Nixl] Add metrics to Prometheus-Grafana dashboard
kv-connector
v1
#26811
opened Oct 14, 2025 by
NickLucche
[Core] Use envs.__getattr__ for all Unify to environment variable access
multi-modality
Related to multi-modality (#4194)
v1
#26810
opened Oct 14, 2025 by
Jialin
3 of 5 tasks
[Docs] update README.md to display logo correctly and fix links
documentation
Improvements or additions to documentation
#26809
opened Oct 14, 2025 by
ddalgrande
3 of 5 tasks
[Feature] GatedDeltaNet Automatic Prefix Caching
qwen
Related to Qwen models
v1
#26807
opened Oct 14, 2025 by
simondanielsson
Draft
1 of 11 tasks
Fix seed reproducibility issue by adding output.copy_(out)
#26805
opened Oct 14, 2025 by
XuanofXXX
3 of 5 tasks
[Metrics] Refactor LoRA state tracking
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#26801
opened Oct 14, 2025 by
markmc
[Model] add kosmos2_5 for vllm
new-model
Requests to new models
#26800
opened Oct 14, 2025 by
yugeeklab
3 of 5 tasks
[UX] Fallback to native implementation when flashinfer sampler failed to compile
v1
#26799
opened Oct 14, 2025 by
Isotr0py
1 of 5 tasks
[Doc] ruff format remaining Python examples
documentation
Improvements or additions to documentation
#26795
opened Oct 14, 2025 by
DarkLight1337
5 tasks
make fp4 scaled_mm works for 5090 gpu
ci/build
#26793
opened Oct 14, 2025 by
XiaobingSuper
3 of 5 tasks
llama4_vision_rope: add HIP override to accept (q, k) and avoid (positions, q, k) mismatch
llama
Related to Llama models
#26790
opened Oct 14, 2025 by
hl475
5 tasks
[bugfix] remove unused parameters to reduce unnecessary vram usage
ready
ONLY add when PR is ready to merge/full CI is needed
#26789
opened Oct 14, 2025 by
ReinForce-II
3 of 5 tasks
[Feature] default --extra-body param to disable thinking in vllm bench serve
frontend
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#26784
opened Oct 14, 2025 by
lengrongfu
5 tasks
[Fix] Avoid UserWarning when creating tensors from base64 embeddings
documentation
Improvements or additions to documentation
#26782
opened Oct 14, 2025 by
mmangkad
5 tasks
[Bugfix] DeepSeek V3.2 MTP metadata & CUDA graph issues
deepseek
Related to DeepSeek models
speculative-decoding
v1
#26779
opened Oct 14, 2025 by
xiaohajiayou
[CI/Build][Bugfix] fix qutlass cmake error when set QUTLASS_SRC_DIR
bug
Something isn't working
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#26773
opened Oct 14, 2025 by
izhuhaoran
5 tasks