Insights: huggingface/text-generation-inference
Overview
1 Release published by 1 person
-
v3.3.0
published
May 9, 2025
4 Pull requests merged by 3 people
-
adjust the round_up_seq logic to align with prefill warmup phase on…
#3224 merged
May 12, 2025 -
change HPU warmup logic: seq length should be with exponential growth
#3217 merged
May 10, 2025 -
Prepare for 3.3.0
#3220 merged
May 9, 2025 -
Chunked Prefill VLM
#3188 merged
May 6, 2025
4 Pull requests opened by 4 people
-
Deepseek r1
#3211 opened
May 7, 2025 -
Update to Torch 2.7.0
#3221 opened
May 10, 2025 -
Refine logging for Gaudi warmup
#3222 opened
May 10, 2025 -
Enable Llama4 for gaudi backend
#3223 opened
May 11, 2025
2 Issues closed by 2 people
-
when HF_HUB_OFFLINE == 1, cache config becomes inconsistent
#3212 closed
May 8, 2025 -
Adapt the response_format closer to OpenAIs format
#3058 closed
May 7, 2025
6 Issues opened by 5 people
-
Launching a container with an unprivileged user
#3225 opened
May 12, 2025 -
Docker Image that works on RTX 5090
#3219 opened
May 9, 2025 -
Model request: RecurrentGemma
#3216 opened
May 9, 2025 -
Phi 4 Reasoning Not able to Start
#3215 opened
May 8, 2025 -
Strange output when using Structured output with Gemma 3 12b it
#3214 opened
May 8, 2025 -
Whether it supports Huawei Atlas300 graphics card?
#3213 opened
May 8, 2025
4 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
RuntimeError: Cannot load 'awq' weight when running Qwen2-VL-72B-Instruct-AWQ model
#2944 commented on
May 7, 2025 • 0 new comments -
Qwen 3 support
#3199 commented on
May 8, 2025 • 0 new comments -
Add support for phi-4-mini and phi-4-multimodal
#3071 commented on
May 9, 2025 • 0 new comments -
Function/tool calling never resolves
#2986 commented on
May 13, 2025 • 0 new comments