Eval bug: Qwen3 30B-A3B throwing assertion error with Vulkan backend #13233

Mushoz · 2025-05-01T11:38:33Z

Name and Version

[docker@104ba42db8f2 ~]$ llama-server --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 7900 XTX (RADV NAVI31) (radv) | uma: 0 | fp16: 1 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
version: 5234 (3e168be)
built with cc (GCC) 15.1.1 20250425 for x86_64-pc-linux-gnu

Operating systems

Linux

GGML backends

Vulkan

Hardware

7900XTX under archlinux with the radv vulkan driver

Models

Qwen3-30B-A3B

Problem description & steps to reproduce

I am using the following command:

llama-server --port 9001 --metrics --slots -m /models/Qwen3-30B-A3B-UD-Q4_K_XL.gguf -ngl 999 --ctx-size 32768 --no-context-shift

When I use openweb-ui for short inputs, the model works fine, but when using long inputs or the conversation gets too long llama.cpp throws an assertion error:

slot update_slots: id  0 | task 2322 | prompt done, n_past = 668, n_tokens = 609
/home/docker/llama.cpp/ggml/src/ggml-vulkan/ggml-vulkan.cpp:5059: GGML_ASSERT(nei0 * nei1 <= 3072) failed

First Bad Commit

N/A

Relevant log output

See above

The text was updated successfully, but these errors were encountered:

stduhpf · 2025-05-01T11:43:27Z

See #13164. As a workaround, you can try setting a small batch size (under 384 works for me).

Mushoz · 2025-05-01T12:04:39Z

Thanks for the pointer, closing as duplicate

Mushoz added the bug-unconfirmed label May 1, 2025

Mushoz closed this as completed May 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Eval bug: Qwen3 30B-A3B throwing assertion error with Vulkan backend #13233

Eval bug: Qwen3 30B-A3B throwing assertion error with Vulkan backend #13233

Mushoz commented May 1, 2025 •

edited

Loading

stduhpf commented May 1, 2025

Mushoz commented May 1, 2025

Eval bug: Qwen3 30B-A3B throwing assertion error with Vulkan backend #13233

Eval bug: Qwen3 30B-A3B throwing assertion error with Vulkan backend #13233

Comments

Mushoz commented May 1, 2025 • edited Loading

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

stduhpf commented May 1, 2025

Mushoz commented May 1, 2025

Mushoz commented May 1, 2025 •

edited

Loading