Eval bug: Qwen3 30B adds spaces to end of each line #13508

Closed
Nepherpitou opened this issue May 13, 2025 · 2 comments

@Nepherpitou

Name and Version

.\llamacpp\cuda12\llama-server.exe --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 3 CUDA devices:
Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes
Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
load_backend: loaded CUDA backend from C:\Users\unat\llm\llamacpp\cuda12\ggml-cuda.dll
load_backend: loaded RPC backend from C:\Users\unat\llm\llamacpp\cuda12\ggml-rpc.dll
load_backend: loaded CPU backend from C:\Users\unat\llm\llamacpp\cuda12\ggml-cpu-icelake.dll
version: 5338 (43dfd74)
built with MSVC 19.29.30159.0 for Windows AMD64

Operating systems

Windows

GGML backends

Vulkan

Hardware

Ryzen 9 7900X
Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes

Models

Qwen3 30B Q6_K_XL

Problem description & steps to reproduce

Start server command

./llamacpp/vulkan/llama-server.exe --jinja --reasoning-format deepseek --no-mmap --no-warmup --host 0.0.0.0 --port 5102 --metrics --slots -m ./models/Qwen3-30B-A3B-128K-UD-Q6_K_XL.gguf -ngl 99 --flash-attn --ctx-size 65536 -ctk q8_0 -ctv q8_0 --min-p 0 --top-k 20 --no-context-shift -dev VULKAN1,VULKAN2 -ts 100,100 -t 12 --log-colors

Post request

POST http://localhost:5102/v1/chat/completions
Content-Type: application/json

{
  "model": "qwen3-30b",
  "messages": [
    {
      "content": "Write twenty words. Each from new line.",
      "role": "user"
    }
  ],
  "stream_options": {
    "include_usage": true
  },
  "stream": false
}

Response

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "<think>\nOkay, the user wants me to write twenty words, each on a new line. Let me start by thinking of different categories to make sure the words are varied. Maybe start with some common nouns, then verbs, adjectives, maybe a few adverbs or other parts of speech.\n\nFirst, \"apple\" is a simple noun. Then \"run\" as a verb. \"Happy\" as an adjective. \"Quickly\" as an adverb. \"Tree\" is another noun. \"Jump\" as a verb. \"Beautiful\" for adjective. \"Silently\" as adverb. \"Ocean\" noun. \"Sing\" verb. \"Brave\" adjective. \"Suddenly\" adverb. \"Mountain\" noun. \"Laugh\" verb. \"Dark\" adjective. \"Gently\" adverb. \"Light\" noun or adjective. \"Write\" verb. \"Strong\" adjective. \"Forever\" adverb or noun.\n\nWait, I need to check if each word is from a new line. Let me list them out one by one. Make sure there are exactly twenty. Also, avoid repeating the same parts of speech too much. Maybe mix them up. Let me count: apple, run, happy, quickly, tree, jump, beautiful, silently, ocean, sing, brave, suddenly, mountain, laugh, dark, gently, light, write, strong, forever. That's twenty. Each is on a new line. Should be okay. I'll present them as requested.\n</think>\n\napple  \nrun  \nhappy  \nquickly  \ntree  \njump  \nbeautiful  \nsilently  \nocean  \nsing  \nbrave  \nsuddenly  \nmountain  \nlaugh  \ndark  \ngently  \nlight  \nwrite  \nstrong  \nforever"
      }
    }
  ],
  "created": 1747143882,
  "model": "qwen3-30b",
  "system_fingerprint": "b5338-43dfd741",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 355,
    "prompt_tokens": 17,
    "total_tokens": 372
  },
  "id": "chatcmpl-BaFCh4keAPfFfbfDGrkLl20HMI4apMeO",
  "timings": {
    "prompt_n": 1,
    "prompt_ms": 146.114,
    "prompt_per_token_ms": 146.114,
    "prompt_per_second": 6.843971145817649,
    "predicted_n": 355,
    "predicted_ms": 5332.55,
    "predicted_per_token_ms": 15.021267605633803,
    "predicted_per_second": 66.57227780330236
  }
}

What's wrong

The response content after the thinking block is: \napple  \nrun  \nhappy  \nquickly  \ntree  \njump  \nbeautiful  \nsilently  \nocean  \nsing  \nbrave  \nsuddenly  \nmountain  \nlaugh  \ndark  \ngently  \nlight  \nwrite  \nstrong  \nforever. There are two spaces before each newline character. Every time, everywhere.
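If the trailing spaces are unwanted, they can be stripped on the client side after the response arrives. A minimal sketch (an illustrative helper, not part of llama.cpp or its API):

```python
import re

def strip_trailing_spaces(text: str) -> str:
    """Remove spaces/tabs immediately preceding each newline."""
    return re.sub(r"[ \t]+(?=\n)", "", text)

content = "apple  \nrun  \nhappy  \n"
print(repr(strip_trailing_spaces(content)))  # 'apple\nrun\nhappy\n'
```

This only post-processes the text; it does not change what the model generates.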

First Bad Commit

No response

Relevant log output

srv  params_from_: Chat format: Content-only
slot launch_slot_: id  0 | task 22047 | processing task
slot update_slots: id  0 | task 22047 | new prompt, n_ctx_slot = 65536, n_keep = 0, n_prompt_tokens = 17
slot update_slots: id  0 | task 22047 | need to evaluate at least 1 token to generate logits, n_past = 17, n_prompt_tokens = 17
slot update_slots: id  0 | task 22047 | kv cache rm [16, end)
slot update_slots: id  0 | task 22047 | prompt processing progress, n_past = 17, n_tokens = 1, progress = 0.058824
slot update_slots: id  0 | task 22047 | prompt done, n_past = 17, n_tokens = 1
slot      release: id  0 | task 22047 | stop processing: n_past = 454, truncated = 0
slot print_timing: id  0 | task 22047 |
prompt eval time =     145.62 ms /     1 tokens (  145.62 ms per token,     6.87 tokens per second)
       eval time =    6362.02 ms /   438 tokens (   14.53 ms per token,    68.85 tokens per second)
      total time =    6507.64 ms /   439 tokens
srv  update_slots: all slots are idle
srv  log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
@pwilkin
Contributor

pwilkin commented May 13, 2025

It's a "feature" of the template, I think.

@cmdntfnd

cmdntfnd commented May 13, 2025

A lot of models these days focus on outputting nicely formatted Markdown. Markdown requires two spaces before `\n` to produce a proper line break, as explained here:
https://daringfireball.net/projects/markdown/syntax#p

> When you do want to insert a `<br />` break tag using Markdown, you end a line with two or more spaces, then type return.

Not a bug.
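The two trailing spaces in the response above are exactly this hard-break marker. A quick way to verify the pattern in a captured response (illustrative snippet, not part of any tooling):

```python
def is_markdown_hard_break(line: str) -> bool:
    # Markdown renders a line ending in two or more spaces as a <br /> hard break.
    return line.endswith("  ")

content = "apple  \nrun  \nhappy"
print([is_markdown_hard_break(l) for l in content.split("\n")])  # [True, True, False]
```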

@CISC CISC closed this as completed May 14, 2025