You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
POST http://localhost:5102/v1/chat/completionsContent-Type: application/json
{
"model": "qwen3-30b",
"messages": [
{
"content": "Write twenty words. Each from new line.",
"role": "user"
}
],
"stream_options": {
"include_usage": true
},
"stream": false
}
Response
{
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"role": "assistant",
"content": "<think>\nOkay, the user wants me to write twenty words, each on a new line. Let me start by thinking of different categories to make sure the words are varied. Maybe start with some common nouns, then verbs, adjectives, maybe a few adverbs or other parts of speech.\n\nFirst, \"apple\" is a simple noun. Then \"run\" as a verb. \"Happy\" as an adjective. \"Quickly\" as an adverb. \"Tree\" is another noun. \"Jump\" as a verb. \"Beautiful\" for adjective. \"Silently\" as adverb. \"Ocean\" noun. \"Sing\" verb. \"Brave\" adjective. \"Suddenly\" adverb. \"Mountain\" noun. \"Laugh\" verb. \"Dark\" adjective. \"Gently\" adverb. \"Light\" noun or adjective. \"Write\" verb. \"Strong\" adjective. \"Forever\" adverb or noun.\n\nWait, I need to check if each word is from a new line. Let me list them out one by one. Make sure there are exactly twenty. Also, avoid repeating the same parts of speech too much. Maybe mix them up. Let me count: apple, run, happy, quickly, tree, jump, beautiful, silently, ocean, sing, brave, suddenly, mountain, laugh, dark, gently, light, write, strong, forever. That's twenty. Each is on a new line. Should be okay. I'll present them as requested.\n</think>\n\napple \nrun \nhappy \nquickly \ntree \njump \nbeautiful \nsilently \nocean \nsing \nbrave \nsuddenly \nmountain \nlaugh \ndark \ngently \nlight \nwrite \nstrong \nforever"
}
}
],
"created": 1747143882,
"model": "qwen3-30b",
"system_fingerprint": "b5338-43dfd741",
"object": "chat.completion",
"usage": {
"completion_tokens": 355,
"prompt_tokens": 17,
"total_tokens": 372
},
"id": "chatcmpl-BaFCh4keAPfFfbfDGrkLl20HMI4apMeO",
"timings": {
"prompt_n": 1,
"prompt_ms": 146.114,
"prompt_per_token_ms": 146.114,
"prompt_per_second": 6.843971145817649,
"predicted_n": 355,
"predicted_ms": 5332.55,
"predicted_per_token_ms": 15.021267605633803,
"predicted_per_second": 66.57227780330236
}
}
What's wrong
After there is response content: \napple \nrun \nhappy \nquickly \ntree \njump \nbeautiful \nsilently \nocean \nsing \nbrave \nsuddenly \nmountain \nlaugh \ndark \ngently \nlight \nwrite \nstrong \nforever. And there are 2 spaces before each newline character. Everytime, everywhere.
First Bad Commit
No response
Relevant log output
srv params_from_: Chat format: Content-only
slot launch_slot_: id 0 | task 22047 | processing task
slot update_slots: id 0 | task 22047 | new prompt, n_ctx_slot = 65536, n_keep = 0, n_prompt_tokens = 17
slot update_slots: id 0 | task 22047 | need to evaluate at least 1 token to generate logits, n_past = 17, n_prompt_tokens = 17
slot update_slots: id 0 | task 22047 | kv cache rm [16, end)
slot update_slots: id 0 | task 22047 | prompt processing progress, n_past = 17, n_tokens = 1, progress = 0.058824
slot update_slots: id 0 | task 22047 | prompt done, n_past = 17, n_tokens = 1
slot release: id 0 | task 22047 | stop processing: n_past = 454, truncated = 0
slot print_timing: id 0 | task 22047 |
prompt evaltime= 145.62 ms / 1 tokens ( 145.62 ms per token, 6.87 tokens per second)
evaltime= 6362.02 ms / 438 tokens ( 14.53 ms per token, 68.85 tokens per second)
total time= 6507.64 ms / 439 tokens
srv update_slots: all slots are idle
srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
The text was updated successfully, but these errors were encountered:
Name and Version
Operating systems
Windows
GGML backends
Vulkan
Hardware
Ryzen 9 7900X
Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Models
Qwen3 30B Q6_K_XL
Problem description & steps to reproduce
Start server command
Post request
Response
What's wrong
After there is response content:
\napple \nrun \nhappy \nquickly \ntree \njump \nbeautiful \nsilently \nocean \nsing \nbrave \nsuddenly \nmountain \nlaugh \ndark \ngently \nlight \nwrite \nstrong \nforever
. And there are 2 spaces before each newline character. Everytime, everywhere.First Bad Commit
No response
Relevant log output
The text was updated successfully, but these errors were encountered: