Misc. bug: since b4800 llama-cli does not prompt and llama-bench shows no results #13452

Closed
pabpas opened this issue May 11, 2025 · 10 comments

pabpas commented May 11, 2025

Name and Version

Last working version:

$ llama-cli --version
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | matrix cores: none
version: 4799 (14dec0c2)
built with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

llama-cli

Command line

Problem description & steps to reproduce

Starting with b4800, llama-cli does not reach the prompt input; it stops here:

$ llama-cli -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
[...]
main: interactive mode on.
sampler seed: 507615108
sampler params: 
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
        top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist 
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

and llama-bench shows no results (also no error):

$ llama-bench -m llama-2-7b.Q4_0.gguf
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | matrix cores: none

First Bad Commit

cc473ca

Relevant log output

slaren commented May 11, 2025

Do you have any files with special or non-ASCII characters in the directory?

pabpas commented May 11, 2025

Your question reminded me of #11198.
I had that one on Debian bookworm, but it went away after upgrading to trixie.

Anyway, there were many files in the directory and I could not spot any non-ASCII characters, but to make sure I put the .gguf in a directory of its own. Unfortunately the outcome is the same.
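(As a rough sanity check for non-ASCII file names, something along these lines can be run in the model directory; the grep pattern is just one way to do it:)

$ ls -A | grep -P '[^\x00-\x7F]'   # no output means no file name contains non-ASCII bytes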

slaren commented May 12, 2025

Please try to obtain a callstack of the crash:

  • Make a debug build by adding -DCMAKE_BUILD_TYPE=Debug to the cmake command line
  • Run gdb --ex run --ex bt --args llama-cli -m <rest of the command line>
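
Put together, roughly (keep whatever backend flags you normally build with, e.g. -DGGML_VULKAN=ON; the model name is a placeholder):

$ cmake -S . -B build -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Debug
$ cmake --build build
$ gdb --ex run --ex bt --args ./build/bin/llama-cli -m <model>.gguf -ngl 37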

pabpas commented May 12, 2025

There is no crash; it just stays there, and I am not able to input anything.

Built like this:

$ cmake -S . -B build -G Ninja -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Debug -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON 
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native 
-- Vulkan found
-- GL_KHR_cooperative_matrix supported by glslc
-- GL_NV_cooperative_matrix2 supported by glslc
-- GL_EXT_integer_dot_product supported by glslc
-- GL_EXT_bfloat16 not supported by glslc
-- Including Vulkan backend
-- Configuring done (0.6s)
-- Generating done (0.1s)
-- Build files have been written to: /home/user/src/llama.cpp-vulkan/build

$ cmake --build build --config Release
[2/164] Generating build details from Git
-- Found Git: /usr/bin/git (found version "2.47.2")
[35/164] Generate vulkan shaders
ggml_vulkan: Generating and compiling shaders to SPIR-V
[113/164] Building CXX object ggml/src/g...eFiles/ggml-vulkan.dir/ggml-vulkan.cpp.o
/home/user/src/llama.cpp-vulkan/ggml/src/ggml-vulkan/ggml-vulkan.cpp: In function ‘vk_pipeline ggml_vk_guess_matmul_pipeline(ggml_backend_vk_context*, vk_matmul_pipeline&, uint32_t, uint32_t, bool, ggml_type, ggml_type)’:
/home/user/src/llama.cpp-vulkan/ggml/src/ggml-vulkan/ggml-vulkan.cpp:4428:175: warning: unused parameter ‘src1_type’ [-Wunused-parameter]
 4428 | static vk_pipeline ggml_vk_guess_matmul_pipeline(ggml_backend_vk_context * ctx, vk_matmul_pipeline& mmp, uint32_t m, uint32_t n, bool aligned, ggml_type src0_type, ggml_type src1_type) {
      |                                                                                                                                                                     ~~~~~~~~~~^~~~~~~~~
[164/164] Linking CXX executable bin/llama-server

$ sudo cmake --install build --config Release

gdb output:

$ gdb --ex run --ex bt --args llama-cli -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
GNU gdb (Debian 16.3-1) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from llama-cli...
Starting program: /usr/local/bin/llama-cli -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffe69ff6c0 (LWP 9042)]
[New Thread 0x7fffe60bd6c0 (LWP 9043)]
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | int dot: 1 | matrix cores: none
register_backend: registered backend Vulkan (1 devices)
register_device: registered device Vulkan0 (Intel(R) Graphics (BMG G21))
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (12th Gen Intel(R) Core(TM) i7-12700K)
[New Thread 0x7fffe881e6c0 (LWP 9044)]
build: 5345 (3eac2093) with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu (debug)
main: llama backend init
main: load the model and apply lora adapter, if any
llama_model_load_from_file_impl: using device Vulkan0 (Intel(R) Graphics (BMG G21)) - 12216 MiB free
llama_model_loader: loaded meta data with 37 key-value pairs and 327 tensors from Ministral-8B-Instruct-2410.q8.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Ministral 8B Instruct 2410
llama_model_loader: - kv   3:                            general.version str              = 2410
llama_model_loader: - kv   4:                           general.finetune str              = Instruct
llama_model_loader: - kv   5:                           general.basename str              = Ministral
llama_model_loader: - kv   6:                         general.size_label str              = 8B
llama_model_loader: - kv   7:                            general.license str              = other
llama_model_loader: - kv   8:                       general.license.name str              = mrl
llama_model_loader: - kv   9:                       general.license.link str              = https://mistral.ai/licenses/MRL-0.1.md
llama_model_loader: - kv  10:                          general.languages arr[str,10]      = ["en", "fr", "de", "es", "it", "pt", ...
llama_model_loader: - kv  11:                          llama.block_count u32              = 36
llama_model_loader: - kv  12:                       llama.context_length u32              = 32768
llama_model_loader: - kv  13:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv  14:                  llama.feed_forward_length u32              = 12288
llama_model_loader: - kv  15:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  16:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  17:                       llama.rope.freq_base f32              = 100000000.000000
llama_model_loader: - kv  18:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  19:                 llama.attention.key_length u32              = 128
llama_model_loader: - kv  20:               llama.attention.value_length u32              = 128
llama_model_loader: - kv  21:                           llama.vocab_size u32              = 131072
llama_model_loader: - kv  22:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv  23:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  24:                         tokenizer.ggml.pre str              = tekken
llama_model_loader: - kv  25:                      tokenizer.ggml.tokens arr[str,131072]  = ["<unk>", "<s>", "</s>", "[INST]", "[...
llama_model_loader: - kv  26:                  tokenizer.ggml.token_type arr[i32,131072]  = [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ...
llama_model_loader: - kv  27:                      tokenizer.ggml.merges arr[str,269443]  = ["Ġ Ġ", "Ġ t", "e r", "i n", "Ġ �...
llama_model_loader: - kv  28:                tokenizer.ggml.bos_token_id u32              = 1
llama_model_loader: - kv  29:                tokenizer.ggml.eos_token_id u32              = 2
llama_model_loader: - kv  30:            tokenizer.ggml.unknown_token_id u32              = 0
llama_model_loader: - kv  31:               tokenizer.ggml.add_bos_token bool             = true
llama_model_loader: - kv  32:               tokenizer.ggml.add_eos_token bool             = false
llama_model_loader: - kv  33:                    tokenizer.chat_template str              = {%- if messages[0]["role"] == "system...
llama_model_loader: - kv  34:            tokenizer.ggml.add_space_prefix bool             = false
llama_model_loader: - kv  35:               general.quantization_version u32              = 2
llama_model_loader: - kv  36:                          general.file_type u32              = 7
llama_model_loader: - type  f32:   73 tensors
llama_model_loader: - type q8_0:  254 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type   = Q8_0
print_info: file size   = 7.94 GiB (8.50 BPW) 
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
load: special tokens cache size = 1000
load: token to piece cache size = 0.8498 MB
print_info: arch             = llama
print_info: vocab_only       = 0
print_info: n_ctx_train      = 32768
print_info: n_embd           = 4096
print_info: n_layer          = 36
print_info: n_head           = 32
print_info: n_head_kv        = 8
print_info: n_rot            = 128
print_info: n_swa            = 0
print_info: n_swa_pattern    = 1
print_info: n_embd_head_k    = 128
print_info: n_embd_head_v    = 128
print_info: n_gqa            = 4
print_info: n_embd_k_gqa     = 1024
print_info: n_embd_v_gqa     = 1024
print_info: f_norm_eps       = 0.0e+00
print_info: f_norm_rms_eps   = 1.0e-05
print_info: f_clamp_kqv      = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale    = 0.0e+00
print_info: f_attn_scale     = 0.0e+00
print_info: n_ff             = 12288
print_info: n_expert         = 0
print_info: n_expert_used    = 0
print_info: causal attn      = 1
print_info: pooling type     = 0
print_info: rope type        = 0
print_info: rope scaling     = linear
print_info: freq_base_train  = 100000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn  = 32768
print_info: rope_finetuned   = unknown
print_info: ssm_d_conv       = 0
print_info: ssm_d_inner      = 0
print_info: ssm_d_state      = 0
print_info: ssm_dt_rank      = 0
print_info: ssm_dt_b_c_rms   = 0
print_info: model type       = 8B
print_info: model params     = 8.02 B
print_info: general.name     = Ministral 8B Instruct 2410
print_info: vocab type       = BPE
print_info: n_vocab          = 131072
print_info: n_merges         = 269443
print_info: BOS token        = 1 '<s>'
print_info: EOS token        = 2 '</s>'
print_info: UNK token        = 0 '<unk>'
print_info: LF token         = 1010 'Ċ'
print_info: EOG token        = 2 '</s>'
print_info: max token length = 150
load_tensors: loading model tensors, this can take a while... (mmap = true)
[New Thread 0x7fffe73fd6c0 (LWP 9051)]
load_tensors: offloading 36 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 37/37 layers to GPU
load_tensors:      Vulkan0 model buffer size =  7583.14 MiB
load_tensors:   CPU_Mapped model buffer size =   544.00 MiB
.........................................................................................
llama_context: constructing llama_context
llama_context: n_seq_max     = 1
llama_context: n_ctx         = 4096
llama_context: n_ctx_per_seq = 4096
llama_context: n_batch       = 2048
llama_context: n_ubatch      = 512
llama_context: causal_attn   = 1
llama_context: flash_attn    = 0
llama_context: freq_base     = 100000000.0
llama_context: freq_scale    = 1
llama_context: n_ctx_per_seq (4096) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
llama_context: Vulkan_Host  output buffer size =     0.50 MiB
llama_kv_cache_unified: kv_size = 4096, type_k = 'f16', type_v = 'f16', n_layer = 36, can_shift = 1, padding = 32
llama_kv_cache_unified:    Vulkan0 KV buffer size =   576.00 MiB
llama_kv_cache_unified: KV self size  =  576.00 MiB, K (f16):  288.00 MiB, V (f16):  288.00 MiB
llama_context:    Vulkan0 compute buffer size =   296.00 MiB
llama_context: Vulkan_Host compute buffer size =    16.01 MiB
llama_context: graph nodes  = 1230
llama_context: graph splits = 2
common_init_from_params: setting dry_penalty_last_n to ctx_size = 4096
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
[New Thread 0x7fffe7bff6c0 (LWP 9053)]
[New Thread 0x7fffe48b86c0 (LWP 9054)]
[New Thread 0x7fffd66fa6c0 (LWP 9055)]
[New Thread 0x7fffd5ef96c0 (LWP 9056)]
[New Thread 0x7fffd56f86c0 (LWP 9057)]
[New Thread 0x7fffd4ef76c0 (LWP 9058)]
[New Thread 0x7fff7f05c6c0 (LWP 9059)]
[New Thread 0x7fff7e85b6c0 (LWP 9060)]
[New Thread 0x7fff7e05a6c0 (LWP 9061)]
[New Thread 0x7fff7d8596c0 (LWP 9062)]
[Thread 0x7fffe7bff6c0 (LWP 9053) exited]
[New Thread 0x7fff7d0586c0 (LWP 9063)]
[New Thread 0x7fff7c8576c0 (LWP 9064)]
[New Thread 0x7fff4ffff6c0 (LWP 9065)]
[New Thread 0x7fff4f7fe6c0 (LWP 9066)]
[Thread 0x7fffe48b86c0 (LWP 9054) exited]
[Thread 0x7fff7d0586c0 (LWP 9063) exited]
[Thread 0x7fffd4ef76c0 (LWP 9058) exited]
[Thread 0x7fffd56f86c0 (LWP 9057) exited]
[New Thread 0x7fff4effd6c0 (LWP 9067)]
[Thread 0x7fff7e85b6c0 (LWP 9060) exited]
[Thread 0x7fff7f05c6c0 (LWP 9059) exited]
[Thread 0x7fff7e05a6c0 (LWP 9061) exited]
[Thread 0x7fff7d8596c0 (LWP 9062) exited]
[Thread 0x7fffd5ef96c0 (LWP 9056) exited]
[New Thread 0x7fff4e7fc6c0 (LWP 9068)]
[Thread 0x7fff4ffff6c0 (LWP 9065) exited]
[Thread 0x7fffd66fa6c0 (LWP 9055) exited]
[Thread 0x7fff4f7fe6c0 (LWP 9066) exited]
[Thread 0x7fff7c8576c0 (LWP 9064) exited]
[Thread 0x7fff4e7fc6c0 (LWP 9068) exited]
[Thread 0x7fff4effd6c0 (LWP 9067) exited]
main: llama threadpool init, n_threads = 8
main: chat template is available, enabling conversation mode (disable it with -no-cnv)
main: chat template example:
[INST]You are a helpful assistant

Hello[/INST]Hi there</s>[INST]How are you?[/INST]

system_info: n_threads = 8 (n_threads_batch = 8) / 20 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | 

main: interactive mode on.
sampler seed: 1577293089
sampler params: 
        repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
        dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
        top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
        mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-n-sigma -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist 
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1

== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to the AI.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.
 - Not using system message. To change it, set a different value via -sys PROMPT

slaren commented May 12, 2025

So it gets stuck, but it doesn't crash or do anything else? You should still be able to get a callstack if you press Ctrl+C.

pabpas commented May 12, 2025

After pressing Ctrl+C:

Thread 1 "llama-cli" received signal SIGINT, Interrupt.
0x00007ffff55b49ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#0  0x00007ffff55b49ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff55a9668 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007ffff55a96ad in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007ffff561dea6 in read () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007ffff559eb8d in _IO_wfile_underflow ()
   from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007ffff559d2db in _IO_wdefault_uflow ()
   from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007ffff559b785 in getwchar () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00005555557522d3 in console::getchar32 ()
    at /home/user/src/llama.cpp-vulkan/common/console.cpp:197
#8  0x00005555557526f3 in console::readline_advanced (line="", 
--Type <RET> for more, q to quit, c to continue without paging--
    multiline_input=false)
    at /home/user/src/llama.cpp-vulkan/common/console.cpp:368
#9  0x0000555555752b96 in console::readline (line="", multiline_input=false)
    at /home/user/src/llama.cpp-vulkan/common/console.cpp:501
#10 0x00005555555d904d in main (argc=5, argv=0x7fffffffdaa8)
    at /home/user/src/llama.cpp-vulkan/tools/main/main.cpp:857
(gdb) q

Here is also the output of llama-bench, which shows no results but exits without error:

$ gdb --ex run --ex bt --args llama-bench -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
GNU gdb (Debian 16.3-1) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from llama-bench...
Starting program: /usr/local/bin/llama-bench -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: asserts enabled, performance may be affected
warning: debug build, performance may be affected
[New Thread 0x7fffe69ff6c0 (LWP 9974)]
[New Thread 0x7fffe60bd6c0 (LWP 9975)]
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Graphics (BMG G21) (Intel open-source Mesa driver) | uma: 0 | fp16: 1 | warp size: 32 | shared memory: 131072 | int dot: 1 | matrix cores: none
register_backend: registered backend Vulkan (1 devices)
register_device: registered device Vulkan0 (Intel(R) Graphics (BMG G21))
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (12th Gen Intel(R) Core(TM) i7-12700K)
[New Thread 0x7fffe801c6c0 (LWP 9983)]
[New Thread 0x7fffe58bc6c0 (LWP 9984)]
[New Thread 0x7fffe50bb6c0 (LWP 9985)]
[New Thread 0x7fffe48ba6c0 (LWP 9986)]
[New Thread 0x7fffa6d3e6c0 (LWP 9987)]
[New Thread 0x7fffa653d6c0 (LWP 9988)]
[New Thread 0x7fffa5d3c6c0 (LWP 9989)]
[Thread 0x7fffe58bc6c0 (LWP 9984) exited]
[New Thread 0x7fffa553b6c0 (LWP 9990)]
[New Thread 0x7fffa4d3a6c0 (LWP 9991)]
[New Thread 0x7fff8ffff6c0 (LWP 9992)]
[Thread 0x7fffa5d3c6c0 (LWP 9989) exited]
[Thread 0x7fffa553b6c0 (LWP 9990) exited]
[Thread 0x7fffa4d3a6c0 (LWP 9991) exited]
[Thread 0x7fffa653d6c0 (LWP 9988) exited]
[Thread 0x7fffa6d3e6c0 (LWP 9987) exited]
[Thread 0x7fffe50bb6c0 (LWP 9985) exited]
[New Thread 0x7fff8f7fe6c0 (LWP 9993)]
[Thread 0x7fff8ffff6c0 (LWP 9992) exited]
[New Thread 0x7fff8effd6c0 (LWP 9994)]
[New Thread 0x7fff8e7fc6c0 (LWP 9995)]
[New Thread 0x7fff8dffb6c0 (LWP 9996)]
[Thread 0x7fff8f7fe6c0 (LWP 9993) exited]
[Thread 0x7fff8effd6c0 (LWP 9994) exited]
[New Thread 0x7fff8d7fa6c0 (LWP 9997)]
[New Thread 0x7fff8cff96c0 (LWP 9998)]
[Thread 0x7fff8dffb6c0 (LWP 9996) exited]
[Thread 0x7fff8e7fc6c0 (LWP 9995) exited]
[New Thread 0x7fff83fff6c0 (LWP 9999)]
[New Thread 0x7fff837fe6c0 (LWP 10000)]
[Thread 0x7fff8d7fa6c0 (LWP 9997) exited]
[Thread 0x7fff8cff96c0 (LWP 9998) exited]
[Thread 0x7fff837fe6c0 (LWP 10000) exited]
[Thread 0x7fff83fff6c0 (LWP 9999) exited]
[Thread 0x7fffe48ba6c0 (LWP 9986) exited]
[New Thread 0x7fff837fe6c0 (LWP 10003)]
[New Thread 0x7fff83fff6c0 (LWP 10004)]
[Thread 0x7fff837fe6c0 (LWP 10003) exited]
[New Thread 0x7fff8cff96c0 (LWP 10005)]
[Thread 0x7fff83fff6c0 (LWP 10004) exited]
[Thread 0x7fff8cff96c0 (LWP 10005) exited]
[New Thread 0x7fff8cff96c0 (LWP 10012)]
[Thread 0x7fff8cff96c0 (LWP 10012) exited]
[Thread 0x7fffe801c6c0 (LWP 9983) exited]
[Thread 0x7fffe60bd6c0 (LWP 9975) exited]
[Thread 0x7fffe69ff6c0 (LWP 9974) exited]
[Inferior 1 (process 9971) exited normally]
No stack.
(gdb) q

slaren commented May 12, 2025

The first case shows that it is waiting on getwchar for your input, so that seems to be working as expected; you have to type the first line of the dialog. The llama-bench result makes no sense to me: I don't know what could cause the process to exit without error and print nothing. You could try setting a breakpoint on exit with catch syscall exit_group, then use bt to print the callstack.
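
Following the same pattern as before, roughly (whether the catchpoint is hit depends on how the process actually exits):

$ gdb --ex 'catch syscall exit_group' --ex run --ex bt --args llama-bench -m Ministral-8B-Instruct-2410.q8.gguf -ngl 37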

pabpas commented May 12, 2025

It all started with this commit: cc473ca
Any clue there?

slaren commented May 12, 2025

No, I don't see any code there that could cause this.

pabpas commented May 13, 2025

Today Debian trixie updated some Mesa libs from 25.0.4-1 to 25.0.5-1. After recompiling current master I am no longer able to reproduce this. It works as expected, so closing.

Thanks for your support @slaren!

pabpas closed this as completed May 13, 2025