I am getting the following error/segfault
when submitting an image to ggml-org/Qwen2.5-VL-7B-Instruct-GGUF,
using llama-server as the backend and the Python llm package as the client.
The same image is processed perfectly fine with
ggml-org/gemma-3-4b-it-GGUF
and
ggml-org/SmolVLM2-2.2B-Instruct-GGUF.
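For reference, a minimal stdlib-only way to reproduce the request (the original report used the Python llm package; this is an equivalent sketch against llama-server's OpenAI-compatible /v1/chat/completions endpoint, with the host/port assumed to be the defaults):

```python
import base64
import json
import urllib.request

SERVER = "http://localhost:8080"  # assumed default llama-server address

def build_payload(image_path: str, prompt: str) -> dict:
    """Build an OpenAI-style chat payload with the image inlined as a data URI."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }]
    }

def describe(image_path: str, prompt: str = "Describe this image.") -> str:
    """POST the image to llama-server and return the model's reply."""
    req = urllib.request.Request(
        f"{SERVER}/v1/chat/completions",
        data=json.dumps(build_payload(image_path, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # With Qwen2.5-VL this request triggers the failed allocation below;
    # with the gemma-3 and SmolVLM2 models it completes normally.
    print(describe("neocube-one-layer-pattern.jpg"))
```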
This is the output of "identify" on the image file:
neocube-one-layer-pattern.jpg JPEG 2592x1944 2592x1944+0+0 8-bit sRGB 858245B 0.000u 0:00.000
And here is the output I get from llama-server before it segfaults.
Note that it tries to allocate about 44 GB of RAM, which likely points to a bug somewhere.
slot launch_slot_: id 0 | task 0 | processing task
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 4096, n_keep = 0, n_prompt_tokens = 11
slot update_slots: id 0 | task 0 | kv cache rm [0, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 4, n_tokens = 4, progress = 0.363636
encoding image or slice...
slot update_slots: id 0 | task 0 | kv cache rm [4, end)
srv process_chun: processing image...
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
ggml_aligned_malloc: insufficient memory (attempted to allocate 44668.09 MB)
ggml_backend_cpu_buffer_type_alloc_buffer: failed to allocate buffer of size 46837887616
ggml_gallocr_reserve_n: failed to allocate CPU buffer of size 46837887616
make: *** [Makefile:20: llm-api] Segmentation fault
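As a sanity check on the log, the buffer size reported in bytes is consistent with the MiB figure from ggml_aligned_malloc (ggml's "MB" is binary mebibytes), i.e. roughly 43.6 GiB:

```python
# Failed buffer size from the ggml_backend_cpu_buffer_type_alloc_buffer line.
buf_bytes = 46837887616

# ggml reports the same allocation as 44668.09 "MB" (actually MiB, 2**20 bytes).
mib = buf_bytes / 2**20
gib = buf_bytes / 2**30

print(f"{mib:.2f} MiB")  # 44668.09 MiB, matching the ggml_aligned_malloc message
print(f"{gib:.1f} GiB")  # ~43.6 GiB, the "44GB" mentioned above
```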