Releases: ggml-org/llama.cpp
b5410
b5409
server : do not return error out of context (with ctx shift disabled)…
b5406
releases : use arm version of curl for arm releases (#13592)
b5405
metal : add FA-vec kernel for head size 64 (#13583) ggml-ci
b5404
llama : print hint when loading a model when no backends are loaded (…
b5402
sycl : fixed compilation warnings (#13582)
b5401
minja: sync (qwen3) (#13573)
* minja: sync https://github.com/google/minja/commit/f06140fa52fd140fe38e531ec373d8dc9c86aa06
  - https://github.com/google/minja/pull/67 (@grf53)
  - https://github.com/google/minja/pull/66 (@taha-yassine)
  - https://github.com/google/minja/pull/63 (@grf53)
  - https://github.com/google/minja/pull/58
Co-authored-by: ochafik <[email protected]>
b5400
gguf : use ggml log system (#13571)
* gguf : use ggml log system
* llama : remove unnecessary new lines in exception messages
b5395
sycl: use oneDNN for matrix multiplication (#12972)
b5394
llama-bench : fix -ot with dl backends (#13563)