Releases: ggml-org/llama.cpp

b5410

16 May 21:16
3e0be1c
llguidance : official v0.7.20 release (no actual changes) [noci] (#13…

b5409

16 May 20:18
6aa892e
server : do not return error out of context (with ctx shift disabled)…

b5406

16 May 18:42
415e40a
releases : use arm version of curl for arm releases (#13592)

b5405

16 May 18:23
654a677
metal : add FA-vec kernel for head size 64 (#13583)

ggml-ci

b5404

16 May 15:16
5364ae4
llama : print hint when loading a model when no backends are loaded (…

b5402

16 May 10:36
0a338ed
sycl : fixed compilation warnings (#13582)

b5401

15 May 23:12
bc098c3
minja: sync (qwen3) (#13573)

* minja: sync https://github.com/google/minja/commit/f06140fa52fd140fe38e531ec373d8dc9c86aa06

- https://github.com/google/minja/pull/67 (@grf53)
- https://github.com/google/minja/pull/66 (@taha-yassine)
- https://github.com/google/minja/pull/63 (@grf53)
- https://github.com/google/minja/pull/58

---------

Co-authored-by: ochafik

b5400

15 May 17:35
c6a2c9e
gguf : use ggml log system (#13571)

* gguf : use ggml log system

* llama : remove unnecessary new lines in exception messages

b5395

15 May 15:19
9c404ed
sycl: use oneDNN for matrices multiplication (#12972)

b5394

15 May 14:10
6c8b915
llama-bench : fix -ot with dl backends (#13563)