Releases: ggml-org/llama.cpp

b5302 · 07 May 10:50 · 39e73ae
common : Add a warning when we can't match samplers from a string or …

b5301 · 07 May 10:32 · 1f73301
cuda : remove nrows_x in mul_mat_q_process_tile (#13325)

Signed-off-by: Xiaodong Ye <[email protected]>

b5300 · 07 May 10:11 · 4773d7a
examples : remove infill (#13283)

ggml-ci

b5299 · 07 May 09:16 · 6c7fd67
llama : support tie embedding for chatglm models (#13328)
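With tied embeddings, the output projection reuses the token-embedding matrix instead of a separate lm_head tensor, halving that memory. A minimal sketch of the idea, assuming the usual weight-tying formulation (the function name is illustrative, not llama.cpp's API):

```python
def logits_tied(hidden, embed_matrix):
    """Compute output logits by reusing the input embedding matrix.

    With weight tying, logits[v] is the dot product of the final hidden
    state with token v's embedding row, so no separate output-head
    weight is stored.
    """
    return [sum(h * e for h, e in zip(hidden, row)) for row in embed_matrix]
```
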

b5298 · 06 May 23:10 · 141a908
CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (#13135)

b5297 · 06 May 22:26 · 32916a4
clip : refactor graph builder (#13321)

* mtmd : refactor graph builder

* fix qwen2vl

* clean up siglip cgraph

* pixtral migrated

* move minicpmv to a dedicated build function

* move max_feature_layer to build_llava

* use build_attn for minicpm resampler

* fix windows build

* add comment for batch_size

* also support tinygemma3 test model

* qwen2vl does not use RMS norm

* fix qwen2vl norm (2)

b5296 · 06 May 22:16 · ffc7272
sampling : make top_n_sigma no-op at <=0 or a single candidate (#13345)

b5295 · 06 May 19:17 · 91a86a6
sampling : don't consider -infinity values in top_n_sigma (#13344)
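Taken together, b5296 and b5295 describe two edge-case fixes to the top-n-sigma sampler: it becomes a no-op when n_sigma <= 0 or when only one candidate remains, and -infinity logits are excluded from the statistics. A minimal sketch under the assumption that the filter keeps candidates within n_sigma standard deviations of the maximum logit (function name and exact statistics are assumptions, not llama.cpp's implementation):

```python
import math

def top_n_sigma(logits, n_sigma):
    """Hypothetical top-n-sigma filter over raw logits."""
    # No-op at n_sigma <= 0 or with a single candidate (b5296)
    if n_sigma <= 0 or len(logits) <= 1:
        return logits

    # Exclude -infinity values from the mean/stddev (b5295)
    finite = [x for x in logits if x != float("-inf")]
    if len(finite) <= 1:
        return logits

    mean = sum(finite) / len(finite)
    var = sum((x - mean) ** 2 for x in finite) / len(finite)
    sigma = math.sqrt(var)

    # Mask candidates more than n_sigma stddevs below the max logit
    threshold = max(finite) - n_sigma * sigma
    return [x if x >= threshold else float("-inf") for x in logits]
```
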

b5293 · 06 May 16:31 · 1e333d5
SYCL: Disable reorder optimize by default and stop setting tensor ext…

b5292 · 06 May 14:29 · 2f54e34
llama : fix build_ffn without gate (#13336)

* llama : fix build_ffn without gate

* fix build on windows

* Revert "fix build on windows"

This reverts commit fc420d3c7eef3481d3d2f313fef2757cb33a7c56.
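For context on the build_ffn fix above: many LLaMA-style models use a gated feed-forward block (e.g. SwiGLU), while some architectures have no gate tensor at all, and the graph builder must handle both. A minimal sketch of the two paths, assuming SiLU activation (names and signature are illustrative, not llama.cpp's build_ffn API):

```python
import math

def ffn(x, w_up, w_down, w_gate=None):
    """Feed-forward block with an optional gate tensor.

    With a gate:    out = w_down @ (silu(w_gate @ x) * (w_up @ x))
    Without a gate: out = w_down @ silu(w_up @ x)
    """
    def silu(v):
        return [e / (1.0 + math.exp(-e)) for e in v]

    def matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    up = matvec(w_up, x)
    if w_gate is not None:
        # Gated path: gate activations modulate the up-projection
        hidden = [g * u for g, u in zip(silu(matvec(w_gate, x)), up)]
    else:
        # Ungated path: activation applied directly to the up-projection
        hidden = silu(up)
    return matvec(w_down, hidden)
```
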