ggml : various fixes #1450


Merged 1 commit on May 14, 2023

Conversation

ggerganov
Member

The ggml_rope() fixes are irrelevant for LLaMA since n_rot == (n_embd / n_head), but they make a difference for other models like GPT-J and GPT-NeoX where n_rot < (n_embd / n_head). I'm still not sure if this is the correct implementation, especially for the GPT-NeoX mode, but the results seem somewhat better than before.

The non-inplace, multi-threaded ggml_diag_mask_inf() was broken in #1428. Again, this is irrelevant for LLaMA since the forward pass uses ggml_diag_mask_inf_inplace(). Might be relevant to @xaedes

The "scratch buffers" fix might be relevant for LLaMA. See the new ggml_scratch_save() and ggml_scratch_load() functions and their usage in ggml.c: https://github.com/ggerganov/llama.cpp/blob/fixes/ggml.c#LL3925C1-L3939C1
The scratch buffers are a mechanism for reusing memory from previous ops once it is no longer needed. The current way of using them is manual and very error-prone. I will hopefully come up with something better in the future.
More info here: ggml-org/whisper.cpp#431

- `ggml_rope()`
- `ggml_diag_mask_inf()` multi-threaded
- compatibility with scratch buffers