Comparing changes
base repository: ggml-org/llama.cpp
base: master@{1day}
head repository: ggml-org/llama.cpp
compare: master
- 15 commits
- 28 files changed
- 11 contributors
Commits on Nov 10, 2025
15274c0
1032256 cuda/vulkan : bicubic interpolation (#17022)

  * vulkan : implement upscale with bicubic interpolation
  * cuda : implement upscale with bicubic interpolation
  * tests : add ggml_interpolate with GGML_SCALE_MODE_BICUBIC to backend tests
  * adapt OpenCL backend to not support the OP in that case so tests don't fail
  * print scale mode & flags in test-backend-ops
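The new mode plugs into the existing ggml_interpolate op. A minimal usage sketch, assuming the ggml_interpolate signature in recent ggml.h; GGML_SCALE_MODE_BICUBIC is the value this commit adds, and the surrounding boilerplate is illustrative rather than copied from the commit's tests:

```c
#include "ggml.h"
#include "ggml-cpu.h"
#include <stdio.h>

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16u*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(params);

    // 8x8 single-channel input, upscaled 4x in both spatial dims
    struct ggml_tensor * a = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, 8, 8, 1, 1);
    for (int64_t i = 0; i < ggml_nelements(a); ++i) {
        ((float *) a->data)[i] = (float) i; // arbitrary test pattern
    }

    struct ggml_tensor * up = ggml_interpolate(ctx, a, 32, 32, 1, 1, GGML_SCALE_MODE_BICUBIC);

    struct ggml_cgraph * gf = ggml_new_graph(ctx);
    ggml_build_forward_expand(gf, up);
    ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

    printf("out: %d x %d\n", (int) up->ne[0], (int) up->ne[1]); // 32 x 32
    ggml_free(ctx);
    return 0;
}
```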
9898b57
4b13a68
f914544
df70bed arm64: add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_dot_q6_K_q8_K (#15277)

  * add i8mm route with SVE ggml_vec_dot_q4_K_q8_K and ggml_vec_dot_q6_K_q8_K
  * Surround SVE function with compiler directive
  * fix compile switch
  * fix coding style
  * ggml : fix indent

  Co-authored-by: Georgi Gerganov <[email protected]>
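The "Surround SVE function with compiler directive" bullet is about build hygiene: the i8mm route has to be fenced off so builds for targets without SVE or the int8 matmul extension never see it. A sketch of that guard pattern, with a scalar placeholder body (the real kernels use SVE intrinsics such as svmmla; nothing here is ggml's actual code):

```c
#include <stdint.h>

static int32_t vec_dot_i8(int n, const int8_t * x, const int8_t * y) {
    int32_t sum = 0;
#if defined(__ARM_FEATURE_SVE) && defined(__ARM_FEATURE_MATMUL_INT8)
    // i8mm route: compiled only when the target has both SVE and the int8
    // matmul extension (e.g. -march=armv8.6-a+sve+i8mm), so older CPUs'
    // builds never reach the intrinsics
    for (int i = 0; i < n; ++i) sum += (int32_t) x[i] * y[i]; // placeholder
#else
    // generic scalar fallback
    for (int i = 0; i < n; ++i) sum += (int32_t) x[i] * y[i];
#endif
    return sum;
}
```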
c27efd2
0c74f32 memory: Hybrid context shift (#17009)

  * feat(memory): Only fail partial erasure of recurrent tail

    The recurrent state is always assumed to be the state as of the last
    update from the final token in the sequence. When doing a partial
    erasure, if the range does not include the final token, the erasure can
    be considered a success: any memory used for the sequence prior to the
    final token (which is no memory) has been successfully removed.

    One case this does not address is pruning the cache to scrub sensitive
    data from the context. That would not work for partial removal in the
    middle of an attention cache either, since the KV state is linearly
    dependent and states at later sequence positions would still be derived
    from the removed tokens even once they are no longer cached. The
    semantics of a partial erasure in the middle of the cache are therefore
    "my context is already compressed", not "all trace of the removed
    tokens has been removed".

  * fix(main): Check the output of seq_rm for prefix matching

    Prefix matching explicitly tries to remove the tokens at the end of the
    sequence that do not match. This is the operation that cannot be
    performed on a recurrent cache, because the state is updated in place,
    so if the removal fails the whole cache must be cleared.

  * fix(memory): Fix condition for partial erasure failure if p0 > pos

  * style: Fix extra parens

  * fix(main.cpp): Set n_matching_session_tokens to 0 on cache clear

  #16768 (Branch: HybridContextShift-16768)
  Signed-off-by: Gabe Goodhart <[email protected]>
  Co-authored-by: compilade <[email protected]>
  Co-authored-by: Georgi Gerganov <[email protected]>
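A sketch of the main.cpp side of this change, assuming the llama_memory_* API in recent llama.h: when trimming the non-matching tail of a restored session fails (as it can for recurrent and hybrid caches, whose state is updated in place), fall back to clearing the whole cache and reuse nothing. The helper name is hypothetical:

```c
#include "llama.h"

static size_t restore_prefix(struct llama_context * ctx, llama_pos n_matched) {
    llama_memory_t mem = llama_get_memory(ctx);

    // drop everything in sequence 0 after the matched prefix
    if (!llama_memory_seq_rm(mem, 0, n_matched, -1)) {
        // a recurrent tail cannot be partially erased: clear the whole
        // cache and report that no session tokens are reusable
        llama_memory_clear(mem, /*data=*/true);
        return 0; // i.e. n_matching_session_tokens = 0
    }
    return (size_t) n_matched;
}
```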
85234a4
f117be1
967eb4b ggml-cpu : inspect -march and -mcpu to find the CPU (#16333)

  Signed-off-by: Adrien Gallouët <[email protected]>
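The commit itself works at build time; as a standalone illustration of the idea (the mechanism below is an assumption, not a copy of the commit), one can ask GCC or Clang what "native" resolves to, since -### -E - makes the driver print its fully expanded command line without compiling anything:

```c
#define _POSIX_C_SOURCE 200809L // for POSIX popen/pclose
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE * p = popen("cc -mcpu=native -### -E - </dev/null 2>&1", "r");
    if (!p) {
        return 1;
    }
    char line[4096];
    while (fgets(line, sizeof line, p)) {
        if (strstr(line, "-mcpu=")) {
            fputs(line, stdout); // e.g. "... -mcpu=neoverse-n1 ..."
        }
    }
    return pclose(p);
}
```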
13730c1
395e286 cpu: skip NOPs to avoid barriers (#17133)

  * cpu: skip NOPs to avoid barriers
  * cpu: use ggml_op_is_empty
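The point of the change: layout-only graph nodes carry no work, yet the CPU scheduler can otherwise pay a full thread barrier per node. ggml_op_is_empty is the helper the commit switches to; the list below is an illustrative approximation of what such a check covers, not a copy of it:

```c
#include "ggml.h"
#include <stdbool.h>

static bool op_is_empty_sketch(enum ggml_op op) {
    switch (op) {
        case GGML_OP_NONE:
        case GGML_OP_RESHAPE:   // metadata only: no data is moved
        case GGML_OP_VIEW:
        case GGML_OP_PERMUTE:
        case GGML_OP_TRANSPOSE:
            return true;
        default:
            return false;
    }
}

// in the compute loop, roughly:
//   if (op_is_empty_sketch(node->op)) continue; // no kernel, no barrier
```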
7bef684 models : move build_inp_out_ids outside loop (#17151)

  * move build_inp_out_ids outside loop
  * realign
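A before/after sketch of the hoisting; all names are illustrative stand-ins for llama.cpp's graph-build code, not its actual API:

```c
#include <stdio.h>

struct tensor { int id; };

static struct tensor * build_inp_out_ids(void) {
    static struct tensor t = { 42 };
    printf("built inp_out_ids\n"); // after the change: once per graph, not per layer
    return &t;
}

int main(void) {
    const int n_layer = 4;

    // before: called inside the loop, once per layer
    // after:  hoisted, because the result does not depend on the layer index
    struct tensor * inp_out_ids = build_inp_out_ids();

    for (int il = 0; il < n_layer; ++il) {
        // ... per-layer graph construction would go here ...
        if (il == n_layer - 1) {
            // last layer: gather only the rows whose logits are needed
            printf("layer %d uses inp_out_ids %d\n", il, inp_out_ids->id);
        }
    }
    return 0;
}
```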
ece0f5c opencl: add fastdiv and use it in set_rows, ported from cuda (#17090)

  * opencl: add fastdiv for mm q8_0
  * opencl: use uint4 for fastdiv vals
  * opencl: use fastdiv for set_rows
  * opencl: do not use fastdiv for q8_0 mm
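Fastdiv replaces integer division by a divisor that is constant per kernel launch (hardware integer division is slow on GPUs) with a precomputed "magic" multiplier. A self-contained sketch of the arithmetic as used in ggml's CUDA backend, to the best of my understanding; the uint4 mentioned above presumably packs these values together with the divisor for a modulo path:

```c
#include <stdint.h>
#include <stdio.h>

typedef struct { uint32_t mp; uint32_t l; } fastdiv_vals;

// precompute magic values for dividing by d (assumes 1 <= d <= 2^31)
static fastdiv_vals fastdiv_init(uint32_t d) {
    fastdiv_vals f = { 0, 0 };
    while ((1u << f.l) < d) f.l++; // l = ceil(log2(d))
    // mp = floor(2^32 * (2^l - d) / d) + 1  (round-up magic multiplier)
    f.mp = (uint32_t) ((((uint64_t) 1 << 32) * (((uint64_t) 1 << f.l) - d)) / d + 1);
    return f;
}

// n / d via one 32x32->64-bit multiply plus shifts, no division instruction
static uint32_t fastdiv(uint32_t n, fastdiv_vals f) {
    uint32_t hi = (uint32_t) (((uint64_t) n * f.mp) >> 32); // mulhi(n, mp)
    return (uint32_t) (((uint64_t) hi + n) >> f.l);         // 33-bit add, then shift
}

int main(void) {
    fastdiv_vals f = fastdiv_init(7);
    for (uint32_t n = 0; n < 100; ++n) {
        if (fastdiv(n, f) != n / 7) { printf("mismatch at %u\n", n); return 1; }
    }
    printf("fastdiv ok\n");
    return 0;
}
```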
GitHub could not render this comparison (it may be too large). To see it locally, run:
git diff master@{1day}...master