Comparing changes

base repository: ggml-org/llama.cpp
base: master@{1day}
head repository: ggml-org/llama.cpp
compare: master
  • 14 commits
  • 77 files changed
  • 11 contributors

Commits on May 9, 2025

  1. server : (webui) rename has_multimodal --> modalities (#13393)

    * server : (webui) rename has_multimodal --> modalities
    
    * allow converting SVG to PNG
    
    * less complicated code
    ngxson authored May 9, 2025
    Commit d9c4acc
  2. vulkan: Allow up to 4096 elements for mul_mat_id row_ids (#13326)

    This assert fired running Qwen_Qwen3-30B-A3B-Q2_K.gguf:
    
    GGML_ASSERT(nei0 * nei1 <= 3072);
    
    The tensor is 8 x 512. Increase this array size to accommodate.
    jeffbolznv authored May 9, 2025
    Commit 02115dc
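The arithmetic behind this assert can be checked directly. The sketch below is only an illustration in Python, not the Vulkan backend code; the function name `row_ids_fit` is hypothetical, and the limits 3072 and 4096 are taken from the commit message and title above.

```python
def row_ids_fit(nei0: int, nei1: int, limit: int) -> bool:
    """Return True when an nei0 x nei1 ids tensor fits in `limit` row_ids slots."""
    return nei0 * nei1 <= limit


# The tensor reported in the commit message is 8 x 512.
nei0, nei1 = 8, 512
print(nei0 * nei1)                    # 4096
print(row_ids_fit(nei0, nei1, 3072))  # False: the old limit is exceeded
print(row_ids_fit(nei0, nei1, 4096))  # True: the raised limit accommodates it
```

An 8 x 512 tensor needs exactly 4096 entries, which is why the old bound of 3072 was too small for this model.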
  3. rpc : add rpc_msg_set_tensor_hash_req (#13353)

    * rpc : add rpc_msg_set_tensor_hash_req
    
    Use a dedicated struct for the request of RPC_CMD_SET_TENSOR_HASH which
    makes the code cleaner.
    
    * fix
    rgerganov authored May 9, 2025
    Commit b486ba0
  4. llama : one-off chat template fix for Mistral-Small-2503 (#13398)

    * llama : one-off chat template fix for Mistral-Small-2503
    
    * update readme
    
    * add mistral-v7-tekken
    ngxson authored May 9, 2025
    Commit 3f96aef
  5. mtmd : fix batch_view for m-rope (#13397)

    * mtmd : fix batch_view for m-rope
    
    * nits : fix comment
    ngxson authored May 9, 2025
    Commit 2189fd3
  6. Commit 0527771
  7. imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (#13389)
    
    * Add --parse-special for enabling parsing of special tokens in imatrix calculation
    
    * whitespace
    bartowski1182 authored May 9, 2025
    Commit efb8b47
  8. Commit 5c86c9e
  9. llama : do not crash if there is no CPU backend (#13395)

    * llama : do not crash if there is no CPU backend
    
    * add checks to examples
    slaren authored May 9, 2025
    Commit 27ebfca
  10. CUDA: FA support for Deepseek (Ampere or newer) (#13306)

    * CUDA: FA support for Deepseek (Ampere or newer)
    
    * do loop unrolling via C++ template
    JohannesGaessler authored May 9, 2025
    Commit 0cf6725
  11. Commit 611aa91
  12. sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (#12858)

    * sycl : Implemented reorder Q4_0 mmvq
    
    Signed-off-by: Alberto Cabrera
    
    * sycl : Fixed mmvq being called when reorder is disabled
    
    * sycl : Improved comments in the quants header
    
    Signed-off-by: Alberto Cabrera
    
    * Use static_assert
    
    * safe_div -> ceil_div
    
    * Clarify qi comment
    
    * change the reorder tensor from init to execute OP
    
    * dbg
    
    * Undo changes to test-backend-ops
    
    * Refactor changes on top of q4_0 reorder fix
    
    * Missing Reverts
    
    * Refactored opt_for_reorder logic to simplify code path
    
    * Explicit inlining and unroll
    
    * Renamed mul_mat_algo enum for consistency
    
    ---------
    
    Signed-off-by: Alberto Cabrera
    Co-authored-by: romain.biessy
    Alcpz and Rbiessy authored May 9, 2025
    Commit 17512a9
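The `safe_div -> ceil_div` rename in this commit points at rounding-up integer division, which is commonly used to work out how many blocks or work-groups are needed to cover a number of elements. A minimal sketch in Python, assuming the conventional definition; the SYCL helper's actual signature is not shown in this commit list, and the sizes below are made-up examples.

```python
def ceil_div(m: int, n: int) -> int:
    """Integer division rounded up, for non-negative m and positive n."""
    if m < 0 or n <= 0:
        raise ValueError("expected m >= 0 and n > 0")
    return (m + n - 1) // n


# Hypothetical sizes: how many groups of 32 are needed to cover each count.
print(ceil_div(4096, 32))  # 128
print(ceil_div(4097, 32))  # 129
print(ceil_div(0, 32))     # 0
```

The name `ceil_div` states the rounding direction explicitly, which may be the reason for the rename.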
  13. server : vision support via libmtmd (#12898)

    * server : (experimental) vision support via libmtmd
    
    * mtmd : add more api around mtmd_image_tokens
    
    * mtmd : add more api around mtmd_image_tokens
    
    * mtmd : ability to calc image hash
    
    * shared_ptr for mtmd_image_tokens
    
    * move hash to user-define ID (fixed)
    
    * abstract out the batch management
    
    * small fix
    
    * refactor logic adding tokens to batch
    
    * implement hashing image
    
    * use FNV hash, now hash bitmap instead of file data
    
    * allow decoding image embedding to be split into batches
    
    * rm whitespace
    
    * disable some features when mtmd is on
    
    * fix --no-mmproj-offload
    
    * mtmd_context_params no timings
    
    * refactor server_inp to server_tokens
    
    * fix the failing test case
    
    * init
    
    * wip
    
    * working version
    
    * add mtmd::bitmaps
    
    * add test target
    
    * rm redundant define
    
    * test: mtmd_input_chunks_free
    
    * rm outdated comment
    
    * fix merging issue
    
    * explicitly create mtmd::input_chunks
    
    * mtmd_input_chunk_copy
    
    * add clone()
    
    * improve server_input struct
    
    * clip :  fix confused naming ffn_up and ffn_down
    
    * rm ffn_i/o/g naming
    
    * rename n_embd, n_ff
    
    * small fix
    
    * no check n_ff
    
    * fix detokenize
    
    * add const to various places
    
    * add warning about breaking changes
    
    * add c api
    
    * helper: use mtmd_image_tokens_get_n_pos
    
    * fix ctx_shift
    
    * fix name shadowing
    
    * more strict condition
    
    * support remote image_url
    
    * remote image_url log
    
    * add CI test
    
    * do not log base64
    
    * add "has_multimodal" to /props
    
    * remove dangling image
    
    * speculative: use slot.cache_tokens.insert
    
    * Apply suggestions from code review
    
    Co-authored-by: Georgi Gerganov
    
    * rm can_be_detokenized
    
    * on prmpt processing done, assert cache_tokens.size
    
    * handle_completions_impl returns void
    
    * adapt the new web ui
    
    * update docs and hot topics
    
    * rm assert
    
    * small fix (2)
    
    ---------
    
    Co-authored-by: Georgi Gerganov
    ngxson and ggerganov authored May 9, 2025
    Commit 33eff40
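The step "use FNV hash, now hash bitmap instead of file data" in this commit suggests identifying an image by a hash of its decoded pixels rather than of the encoded file. A minimal sketch of that idea in Python, assuming the 64-bit FNV-1a variant (the commit list does not say which variant or width is used); `fnv1a_64` and the sample bytes are illustrative and are not the server's code.

```python
FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV_PRIME_64 = 0x100000001B3              # standard 64-bit FNV prime


def fnv1a_64(data: bytes) -> int:
    """Compute the 64-bit FNV-1a hash of a byte string."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte                                    # mix in the next byte
        h = (h * FNV_PRIME_64) & 0xFFFFFFFFFFFFFFFF  # multiply, keep 64 bits
    return h


# Hypothetical 2-pixel RGB bitmap; a real caller would pass decoded pixel data.
bitmap = bytes([255, 0, 0, 0, 255, 0])
image_id = f"{fnv1a_64(bitmap):016x}"
print(image_id)
```

Hashing the decoded bitmap means two files that decode to the same pixels map to the same ID, whatever their encoding.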
  14. Commit 7c28a74