Pull requests: ggml-org/llama.cpp

Pull requests list

HIP: RDNA4 tensor core support for MMF (labels: ggml, Nvidia GPU)
#17077 opened Nov 7, 2025 by zhang-hui-yulo, updated Nov 11, 2025
vulkan: change graph_compute to be async and enable get_tensor_async (labels: ggml, Vulkan)
#17158 opened Nov 10, 2025 by jeffbolznv, updated Nov 11, 2025
convert : register UMT5Model architecture for T5 conversion (labels: python)
#17160 opened Nov 11, 2025 by levkropp, updated Nov 11, 2025
llama.android : Rewrite Android binding (labels: android, documentation, examples, ggml)
#17152 opened Nov 10, 2025 by hanyin-arm, updated Nov 11, 2025
sycl: flash-attention implementation (labels: ggml, SYCL)
#16969 opened Nov 3, 2025 by ye-NX, updated Nov 11, 2025
cuda : Add conv2d Implicit GEMM (labels: ggml, Nvidia GPU, testing)
#15805 opened Sep 4, 2025 by bssrdf, updated Nov 11, 2025
server: implement GLM-style MTP (labels: examples, hot, server)
#15225 opened Aug 11, 2025 by F1LM1 (draft), updated Nov 11, 2025
Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM (labels: documentation, ggml, Nvidia GPU, testing)
#17063 opened Nov 6, 2025 by pwilkin, updated Nov 11, 2025
HIP: WMMA-MMQ kernels for RDNA 4 (labels: ggml, Nvidia GPU)
#17156 opened Nov 10, 2025 by jiachengjason (draft), updated Nov 11, 2025
hexagon: various Op fixes (labels: ggml)
#17135 opened Nov 10, 2025 by max-krasnyansky, updated Nov 10, 2025
Implement SparseK Attention mechanism — new GGML operator with CPU backend (GPU planned next) (labels: ggml, testing)
#16817 opened Oct 28, 2025 by yael-works, updated Nov 10, 2025
CUDA: add implicit conv3d (labels: ggml, Nvidia GPU, testing)
#16948 opened Nov 2, 2025 by bssrdf, updated Nov 10, 2025
common : implement parser combinators for chat parsing [WIP] (labels: testing)
#17136 opened Nov 10, 2025 by aldehir (draft), updated Nov 10, 2025
5 of 9 tasks
vendor: split httplib to cpp/h files (labels: build, examples, python, script, server)
#17150 opened Nov 10, 2025 by ngxson, updated Nov 10, 2025
rpc : reuse compute graphs (labels: ggml)
#15405 opened Aug 18, 2025 by rgerganov, updated Nov 10, 2025
CPU SIMD and pipeline optimizations across vec/mmq/ops/kv-cache/repack (labels: ggml)
#17113 opened Nov 8, 2025 by NoahOksuz, updated Nov 10, 2025
ggml-cpu: handle 3d tensors in repack mat_mul (labels: ggml)
#17030 opened Nov 5, 2025 by Alcpz, updated Nov 10, 2025
Install rpc-server when GGML_RPC is ON. (labels: devops, examples, nix)
#17149 opened Nov 10, 2025 by nbp, updated Nov 10, 2025
rpc : fix alloc size logic (labels: Apple Metal, ggml)
#17116 opened Nov 9, 2025 by ggerganov, updated Nov 10, 2025
2 tasks
llama-cli: add support for reasoning (labels: examples)
#16603 opened Oct 16, 2025 by bandoti, updated Nov 10, 2025
add version to all shared object files (labels: examples, ggml)
#17091 opened Nov 7, 2025 by furrysalamander, updated Nov 10, 2025
[WIP] Rpc split row (labels: Apple Metal, examples, ggml)
#16020 opened Sep 16, 2025 by LeaveNhA, updated Nov 10, 2025
ci: add Arm-hosted Graviton4 runner (labels: devops)
#17021 opened Nov 5, 2025 by sudhiarm, updated Nov 10, 2025