Pull requests: ggml-org/llama.cpp
Model: Qwen3 Next
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
testing
Everything test related
#16095
opened Sep 18, 2025 by
pwilkin
common: Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS)
testing
Everything test related
#16932
opened Nov 2, 2025 by
hksdpc255
Modern Bert Support
python
python script changes
#15641
opened Aug 28, 2025 by
ryan-mangeno
llama: Attempt to add ModernBert
model
Model specific
python
python script changes
#14014
opened Jun 4, 2025 by
huydt84
add FP8 support to gguf/llama:
build
Compilation issues
examples
ggml
changes relating to the ggml tensor library for machine learning
script
Script related
Tensor Encoding Scheme
https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes
testing
Everything test related
tool: add convertation of text/parquet to custom format
build
Compilation issues
examples
#14622
opened Jul 10, 2025 by
lexasub
imatrix: calculate activation-based statistics for new format (GGUF) imatrices
examples
#14891
opened Jul 26, 2025 by
EAddario
Implementation of a sequence repetition penalty sampler
enhancement
New feature or request
generation quality
Quality of model output
need feedback
Testing and feedback with results are needed
#2593
opened Aug 12, 2023 by
KerfuffleV2
•
Draft
WIP: Add model merge example
demo
Demonstrate some concept or idea, not intended to be merged
help wanted
Needs help from the community
cuda : Add conv2d Implicit GEMM
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#15805
opened Sep 4, 2025 by
bssrdf
[MPI] Add support for per-node options, thread counts, and layer allocations
build
Compilation issues
examples
ggml
changes relating to the ggml tensor library for machine learning
server
#3334
opened Sep 26, 2023 by
AutonomicPerfectionist
•
Draft
2 of 5 tasks
Update gpt2 preprocess and add deepseek coder preprocess
#4070
opened Nov 14, 2023 by
DOGEwbx
Generic Chat templating code with text/json file based config; main chat updated to drive its in-prefix, in-suffix and reverse-prompt from same; chat-apply-template equivalent c-api to allow use by other codes also
enhancement
New feature or request
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
support MiniCPM-V-2
demo
Demonstrate some concept or idea, not intended to be merged
enhancement
New feature or request
examples
python
python script changes
Review Complexity : High
Generally require indepth knowledge of LLMs or GPUs
#6919
opened Apr 26, 2024 by
Achazwl
Layer skipping/self-speculation demo
demo
Demonstrate some concept or idea, not intended to be merged
research 🔬
#3565
opened Oct 10, 2023 by
KerfuffleV2
•
Draft
Add ops needed for new hybrid models: SOFTPLUS, EXPM1, TRI, SOLVE_TRI, CUMSUM
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#17063
opened Nov 6, 2025 by
pwilkin
Server: enable lookup decoding
enhancement
New feature or request
examples
Review Complexity : Medium
Generally require more time to grok but manageable by beginner to medium expertise level
#6828
opened Apr 22, 2024 by
JohannesGaessler
Introduce New Lookup-Table(LUT)-Based Matrix Multiplication Method
ggml
changes relating to the ggml tensor library for machine learning
python
python script changes
Tensor Encoding Scheme
https://github.com/ggerganov/llama.cpp/wiki/Tensor-Encoding-Schemes
#10181
opened Nov 5, 2024 by
QingtaoLi1
2 of 4 tasks
ggml-cuda: Vulkan direct conv 2D ported to CUDA
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16088
opened Sep 18, 2025 by
etasnadi