compare-commits.sh: support both llama-bench and test-backend-ops #14392

yeahdongcn · 2025-06-26T11:26:26Z


This is a follow-up to #14368, adding support for comparing test-backend-ops performance results between two commits.

Testing Done

Generated Tables

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b 1d5f25c53 -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) 1d5f25c53 |   Bandwidth (GB/s) xd/compare |   Speedup |
|:----------|:------------|:----------------------------------------|-----------------------------:|------------------------------:|----------:|
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                        28.42 |                         28.45 |      1.00 |
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                        86.60 |                         97.14 |      1.12 |
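
The Speedup column in the table above appears to be the compare value divided by the baseline value. That reading is an inference from the printed figures, not something confirmed from the script itself. A minimal sketch that reproduces the two rows, with `speedup` as a hypothetical helper name:

```python
def speedup(baseline: float, compare: float) -> float:
    """Ratio of the compare value to the baseline value (above 1.0 means faster)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return compare / baseline


# Bandwidth figures (GB/s) copied from the table above.
rows = [
    ("nr=[1,1,1,1]", 28.42, 28.45),
    ("nr=[1,512,1,1]", 86.60, 97.14),
]

for params, base, comp in rows:
    # Two decimals, matching the table's formatting.
    print(f"{params}: {speedup(base, comp):.2f}")
```

Under that assumption the second row works out to 97.14 / 86.60 ≈ 1.12, which agrees with the reported value.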

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b ecd7fdb4c -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite
| Backend   | Operation   | Parameters                                                                       |   GFLOPS xd/test-backend-ops_sql |   GFLOPS xd/test-backend-ops_sql |   Speedup |
|:----------|:------------|:---------------------------------------------------------------------------------|---------------------------------:|---------------------------------:|----------:|
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=128,n=1,k=16416,bs=[8,1],nr=[4,1],per=[0,1,2,3],v=1      |                           127.90 |                           127.90 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=16416,n=1,k=128,bs=[8,1],nr=[4,1],per=[0,2,1,3],v=0      |                            33.98 |                            33.98 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=1,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                            57.91 |                            57.91 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=2,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                           115.00 |                           115.00 |      1.00 |

Generated Plot

Full Logs

root@deccddc39743:/ws# CMAKE_OPTS="-DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21" ./scripts/compare-commits.sh ac36da5ee 0ff71998b  test-backend-ops -o ADD
+ commit1=ac36da5ee
+ commit2=0ff71998b
+ tool=test-backend-ops
+ additional_args='-o ADD'
+ '[' test-backend-ops '!=' llama-bench ']'
+ '[' test-backend-ops '!=' test-backend-ops ']'
+ ./scripts/compare-llama-bench.py --check
+ '[' test-backend-ops = llama-bench ']'
+ db_file=test-backend-ops.sqlite
+ target=test-backend-ops
+ run_args='perf --output sql -o ADD'
+ rm -f test-backend-ops.sqlite
+ '[' -n '' ']'
+ dir=build-bench
+ git checkout ac36da5ee
HEAD is now at ac36da5ee a
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ git checkout 0ff71998b
Previous HEAD position was ac36da5ee a
HEAD is now at 0ff71998b musa: apply mublas API changes
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ ./scripts/compare-llama-bench.py -b ac36da5ee -c 0ff71998b --tool test-backend-ops -i test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) xd/compare-commits |   Bandwidth (GB/s) 0ff71998b |   Speedup |
|:----------|:------------|:----------------------------------------|--------------------------------------:|-----------------------------:|----------:|
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                                  4.53 |                         4.54 |      1.00 |
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                                247.02 |                       247.10 |      1.00 |
root@deccddc39743:/ws#
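
For readers skimming the trace, the sequence of commands it shows for the test-backend-ops path can be summarized as a plan. The sketch below only builds the command strings and runs nothing. The function name `plan_test_backend_ops` and its parameters are hypothetical, and the pipe into `sqlite3` is inferred from the two adjacent trace lines rather than taken from the script source.

```python
def plan_test_backend_ops(commit1, commit2, extra_args="", cmake_opts="", jobs=12):
    """Return the shell commands the trace shows, in order, without running them."""
    tool = "test-backend-ops"
    db_file = f"{tool}.sqlite"
    build_dir = "build-bench"
    # The trace shows run_args='perf --output sql -o ADD' for extra_args='-o ADD'.
    run_args = f"perf --output sql {extra_args}".strip()

    steps = [
        "./scripts/compare-llama-bench.py --check",
        f"rm -f {db_file}",
    ]
    for commit in (commit1, commit2):
        steps += [
            f"git checkout {commit}",
            f"rm -fr {build_dir}",
            f"cmake -B {build_dir} -S . {cmake_opts}".strip(),
            f"cmake --build {build_dir} -t {tool} -j {jobs}",
            # Assumption: the benchmark's SQL output is piped into sqlite3.
            f"{build_dir}/bin/{tool} {run_args} | sqlite3 {db_file}",
        ]
    steps.append(
        f"./scripts/compare-llama-bench.py -b {commit1} -c {commit2} "
        f"--tool {tool} -i {db_file}"
    )
    return steps


steps = plan_test_backend_ops(
    "ac36da5ee",
    "0ff71998b",
    extra_args="-o ADD",
    cmake_opts="-DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21",
)
for step in steps:
    print(step)
```

Both commits write into the same SQLite file, which is why the final comparison step needs only one `-i` argument.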

Copilot

Pull Request Overview

This PR adds support for comparing performance results from both llama-bench and test-backend-ops by introducing tool-specific database schemas, CLI argument parsing, and formatting functions. Key changes include:

- Refactoring database field and key property definitions to support both tools.
- Updating table queries and input file handling based on a new --tool argument.
- Enhancing the CLI script (compare-commits.sh) to allow selection of the tool and additional arguments.
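
To illustrate the second point, here is a minimal sketch of how a `--tool` argument could select the default input file. It is not the code from this PR: `parse_cli` and `default_db_path` are hypothetical names, the default of `llama-bench` is an assumption, and so is the rule that the database is named after the tool (only `test-backend-ops.sqlite` appears in the logs). The `-b`, `-c`, `-i`, and `--tool` options are the ones used in the commands above.

```python
import argparse

KNOWN_TOOLS = ("llama-bench", "test-backend-ops")


def default_db_path(tool):
    # Assumption: the database is named after the tool, matching the
    # test-backend-ops.sqlite file seen in the logs.
    return f"{tool}.sqlite"


def parse_cli(argv):
    parser = argparse.ArgumentParser(
        description="Compare benchmark results of two commits (sketch)."
    )
    parser.add_argument("-b", dest="baseline", required=True, help="baseline commit")
    parser.add_argument("-c", dest="compare", required=True, help="commit to compare")
    parser.add_argument(
        "--tool",
        choices=KNOWN_TOOLS,
        default="llama-bench",  # assumed default, not confirmed by this PR
        help="which benchmark tool produced the results",
    )
    parser.add_argument("-i", dest="input", default=None, help="SQLite input file")
    args = parser.parse_args(argv)
    if args.input is None:
        # Fall back to a tool-specific file when no input is given.
        args.input = default_db_path(args.tool)
    return args


args = parse_cli(["-b", "1d5f25c53", "-c", "ecd7fdb4c", "--tool", "test-backend-ops"])
print(args.tool, args.input)
```

Restricting `--tool` with `choices` makes argparse reject unknown tools up front, which mirrors the tool check visible at the top of the compare-commits.sh trace.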

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

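| File                           | Description                                                                           |
|:-------------------------------|:--------------------------------------------------------------------------------------|
| scripts/compare-llama-bench.py | Adjusts SQLite table creation, queries, and result formatting for dual tool support. |
| scripts/compare-commits.sh     | Updates argument parsing and build/run logic to handle multiple tools.               |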


yeahdongcn · 2025-07-05T04:14:21Z

Hi @JohannesGaessler @slaren @ggerganov I’ve merged #14368 into master. Could you please continue reviewing this one when you have a moment? Thanks!

Signed-off-by: Xiaodong Ye <[email protected]>

yeahdongcn requested a review from Copilot June 26, 2025 11:28

Copilot AI reviewed Jun 26, 2025


github-actions bot added the script (Script related) and python (python script changes) labels Jun 26, 2025

yeahdongcn requested review from ggerganov, slaren and JohannesGaessler June 26, 2025 11:32

yeahdongcn marked this pull request as ready for review June 26, 2025 11:33

yeahdongcn mentioned this pull request Jul 1, 2025

test-backend-ops: add support for specifying output format #14368

Merged

compare-commits.sh: support both llama-bench and test-backend-ops

b5ea15f

Signed-off-by: Xiaodong Ye <[email protected]>

yeahdongcn force-pushed the xd/compare-commits branch from 5c1951b to b5ea15f Compare July 5, 2025 04:15

yeahdongcn added 2 commits July 7, 2025 11:30

Speed up the build by specifying -j 12

c5ee92d

Signed-off-by: Xiaodong Ye <[email protected]>

Remove build_number from test-backend-ops db

a43da96

Signed-off-by: Xiaodong Ye <[email protected]>
