Misc. bug: vulkan: performance regression after fd123cfead49eb32e386e26b8ef7a6d41554dda5 #12553

Closed
rhjdvsgsgks opened this issue Mar 24, 2025 · 2 comments

@rhjdvsgsgks
Contributor

rhjdvsgsgks commented Mar 24, 2025

Name and Version

fd123cf

Operating systems

Linux

Which llama.cpp modules do you know to be affected?

vulkan backend

Command line

Problem description & steps to reproduce

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| gemma3 12B Q5_K - Medium | 8.09 GiB | 11.77 B | Vulkan | 99 | pp512 | 61.69 ± 0.04 |
| gemma3 12B Q5_K - Medium | 8.09 GiB | 11.77 B | Vulkan | 99 | tg128 | 21.87 ± 0.01 |

build: a53f7f7 (4908)

| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| gemma3 12B Q5_K - Medium | 8.09 GiB | 11.77 B | Vulkan | 99 | pp512 | 59.69 ± 0.05 |
| gemma3 12B Q5_K - Medium | 8.09 GiB | 11.77 B | Vulkan | 99 | tg128 | 21.00 ± 0.25 |

build: fd123cf (4909)
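
To put the two runs side by side, the relative slowdown can be worked out from the t/s columns. This is only a sketch: the helper name `slowdown_pct` is made up for illustration, and the inputs are the mean values copied from the two tables (the ± spread is ignored).

```python
def slowdown_pct(before: float, after: float) -> float:
    """Relative drop in throughput, as a percentage of the earlier value."""
    return (before - after) / before * 100.0

# Mean t/s copied from the two llama-bench tables in this report.
results = {
    "pp512": (61.69, 59.69),  # a53f7f7 (4908) vs fd123cf (4909)
    "tg128": (21.87, 21.00),
}

for test, (before, after) in results.items():
    print(f"{test}: {slowdown_pct(before, after):.2f}% slower")
# pp512: 3.24% slower
# tg128: 3.98% slower
```

Both drops are larger than the reported ± spread of either run, so they are probably not run-to-run noise alone.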

First Bad Commit

fd123cf

Relevant log output

@rhjdvsgsgks rhjdvsgsgks changed the title from "Misc. bug: performance regression after fd123cfead49eb32e386e26b8ef7a6d41554dda5" to "Misc. bug: vulkan: performance regression after fd123cfead49eb32e386e26b8ef7a6d41554dda5" on Mar 24, 2025
@0cc4m
Collaborator

0cc4m commented Mar 25, 2025

What GPU and OS are you using?

This is a little bit of a damned if you do, damned if you don't situation. Defaulting to smaller allocations fixes a number of OOM crashes due to fragmentation or driver problems, many of which have been reported over the last year. If it reduces performance slightly, that's regrettable, but I can't just go back to the old behaviour.

Please check whether setting GGML_VK_SUBALLOCATION_BLOCK_SIZE=2147483648 already fixes your performance regression. If it does not, you can go back to the old behaviour and performance by setting GGML_VK_SUBALLOCATION_BLOCK_SIZE=4294967296.
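
For reference, the two values above are 2 GiB and 4 GiB expressed in bytes. The following is a minimal sketch of how one might set the variable for a benchmark run. The helper name `env_with_block_size` is hypothetical, and the binary and model paths in the commented-out command are placeholders, not taken from this report.

```python
import os

GIB = 1024 ** 3  # bytes per GiB

def env_with_block_size(gib: int) -> dict:
    """Copy the current environment and set the suballocation block size in bytes."""
    env = dict(os.environ)
    env["GGML_VK_SUBALLOCATION_BLOCK_SIZE"] = str(gib * GIB)
    return env

for gib in (2, 4):
    env = env_with_block_size(gib)
    value = env["GGML_VK_SUBALLOCATION_BLOCK_SIZE"]
    print(f"{gib} GiB -> GGML_VK_SUBALLOCATION_BLOCK_SIZE={value}")
    # To benchmark with this setting (after `import subprocess`; paths are placeholders):
    # subprocess.run(["./llama-bench", "-m", "model.gguf", "-ngl", "99"], env=env, check=True)
# 2 GiB -> GGML_VK_SUBALLOCATION_BLOCK_SIZE=2147483648
# 4 GiB -> GGML_VK_SUBALLOCATION_BLOCK_SIZE=4294967296
```

Trying the 2 GiB value first matches the order suggested above: if it already restores the earlier numbers, the larger block size is not needed.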

@github-actions github-actions bot added the stale label Apr 25, 2025
Contributor

github-actions bot commented May 9, 2025

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed May 9, 2025