Skip to content

sycl: addressing non-contiguous src1 mul_mats (nc and batched) #13343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 8, 2025

Conversation

Alcpz
Copy link
Collaborator

@Alcpz Alcpz commented May 6, 2025

#13308 disabled the mul_mats for non-contiguous matrices since we were producing wrong results.

This PR fixes the non-contiguous mulmats based on #13155.
I had to disable the path for ggml_sycl_mul_mat_vec_nc since that doesn't deal with non-contiguous src1.

model / test size params backend ngl sm mmap t/s (27aa259) t/s (5215b91) t/s (e317791)
qwen2 1.5B Q4_0 · pp512 1013.62 MiB 1.78 B SYCL 99 none 0 6309.62 ± 33.92 2381.98 ± 33.28 6011.69 ± 14.02
qwen2 1.5B Q4_0 · tg128 1013.62 MiB 1.78 B SYCL 99 none 0 105.86 ± 0.50 102.06 ± 2.82 102.55 ± 0.71
llama 7B Q4_0 · pp512 3.57 GiB 6.74 B SYCL 99 none 0 1649.66 ± 2.95 492.51 ± 16.00 1622.15 ± 2.65
llama 7B Q4_0 · tg128 3.57 GiB 6.74 B SYCL 99 none 0 40.79 ± 0.28 40.76 ± 0.16 40.39 ± 0.14
phi3 3B Q4_0 · pp512 2.03 GiB 3.82 B SYCL 99 none 0 2465.55 ± 5.34 588.31 ± 10.27 2404.85 ± 4.10
phi3 3B Q4_0 · tg128 2.03 GiB 3.82 B SYCL 99 none 0 61.67 ± 0.70 62.00 ± 0.62 62.50 ± 0.64

27aa259 broken mul_mat performance (Before disabling non-contiguous src1)
5215b91 the current performance
e317791 this PR

We have lost a little bit of performance due to disabling ggml_sycl_mul_mat_vec_nc, but I think it's better to rethink the whole mul_mat dispatch and refactor matrix multiplications.

I'd prefer to address non-critical concerns in a different PR due to the massive performance regression we have (assuming I didn't mess up something along the way).

I am not entirely sure if the performance loss in qwen2  tg128 is noise or caused by this PR

@github-actions github-actions bot added ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels May 6, 2025
Copy link
Collaborator

@s-Nick s-Nick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Alcpz Alcpz merged commit 8733e0c into ggml-org:master May 8, 2025
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants