GGML_CPU_ALL_VARIANTS creates separate libraries for a load of different x86_64 variants to cater for newer capabilities.
This of course recognises that whisper.cpp will potentially be run on any old junk.
The same is true for CUDA. The default is to build for the build machine's GPU, but in the same way that GGML_CPU_ALL_VARIANTS allows for a variety of CPUs, it would be desirable to build CUDA libraries covering all the bases.
If you manually set CMAKE_CUDA_ARCHITECTURES to lots of variants (e.g. 50 through 120) you end up with a humongous library.
It seems sensible to have a GGML_CUDA_ALL_VARIANTS flag that builds a separate library for each target in the list - e.g.
-DCMAKE_CUDA_ARCHITECTURES="50;52;53;60;61;62;70;72;75;80;86;87;89;90;100;101;120" // current list according to the Wiki - pre-75 is next on the deprecation list for newer CUDA
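As a rough illustration only, such a flag could mirror the GGML_CPU_ALL_VARIANTS approach by looping over the architecture list and emitting one backend library per entry. Note the GGML_CUDA_ALL_VARIANTS option and the GGML_CUDA_SOURCES variable here are hypothetical names for this sketch; ggml's actual backend CMake is structured differently:

```cmake
# Hypothetical sketch - GGML_CUDA_ALL_VARIANTS does not exist today.
# One shared library per requested CUDA architecture instead of one
# fat binary compiled for all of them.
if (GGML_CUDA_ALL_VARIANTS)
    foreach(arch IN LISTS CMAKE_CUDA_ARCHITECTURES)
        # GGML_CUDA_SOURCES is a stand-in for the backend's source list.
        add_library(ggml-cuda-sm${arch} SHARED ${GGML_CUDA_SOURCES})
        set_target_properties(ggml-cuda-sm${arch} PROPERTIES
            CUDA_ARCHITECTURES "${arch}")
    endforeach()
endif()
```

Each resulting `ggml-cuda-sm<arch>` library would then carry device code for exactly one architecture, which is what keeps the individual binaries small.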
For example, 120 is the 50x0 series, 89 is 40x0, 86 is 30x0, etc.
This would create lots of smaller libraries, allowing a ggml_backend_score-style check against the card's compute capability to pick the right one at load time.
And a smaller CUDA library means less required RAM (which never hurts).
The only thing that raises an issue, I guess, would be the extremely rare case of multiple cards with different architectures in one machine (I chatted to a guy some years ago with exactly this setup).
Could also split CMAKE_CUDA_ARCHITECTURES I guess