RFC: We could do with a DGGML_CUDA_ALL_VARIANTS? #3110

Closed
@peardox


DGGML_CPU_ALL_VARIANTS creates separate libraries for a range of x86_64 variants, so that newer CPU capabilities can be used where they are available.

This, of course, recognises that whisper.cpp will potentially be used on any old junk.

The same is true for CUDA. The default action is to build for the build machine's GPU, but in the same way that DGGML_CPU_ALL_VARIANTS allows for a variety of CPUs, it would be desirable to build CUDA libraries covering all the bases.

If you manually specify CUDA_ARCHITECTURE with lots of variants (e.g. 50 to 120), you end up with a humongous library.

It seems sensible to have a DGGML_CUDA_ALL_VARIANTS flag that writes libraries for the specific targets in the list - e.g.
-DCUDA_ARCHITECTURE="50;52;53;60;61;62;70;72;75;80;86;87;89;90;100;101;120" // Current list according to Wiki - pre-75 next on defunct list for newer CUDA

For example, 120 is the 50x0 series, 89 is 40x0, 86 is 30x0, etc.
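A minimal sketch of how a build script might emulate the proposal from outside CMake, by splitting the list above and emitting one configure command per architecture. It uses CMake's standard `CMAKE_CUDA_ARCHITECTURES` variable and the existing `GGML_CUDA` option; the function name and the per-architecture build directory names are hypothetical, and no CUDA "all variants" flag is assumed to exist.

```python
# Architecture list copied from the issue text.
ARCH_LIST = "50;52;53;60;61;62;70;72;75;80;86;87;89;90;100;101;120"


def configure_commands(arch_list: str, source_dir: str = ".") -> list[list[str]]:
    """Return one `cmake` configure command per CUDA architecture."""
    commands = []
    for arch in arch_list.split(";"):
        arch = arch.strip()
        if not arch:
            continue  # tolerate stray separators such as "50;;52"
        commands.append([
            "cmake",
            "-S", source_dir,
            "-B", f"build-cuda-sm{arch}",  # hypothetical directory naming
            "-DGGML_CUDA=ON",
            f"-DCMAKE_CUDA_ARCHITECTURES={arch}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in configure_commands(ARCH_LIST):
        print(" ".join(cmd))
```

Each resulting build tree would contain a CUDA library for a single architecture, which is the "lots of smaller libraries" layout the proposed flag would produce in one go.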

This would create lots of smaller libraries, allowing a CUDA version of ggml_backend_score to check each one against the card's compute capability.
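The selection step could look roughly like the sketch below, written in the spirit of the score check mentioned above rather than as ggml's actual API. All function names are hypothetical. It assumes each library holds only native code for its one architecture (no PTX), so it applies CUDA's binary compatibility rule: a library is usable when the major version matches the device and its minor version is not higher than the device's.

```python
def parse_arch(arch: str) -> tuple[int, int]:
    """Split a CMake-style architecture number: "86" -> (8, 6), "120" -> (12, 0)."""
    return divmod(int(arch), 10)


def variant_score(variant: str, device: tuple[int, int]) -> int:
    """Return 0 if the variant cannot run on the device, else a higher-is-better score."""
    v_major, v_minor = parse_arch(variant)
    d_major, d_minor = device
    if v_major != d_major or v_minor > d_minor:
        return 0  # assumption: no PTX fallback is embedded in the library
    return int(variant)


def pick_variant(variants: list[str], device: tuple[int, int]):
    """Pick the best-scoring variant for a device, or None if nothing fits."""
    best = max(variants, key=lambda v: variant_score(v, device), default=None)
    if best is None or variant_score(best, device) == 0:
        return None
    return best


if __name__ == "__main__":
    variants = "50;52;53;60;61;62;70;72;75;80;86;87;89;90;100;101;120".split(";")
    print(pick_variant(variants, (8, 9)))   # device with compute capability 8.9
    print(pick_variant(variants, (12, 0)))  # device with compute capability 12.0
```

If the libraries were built with PTX included, the major-version check could be relaxed, at the cost of JIT compilation on first load.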

And a smaller CUDA library means less RAM required (which never hurts).

The only thing that might raise an issue, I guess, is the extremely rare possibility of having multiple cards with different architectures (I chatted to a guy some years ago with exactly this setup).
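For the mixed-architecture case, a loader would have to choose a library per device rather than per machine. A rough sketch under the same assumption as before (single-architecture libraries without PTX, so same major version and a minor version no higher than the device's); the function names are hypothetical:

```python
def runnable(variant: str, device: tuple[int, int]) -> bool:
    """True if a single-architecture library can run on the given device."""
    v_major, v_minor = divmod(int(variant), 10)
    return v_major == device[0] and v_minor <= device[1]


def variants_for_devices(variants: list[str], devices: list[tuple[int, int]]) -> dict:
    """Map each device index to its best usable variant, or None if none fits."""
    chosen = {}
    for index, device in enumerate(devices):
        usable = [v for v in variants if runnable(v, device)]
        chosen[index] = max(usable, key=int, default=None)
    return chosen


if __name__ == "__main__":
    # One 8.6 card and one 12.0 card in the same machine.
    print(variants_for_devices(["75", "86", "89", "120"], [(8, 6), (12, 0)]))
```

When the chosen variants differ between devices, more than one CUDA library would need to be loaded at once; whether the backend loader can handle that is an open question this sketch does not answer.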

Could also split CMAKE_CUDA_ARCHITECTURES I guess
