GGML_CPU_ALL_VARIANTS creates separate libraries for a load of different x86_64 variants to cater for newer capabilities.
This of course recognises that whisper.cpp will potentially be run on any old junk.
The same is true for CUDA. The default is to build for the build machine's GPU, but in the same way that GGML_CPU_ALL_VARIANTS allows for a variety of CPUs, it would be desirable to build CUDA libraries covering all the bases.
If you manually set CMAKE_CUDA_ARCHITECTURES to lots of variants (e.g. 50 through 120) you end up with a humongous library.
It seems sensible to have a GGML_CUDA_ALL_VARIANTS flag that builds a separate library for each target in the list - e.g.
-DCMAKE_CUDA_ARCHITECTURES="50;52;53;60;61;62;70;72;75;80;86;87;89;90;100;101;120" // current list according to the Wiki - pre-75 is next on the deprecation list for newer CUDA
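As a rough illustration only, such a flag could mirror the GGML_CPU_ALL_VARIANTS approach by looping over the architecture list and emitting one backend library per entry. Note the GGML_CUDA_ALL_VARIANTS option and the GGML_CUDA_SOURCES variable here are hypothetical names for this sketch; ggml's actual backend CMake is structured differently:

```cmake
# Hypothetical sketch - GGML_CUDA_ALL_VARIANTS does not exist today.
# One shared library per requested CUDA architecture instead of one
# fat binary compiled for all of them.
if (GGML_CUDA_ALL_VARIANTS)
    foreach(arch IN LISTS CMAKE_CUDA_ARCHITECTURES)
        # GGML_CUDA_SOURCES is a stand-in for the backend's source list.
        add_library(ggml-cuda-sm${arch} SHARED ${GGML_CUDA_SOURCES})
        set_target_properties(ggml-cuda-sm${arch} PROPERTIES
            CUDA_ARCHITECTURES "${arch}")
    endforeach()
endif()
```

Each resulting `ggml-cuda-sm<arch>` library would then carry device code for exactly one architecture, which is what keeps the individual binaries small.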
For example, 120 is the 50x0 series, 89 is 40x0, 86 is 30x0, etc.
This would create lots of smaller libraries, allowing a ggml_backend_score-style check against the card's compute capability to pick the right one at load time.
And a smaller CUDA library means less required RAM (which never hurts).
The only thing that raises an issue, I guess, would be the extremely rare case of multiple cards with different architectures in one machine (I chatted to a guy some years ago with exactly this setup).
Could also split CMAKE_CUDA_ARCHITECTURES I guess