An example of writing a C++/CUDA extension for PyTorch. See
here for the accompanying tutorial.
This repo demonstrates how to write an example extension_cpp.ops.mymuladd
custom op that has both custom CPU and CUDA kernels.
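
As a rough illustration of how such an op might be checked, the sketch below compares it against a plain reference. It assumes, going only by the op's name, that `mymuladd(a, b, c)` computes the elementwise `a * b + c` with `c` a Python float. The helper names `reference_muladd` and `check_against_extension` are hypothetical and not part of this repo.

```python
def reference_muladd(a, b, c):
    """Reference result: elementwise a * b + c (assumed semantics of mymuladd)."""
    return a * b + c


def check_against_extension(device="cpu"):
    """Compare the custom op with the reference on random inputs.

    Imports are local so this file can be loaded even when torch or the
    built extension is not installed.
    """
    import torch
    import extension_cpp

    a = torch.randn(8, device=device)
    b = torch.randn(8, device=device)
    c = 0.5  # assumed to be a plain Python float
    result = extension_cpp.ops.mymuladd(a, b, c)
    torch.testing.assert_close(result, reference_muladd(a, b, c))
```

Once the extension is built, calling `check_against_extension("cpu")` or `check_against_extension("cuda")` would exercise the CPU or CUDA kernel respectively, provided the assumed signature matches.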
The examples in this repo work with PyTorch 2.4+.
To get it to work on Axon with an A40 GPU, I performed the following steps:
conda create -n extension-cpp python=3.11
pip install -r requirements.txt
conda install nvidia/label/cuda-12.4.1::cuda-toolkit
ml gcc/10.4
export CPATH=/home/hc3190/.conda/envs/extension-cpp/targets/x86_64-linux/include/:$CPATH
pip install .
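
If the build fails, a quick sanity check of the environment can help. The sketch below assumes the `CPATH` export is there so the compiler can find CUDA headers such as `cuda_runtime.h`; that header name and the helper `find_header` are illustrative assumptions, not something this repo specifies.

```python
import os
import shutil


def find_header(header, cpath=None):
    """Return the first CPATH directory that contains `header`, or None."""
    if cpath is None:
        cpath = os.environ.get("CPATH", "")
    for directory in cpath.split(os.pathsep):
        # Skip empty entries produced by leading/trailing separators.
        if directory and os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


if __name__ == "__main__":
    # shutil.which returns None when nvcc is not on PATH.
    print("nvcc:", shutil.which("nvcc"))
    # cuda_runtime.h is an assumed example of a header the build needs.
    print("cuda_runtime.h found in:", find_header("cuda_runtime.h"))
```

A `None` for either line suggests the toolkit install or the `CPATH` export did not take effect in the current shell.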
To test:
python test/test_extension.py