When I build the whole project, something goes wrong. My ninja version is 1.13.0. Has anyone encountered the same problem? What is going on?
Operating systems
Linux
GGML backends
CUDA
Problem description & steps to reproduce
[main] Generating folder: /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设
[build] Starting build
[proc] Executing command: /usr/local/bin/cmake --build /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设 --parallel 26 --
[build] [1/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
[build] [2/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
[build] [3/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
[build] [4/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
[build] [5/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
[build] [6/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
[build] [7/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
[build] [8/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
[build] [9/230] Linking CXX shared library bin/libggml-cpu.so
[build] [10/230] Building CXX object src/CMakeFiles/llama.dir/llama-batch.cpp.o
[build] [11/230] Building CXX object src/CMakeFiles/llama.dir/llama.cpp.o
[build] [12/230] Building CXX object src/CMakeFiles/llama.dir/llama-chat.cpp.o
[build] [13/230] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend-reg.cpp.o
[build] [14/230] Building CXX object src/CMakeFiles/llama.dir/llama-adapter.cpp.o
[build] [15/230] Building CXX object src/CMakeFiles/llama.dir/llama-hparams.cpp.o
[build] [16/230] Building CXX object src/CMakeFiles/llama.dir/llama-arch.cpp.o
[build] [17/230] Building CXX object src/CMakeFiles/llama.dir/llama-io.cpp.o
[build] [18/230] Building CXX object src/CMakeFiles/llama.dir/llama-memory.cpp.o
[build] [19/230] Building CXX object src/CMakeFiles/llama.dir/llama-impl.cpp.o
[build] [20/230] Building CXX object src/CMakeFiles/llama.dir/llama-graph.cpp.o
[build] [21/230] Building CXX object src/CMakeFiles/llama.dir/llama-context.cpp.o
[build] [22/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] Killed
[build] [23/230] Building CXX object src/CMakeFiles/llama.dir/llama-mmap.cpp.o
[build] [24/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] Killed
[build] [25/230] Building CXX object src/CMakeFiles/llama.dir/llama-kv-cache.cpp.o
[build] [26/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] Killed
[build] [27/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] Killed
[build] [28/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
[build] [29/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] Killed
[build] [30/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
[build] [31/230] Building CXX object src/CMakeFiles/llama.dir/llama-model-loader.cpp.o
[build] [32/230] Building CXX object src/CMakeFiles/llama.dir/llama-grammar.cpp.o
[build] [33/230] Building CXX object src/CMakeFiles/llama.dir/llama-quant.cpp.o
[build] [34/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o
[build] [35/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o
[build] [36/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o
[build] [37/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o
[build] [38/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o
[build] [39/230] Building CXX object src/CMakeFiles/llama.dir/llama-model.cpp.o
[build] [40/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o
[build] [41/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o
[build] [42/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o
[build] [43/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o
[build] [44/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
[build] [45/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
[build] [46/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
[build] [47/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
[build] ninja: build stopped: subcommand failed.
[proc] Command "/usr/local/bin/cmake --build /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设 --parallel 26 --" exited with code 137
[driver] Build completed: 00:00:58.627
[build] Build finished with exit code 137
First Bad Commit
No response
Compile command
Used CMake in VS Code, with CUDA turned on.
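For reference, an equivalent command-line build outside VS Code might look like the following. This is a sketch: the build directory name and job count are illustrative, though `-DGGML_CUDA=ON` is the documented llama.cpp CUDA switch. Note also that the failing nvcc commands in the log include `-G` (device debug code generation), which typically makes CUDA compilation much more memory-hungry; a Release configuration avoids it.

```shell
# Sketch of an equivalent command-line build (directory name and -j count illustrative).
# A Release configure drops the memory-hungry nvcc -G device-debug flag seen in the log.
cmake -B build -DGGML_CUDA=ON -DCMAKE_BUILD_TYPE=Release

# Limit parallelism so that 26 concurrent nvcc processes don't exhaust system RAM:
cmake --build build --parallel 4
```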
Relevant log output
System memory (RAM), not GPU memory. Exit code 137 means the process was killed, most likely by the kernel's OOM killer because the build ran out of memory. Reducing the number of parallel jobs, for example building with -j 1, should fix it.
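Exit code 137 follows the usual shell convention of 128 plus the terminating signal number, so it decodes to signal 9 (SIGKILL), which is what the Linux OOM killer sends. A small sketch of the decoding, with the confirmation and rebuild commands shown only as illustrative comments:

```shell
# Decode the build's exit code: shells report 128 + signal number
# for a process terminated by a signal.
code=137
sig=$((code - 128))
echo "signal=$sig"   # signal=9, i.e. SIGKILL

# To confirm an OOM kill on Linux (illustrative; may require root):
#   sudo dmesg | grep -iE 'out of memory|killed process'
# Then rebuild with fewer parallel jobs, e.g.:
#   cmake --build <build-dir> --parallel 1
```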