Compile bug: ninja: build stopped: subcommand failed. #13375

Open
charkyli opened this issue May 8, 2025 · 5 comments
Comments

@charkyli

charkyli commented May 8, 2025

Git commit

When I build the whole project, something bad happens. My Ninja version is 1.13.0. Has anyone encountered the same problem? What is going on?

Operating systems

Linux

GGML backends

CUDA

Problem description & steps to reproduce

[main] Generating build folder: /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设
[build] Starting build
[proc] Executing command: /usr/local/bin/cmake --build /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设 --parallel 26 --
[build] [1/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q8_0-q8_0.cu.o
[build] [2/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q4_0-q4_0.cu.o
[build] [3/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-q4_0-q4_0.cu.o
[build] [4/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-q8_0-q8_0.cu.o
[build] [5/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs128-f16-f16.cu.o
[build] [6/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs128-f16-f16.cu.o
[build] [7/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs64-f16-f16.cu.o
[build] [8/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f16-instance-hs256-f16-f16.cu.o
[build] [9/230] Linking CXX shared library bin/libggml-cpu.so
[build] [10/230] Building CXX object src/CMakeFiles/llama.dir/llama-batch.cpp.o
[build] [11/230] Building CXX object src/CMakeFiles/llama.dir/llama.cpp.o
[build] [12/230] Building CXX object src/CMakeFiles/llama.dir/llama-chat.cpp.o
[build] [13/230] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend-reg.cpp.o
[build] [14/230] Building CXX object src/CMakeFiles/llama.dir/llama-adapter.cpp.o
[build] [15/230] Building CXX object src/CMakeFiles/llama.dir/llama-hparams.cpp.o
[build] [16/230] Building CXX object src/CMakeFiles/llama.dir/llama-arch.cpp.o
[build] [17/230] Building CXX object src/CMakeFiles/llama.dir/llama-io.cpp.o
[build] [18/230] Building CXX object src/CMakeFiles/llama.dir/llama-memory.cpp.o
[build] [19/230] Building CXX object src/CMakeFiles/llama.dir/llama-impl.cpp.o
[build] [20/230] Building CXX object src/CMakeFiles/llama.dir/llama-graph.cpp.o
[build] [21/230] Building CXX object src/CMakeFiles/llama.dir/llama-context.cpp.o
[build] [22/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[build] Killed
[build] [23/230] Building CXX object src/CMakeFiles/llama.dir/llama-mmap.cpp.o
[build] [24/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[build] Killed
[build] [25/230] Building CXX object src/CMakeFiles/llama.dir/llama-kv-cache.cpp.o
[build] [26/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[build] Killed
[build] [27/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[build] Killed
[build] [28/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs256-f16-f16.cu.o
[build] [29/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] /usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler -DGGML_BACKEND_BUILD -DGGML_BACKEND_SHARED -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -DGGML_CUDA_USE_GRAPHS -DGGML_SCHED_MAX_COPIES=4 -DGGML_SHARED -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -Dggml_cuda_EXPORTS -I/home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/.. -I/home/lcq/projects/llama.cpp/ggml/src/../include -isystem=/usr/local/cuda-12.5/include -g -arch=native -Xcompiler=-fPIC -use_fast_math -Xcompiler "-Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-array-bounds -Wextra-semi -Wno-pedantic" -G -std=c++17 -MD -MT ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o -MF ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o.d -x cu -c /home/lcq/projects/llama.cpp/ggml/src/ggml-cuda/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu -o ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[build] Killed
[build] [30/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-f32-instance-hs64-f16-f16.cu.o
[build] [31/230] Building CXX object src/CMakeFiles/llama.dir/llama-model-loader.cpp.o
[build] [32/230] Building CXX object src/CMakeFiles/llama.dir/llama-grammar.cpp.o
[build] [33/230] Building CXX object src/CMakeFiles/llama.dir/llama-quant.cpp.o
[build] [34/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o
[build] [35/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o
[build] [36/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o
[build] [37/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o
[build] [38/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o
[build] [39/230] Building CXX object src/CMakeFiles/llama.dir/llama-model.cpp.o
[build] [40/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o
[build] [41/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o
[build] [42/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o
[build] [43/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o
[build] [44/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
[build] [45/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
[build] [46/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
[build] [47/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
[build] ninja: build stopped: subcommand failed.
[proc] The command "/usr/local/bin/cmake --build /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设 --parallel 26 --" exited with code 137
[driver] Build completed: 00:00:58.627
[build] Build finished with exit code 137

First Bad Commit

No response

Compile command

Used CMake in VS Code, with CUDA turned on.

Relevant log output

[build] [42/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o
[build] [43/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o
[build] [44/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
[build] [45/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
[build] [46/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
[build] [47/230] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
[build] ninja: build stopped: subcommand failed.
[proc] The command "/usr/local/bin/cmake --build /home/lcq/projects/llama.cpp/out/build/使用工具链文件配置预设 --parallel 26 --" exited with code 137
[driver] Build completed: 00:00:58.627
[build] Build finished with exit code 137
@slaren
Member

slaren commented May 8, 2025

[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o

Code 137 may indicate that you are running out of memory.
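
Exit code 137 follows the shell's 128 + N convention for fatal signals: 137 = 128 + 9, i.e. the compiler was killed by SIGKILL, which is what the Linux OOM killer sends when the system runs out of memory. A minimal sketch for confirming this (the `dmesg` search pattern is an assumption and may require root):

```shell
# 137 = 128 + 9: the process was killed by signal 9 (SIGKILL).
sig=$((137 - 128))
echo "killed by signal $sig ($(kill -l "$sig"))"   # killed by signal 9 (KILL)
# Check the kernel log for OOM-killer activity (may require root):
# dmesg | grep -iE "out of memory|oom-killer|killed process"
```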

@charkyli
Author

charkyli commented May 8, 2025

[build] FAILED: [code=137] ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o

Code 137 may indicate that you are running out of memory.

Okay, thank you, I will check my GPU memory usage.

@slaren
Member

slaren commented May 8, 2025

System memory (RAM), not GPU memory. The process is likely being killed because it runs out of memory. Reducing the number of threads when building with -j 1 may also fix it.

@charkyli
Author

charkyli commented May 8, 2025

System memory (RAM), not GPU memory. The process is likely being killed because it runs out of memory. Reducing the number of threads when building with -j 1 may also fix it.

I'm a beginner. How do I reduce the number of threads when building, and where do I add -j 1?

@slaren
Member

slaren commented May 8, 2025

cmake --build build -j 1

Note: -j is the same as --parallel
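
If `-j 1` is too slow, a middle ground is to size the job count from available RAM instead of CPU cores. A rough sketch; the 4 GB-per-job figure is an assumption (nvcc's actual peak usage on these template instances varies), and the final `cmake` invocation is left commented out:

```shell
# Cap parallel jobs by available RAM rather than by CPU count.
# Assumes roughly 4 GB of RAM per nvcc job -- tune for your machine.
avail_gb=$(awk '/MemAvailable/ {print int($2 / 1024 / 1024)}' /proc/meminfo)
jobs=$(( avail_gb / 4 ))
[ "$jobs" -lt 1 ] && jobs=1
echo "Building with $jobs parallel jobs"
# cmake --build build -j "$jobs"
```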
