Partial offload support for training #13486

JohannesGaessler · 2025-05-12T19:00:32Z

Right now the finetuning example seems to work correctly only for CPU-only training or with the maximum number of GPU layers. In principle, though, it should be possible to use the same partial offloading logic that is used for prompt processing to accelerate the training of models that need more memory than the available VRAM.
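To make the memory constraint concrete, here is a rough sketch of how one might estimate the number of layers that fit in VRAM during training. This is not how llama.cpp or ggml decides it. The helper name `layers_that_fit`, its parameters, and the default multiplier are all assumptions for illustration. The multiplier of 4 assumes that each offloaded layer needs room for its weights, its gradients, and two AdamW moment buffers, all at the same precision. Activations and compute buffers are ignored.

```python
def layers_that_fit(layer_bytes, vram_budget_bytes, state_multiplier=4):
    """Count how many consecutive layers fit in a VRAM budget for training.

    Hypothetical helper for illustration only.
    layer_bytes: weight size in bytes of each layer, in offload order.
    state_multiplier: assumed training footprint per weight byte
        (weights + gradients + 2 AdamW moments = 4).
    """
    used = 0
    count = 0
    for size in layer_bytes:
        need = size * state_multiplier
        if used + need > vram_budget_bytes:
            break  # this layer and all later ones would stay on the CPU
        used += need
        count += 1
    return count


if __name__ == "__main__":
    # Ten equally sized layers of 100 bytes each, with 1000 bytes of VRAM.
    print(layers_that_fit([100] * 10, 1000))
```

With these toy numbers each layer needs 400 bytes of training state, so only part of the model fits and the rest would have to stay in system memory. That is the case partial offloading is meant to cover.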

JohannesGaessler added this to ggml model training May 12, 2025

JohannesGaessler moved this to Todo in ggml model training May 12, 2025
