Tags: CodeLinaro/llama.cpp
convert : ability to lazy-load safetensors remotely without downloading to disk (ggml-org#12820)
* gguf util : add SafetensorRemote
* fix style
* convert: add --remote option
* convert : allow using lazy remote tensors
  It's a bit slow for now since everything is blocking and single-threaded.
* correct metadata.name
* small style fix
* support HF_TOKEN
* convert : use writeable buffer for remote lazy tensors
* convert : fix flake8 lint regarding lambda assignment
* multithreaded download
* multithread: print debug
* fix style
* Revert "multithreaded download"
  This reverts commit 42fc895.
* bring back _get_request_headers
---------
Co-authored-by: Francis Couture-Harpin <[email protected]>
opencl : fix memory allocation size (ggml-org#12649)
issue: #17 (comment)
This patch fixes the memory allocation size so that it does not exceed the maximum allocation size supported by the OpenCL device.
opencl: simplify kernel embedding logic in cmakefile (ggml-org#12503)
Co-authored-by: Max Krasnyansky <[email protected]>
llguidance build fixes for Windows (ggml-org#11664)
* setup windows linking for llguidance; thanks @phil-scott-78
* add build instructions for windows and update script link
* change VS Community link from DE to EN
* whitespace fix
Make logging more verbose (ggml-org#11714)
Debugged an issue with a user who was on a read-only filesystem.
Signed-off-by: Eric Curtin <[email protected]>