$ ./llama-server --version
version: 5372 (ab3971f2)
built with Apple clang version 16.0.0 (clang-1600.0.26.6) for arm64-apple-darwin23.6.0
and
Docker Container with an A100
$ ./llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA A100-SXM4-80GB, compute capability 8.0, VMM: yes
load_backend: loaded CUDA backend from /app/libggml-cuda.so
load_backend: loaded CPU backend from /app/libggml-cpu-haswell.so
version: 5361 (cf0a43bb)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
This is strange, tokenizer.ggml.precompiled_charsmap should be UINT8, yet it seems to be INT32 here.
The charsmap array comes from sentencepiece and should be in the form of bytes, which GGUFWriter automatically stores as a UINT8 array; but apparently not for whoever converted the linked model...
I used the gguf-editor-gui contributed in #12930 by @christopherthompson81 to adjust the model name. That must have corrupted the model files. I will see if I can revert to the original GGUFs, or just reconvert them from scratch.
edit: The original, unedited model files have been restored. I would appreciate it if someone could open a bug report against the gguf-editor-gui tool.
Name and Version
Tested on MacBook Pro:
and
Docker Container with an A100
Operating systems
Mac, Linux
GGML backends
CUDA, Metal
Hardware
Apple M3 Pro
+ server-cuda container with an A100
Models
nomic-ai/nomic-embed-text-v2-moe-GGUF
Problem description & steps to reproduce
Running the command in nomic's model card:
causes the following error:
I have found the same error with the f32 GGUF, and the same error when running in the server-cuda container.
First Bad Commit
No response
Relevant log output