Comparing changes

Choose two branches to see what's changed or to start a new pull request. You can also learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. Learn more about diff comparisons.
base repository: withcatai/node-llama-cpp
base: master@{1day}
head repository: withcatai/node-llama-cpp
compare: master
  • 1 commit
  • 60 files changed
  • 1 contributor

Commits on May 17, 2025

  1. feat: save and restore a context sequence state (#460)

    * feat: save and restore a context sequence state
    * feat: stream function call parameters
    * feat: configure Hugging Face remote endpoint for resolving URIs
    * feat: Qwen 3 support
    * feat(`QwenChatWrapper`): support discouraging the generation of thoughts
    * feat(`getLlama`): `dryRun` option
    * feat: `getLlamaGpuTypes` function
    * fix: adapt to breaking `llama.cpp` changes
    * fix: capture multi-token segment separators
    * fix: race condition when reading extremely long gguf metadata
    * fix: adapt memory estimation to new added model architectures
    * fix: skip binary testing on certain problematic conditions
    * fix: improve GPU backend loading error description
    * fix: update gguf types
    * fix: performance improvements
    * docs: update the awesome list
    * docs: solutions to more CUDA issues
    giladgd authored May 17, 2025
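
The headline feature, saving and restoring a context sequence state, lets an evaluated sequence be persisted and picked up later without re-evaluating the prompt. The sketch below shows how that might look together with the new `dryRun` option of `getLlama`; the exact method names (`saveStateToFile`, `loadStateFromFile`) and option shapes are assumptions inferred from the changelog entries, not verified against the released API.

```typescript
// Hypothetical usage sketch for this release's features.
// Assumptions: `dryRun` on getLlama, and saveStateToFile/loadStateFromFile
// on a context sequence, are inferred from the commit messages above.
import {getLlama} from "node-llama-cpp";

// `dryRun` (new in this release) is assumed to resolve the binaries
// without fully initializing a backend.
const llama = await getLlama({dryRun: false});

const model = await llama.loadModel({modelPath: "path/to/model.gguf"});
const context = await model.createContext();
const sequence = context.getSequence();

// ... evaluate a prompt on `sequence` ...

// Assumed API: persist the sequence's evaluated state to disk,
// then restore it later into a fresh sequence to skip re-evaluation.
await sequence.saveStateToFile("sequence-state.bin");
await sequence.loadStateFromFile("sequence-state.bin");
```

Note that this requires a local GGUF model file and a working compute backend, so it is illustrative only; consult the library's documentation for the released signatures.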
    Commit f2cb873