Tags: bssrdf/llama.cpp
Add Jinja template support (ggml-org#11016)

* Copy minja from google/minja@58f0ca6
* Add --jinja and --chat-template-file flags
* Add missing <optional> include
* Avoid print in get_hf_chat_template.py
* No designated initializers yet
* Try and work around msvc++ non-macro max resolution quirk
* Update test_chat_completion.py
* Wire LLM_KV_TOKENIZER_CHAT_TEMPLATE_N in llama_model_chat_template
* Refactor test-chat-template
* Test templates w/ minja
* Fix deprecation
* Add --jinja to llama-run
* Update common_chat_format_example to use minja template wrapper
* Test chat_template in e2e test
* Update utils.py
* Update test_chat_completion.py
* Update run.cpp
* Update arg.cpp
* Refactor common_chat_* functions to accept minja template + use_jinja option
* Attempt to fix linkage of LLAMA_CHATML_TEMPLATE
* Revert LLAMA_CHATML_TEMPLATE refactor
* Normalize newlines in test-chat-templates for windows tests
* Forward decl minja::chat_template to avoid eager json dep (see the sketch after this list)
* Flush stdout in chat template before potential crash
* Fix copy elision warning
* Rm unused optional include
* Add missing optional include to server.cpp
* Disable jinja test that has a cryptic windows failure
* minja: fix vigogne (google/minja#22)
* Apply suggestions from code review
* Finish suggested renamings
* Move chat_templates inside server_context + remove mutex
* Update --chat-template-file w/ recent change to --chat-template
* Refactor chat template validation
* Guard against missing eos/bos tokens (null token otherwise throws in llama_vocab::impl::token_get_attr)
* Warn against missing eos / bos tokens when jinja template references them
* rename: common_chat_template[s]
* reinstate assert on chat_templates.template_default
* Update minja to google/minja@b8437df
* Update minja to google/minja#25
* Update minja from google/minja#27
* rm unused optional header

Co-authored-by: Xuan Son Nguyen <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
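The "forward decl minja::chat_template" bullet is a general header-hygiene technique worth spelling out. A minimal sketch, assuming a holder struct along the lines of the `common_chat_templates` named in the bullets — `template_default` appears in the commit message, the other fields are illustrative:

```cpp
// common.h (sketch): forward-declare minja::chat_template so that every
// translation unit including this header does not also pull in minja's
// json-heavy headers; only the .cpp that actually renders templates
// includes the full minja header.
#include <memory>

namespace minja {
    class chat_template; // forward declaration only
}

struct common_chat_templates {
    bool has_explicit_template = false;                       // illustrative field
    std::unique_ptr<minja::chat_template> template_default;   // named in the commit
    std::unique_ptr<minja::chat_template> template_tool_use;  // illustrative field

    // the destructor must be defined in the .cpp, where chat_template is a
    // complete type -- otherwise unique_ptr cannot instantiate its deleter
    ~common_chat_templates();
};
```

The pointer members are what make the forward declaration sufficient: the header only needs the type's name, not its size, so the json dependency stays confined to the implementation file.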
metal : use autoreleasepool to avoid memory leaks (ggml-org#5437)

There appears to be a known memory leak when using `MTLCommandBuffer`. It is suggested to use `@autoreleasepool` in [1, 2].

This change-set wraps `ggml_metal_graph_compute` in an `@autoreleasepool` block (see the sketch below).

This commit addresses ggml-org#5436.

[1] https://developer.apple.com/forums/thread/662721
[2] https://forums.developer.apple.com/forums/thread/120931
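For reference, the shape of the fix, sketched in Objective-C (the real entry point takes a backend context and a graph; the simplified signature here is illustrative):

```objc
#include <stdbool.h>

// sketch: drain autoreleased Metal objects on every graph computation.
// Without the pool, objects autoreleased while encoding the command
// buffer accumulate until some enclosing (often long-lived) pool drains,
// which looks like a steady memory leak across repeated evaluations.
static bool ggml_metal_graph_compute_sketch(void) {
    @autoreleasepool {
        // ... create the MTLCommandBuffer, encode kernels, commit, wait ...
        // everything autoreleased in here is released at the closing brace
        return true;
    }
}
```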
llava-cli : don't crash if --image flag is invalid (ggml-org#4835)

This change fixes an issue where supplying `--image missing-file` would segfault due to a null pointer dereference. Beyond the crash itself, this could print distracting information when robust crash-analysis tools are in use.
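The fix amounts to checking the image-load result before it is used. A hedged sketch — `llava_image_embed_make_with_filename` is the llava example's real loader, while the wrapper function and its control flow are illustrative:

```cpp
#include "llava.h"   // llava_image_embed_make_with_filename (llava example)
#include <cstdio>

// sketch of the guard: report the failure and return null instead of
// letting a null embed pointer be dereferenced downstream.
static llava_image_embed * load_image_checked(clip_ctx * ctx_clip,
                                              int n_threads,
                                              const char * path) {
    llava_image_embed * embed =
        llava_image_embed_make_with_filename(ctx_clip, n_threads, path);
    if (embed == nullptr) {
        fprintf(stderr, "failed to load image '%s'\n", path);
        return nullptr; // previously the null result flowed on and segfaulted
    }
    return embed;
}
```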
ggml : fix vld1q_s8_x4 32-bit compat (ggml-org#4828)

* ggml : fix vld1q_s8_x4 32-bit compat

  ggml-ci

* ggml : fix 32-bit ARM compat (cont)

  ggml-ci
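The underlying issue: some 32-bit ARM toolchains do not provide the `vld1q_s8_x4` intrinsic. A sketch of the compatibility-shim approach, where the `ggml_vld1q_s8_x4` wrapper name mirrors ggml's prefix convention but is illustrative here:

```c
#include <arm_neon.h>

// on targets whose toolchain lacks vld1q_s8_x4 (seen with some 32-bit ARM
// compilers), emulate it with four plain vld1q_s8 loads; on AArch64,
// forward to the native intrinsic.
#if defined(__aarch64__)
#define ggml_vld1q_s8_x4 vld1q_s8_x4
#else
static inline int8x16x4_t ggml_vld1q_s8_x4(const int8_t * ptr) {
    int8x16x4_t res;
    res.val[0] = vld1q_s8(ptr +  0);
    res.val[1] = vld1q_s8(ptr + 16);
    res.val[2] = vld1q_s8(ptr + 32);
    res.val[3] = vld1q_s8(ptr + 48);
    return res;
}
#endif
```

Call sites then use the wrapper unconditionally and compile on both 32-bit and 64-bit ARM.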
CUDA: faster softmax via shared memory + fp16 math (ggml-org#4742)
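A minimal CUDA sketch of the shared-memory idea in this commit: stage a row in shared memory so the max and sum reductions read on-chip memory instead of re-reading global memory. This is a simplification under stated assumptions (one block per row, power-of-two block size, row fits in one block); the real kernel also adds fp16 math and handles arbitrary widths.

```cuda
#include <math.h>

// launch: softmax_smem<<<nrows, block_size, block_size*sizeof(float)>>>(x, dst, ncols);
__global__ void softmax_smem(const float * x, float * dst, int ncols) {
    extern __shared__ float buf[];
    const int row = blockIdx.x;
    const int tid = threadIdx.x;

    const float v = tid < ncols ? x[row*ncols + tid] : -INFINITY;

    // max reduction in shared memory
    buf[tid] = v;
    __syncthreads();
    for (int s = blockDim.x/2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] = fmaxf(buf[tid], buf[tid + s]);
        __syncthreads();
    }
    const float vmax = buf[0];
    __syncthreads(); // everyone has read vmax before buf is reused

    // exponentiate, then sum reduction in the same buffer
    const float e = tid < ncols ? expf(v - vmax) : 0.0f;
    buf[tid] = e;
    __syncthreads();
    for (int s = blockDim.x/2; s > 0; s >>= 1) {
        if (tid < s) buf[tid] += buf[tid + s];
        __syncthreads();
    }

    if (tid < ncols) dst[row*ncols + tid] = e / buf[0];
}
```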