Tags: philiptaron/llama.cpp
CUDA: mul_mat_vec_q tiling, refactor mul mat logic (ggml-org#5434)
Co-authored-by: slaren <[email protected]>
android : introduce starter project example (ggml-org#4926)
* Introduce starter project for Android, based on examples/llama.swiftui
* Add GitHub workflow
* Set NDK version
* Only build arm64-v8a in CI
* Sync bench code
* Rename CI prop to skip-armeabi-v7a
* Remove unused tests
llama.swiftui : update models layout (ggml-org#4826)
* Added a models drawer
* Added downloading directly from Hugging Face
* Load custom models from a local folder
* Delete models by swiping left
* Trimmed trailing whitespace
llama : add AWQ for llama, llama2, mpt, and mistral models (ggml-org#4593)
* Added AWQ support for the llama-7b, llama2-7b, mistral-7b-v1, and mpt models
* Added benchmark results for llama2-7b and mistral-7b-v1
* Renamed the FFN builder to llm_build_ffn_mpt_awq
* Fixed the parameter count and removed ggml_repeat
* Changed the folder architecture; fixed common.cpp
* Removed the use_awq argument
* Formatted code (black) and updated the README and CI config
* llama : adapt plamo to new ffn
Co-authored-by: Trần Đức Nam <[email protected]>
Co-authored-by: Le Hoang Anh <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
fallback to CPU buffer if host buffer alloc fails (ggml-org#4610)