MPT support in llama.cpp #3417

Merged 17 commits on Oct 10, 2023
Commits
b49792b
CUDA: added support for ggml_clamp (see also: https://github.com/gger…
jploski Sep 30, 2023
15236e8
mpt : added an implementation based (mostly) on falcon integration, m…
jploski Sep 30, 2023
84e30e8
mpt : protect against "clip_qkv": null in mpt-7b
jploski Sep 30, 2023
00e8c5c
mpt : quick fix to avoid "Strange model" warning when quantizing MPT …
jploski Sep 30, 2023
1be89c4
mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out f…
jploski Sep 30, 2023
26c253e
mpt : standardized all tensor names to follow GGUF spec
jploski Sep 30, 2023
df072d2
mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GE…
jploski Sep 30, 2023
90e7d6d
mpt : fixed comment s/gptneox/mpt/
jploski Oct 2, 2023
4708012
mpt : remove tabs, trailing whitespace
jploski Oct 2, 2023
1364bcd
mpt : removed ne01 + n_past == ne00 assertion from alibi (cuda/f32) a…
jploski Oct 3, 2023
7d6a24a
mpt : updated convert-mpt-hf-to-gguf.py to reflect changes made to co…
jploski Oct 6, 2023
292363e
Merge branch 'master' of https://github.com/ggerganov/llama.cpp into …
cebtenzzre Oct 9, 2023
ad3c2f3
comment out n_past instead of marking it unused
cebtenzzre Oct 9, 2023
1a454eb
mpt : removed hardcoded +178 from convert script in favor of utilizin…
jploski Oct 9, 2023
32172f1
mpt : remove unused tokenizer_json in convert script
cebtenzzre Oct 9, 2023
96cf3f5
ggml : remove obsolete n_past assert in ggml_alibi
ggerganov Oct 10, 2023
9b66378
llama : print clam_kqv and max_alibi_bias hparams
ggerganov Oct 10, 2023
ggml : remove obsolete n_past assert in ggml_alibi
ggerganov committed Oct 10, 2023
commit 96cf3f5dc3e145a9555df377947ac57ecabaa708
2 changes: 0 additions & 2 deletions ggml.c
@@ -13064,8 +13064,6 @@ static void ggml_compute_forward_alibi_f32(
     float max_bias;
     memcpy(&max_bias, (int32_t *) dst->op_params + 2, sizeof(float));
 
-    assert(n_past >= 0);
-
     const int64_t ne0 = src0->ne[0]; // all_seq_len = n_past + ne1
     const int64_t ne1 = src0->ne[1]; // seq_len_without_past
     const int64_t ne2 = src0->ne[2]; // n_head -> this is k
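
For readers unfamiliar with the `memcpy` line in the context above: `max_bias` is a float whose bit pattern is stored in an `int32_t` parameter slot, so it is copied out byte-for-byte rather than cast. The sketch below illustrates that idiom in isolation. It is not ggml code: `N_PARAMS`, `set_float_param`, and `get_float_param` are hypothetical names, and it assumes `float` and `int32_t` are both 4 bytes wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parameter block size; stands in for an op_params-style array. */
#define N_PARAMS 4

/* Store a float's bit pattern in one int32_t slot (assumes sizeof(float) == 4). */
static void set_float_param(int32_t *params, int slot, float value) {
    memcpy(&params[slot], &value, sizeof(float));
}

/* Read the float back with memcpy, mirroring the max_bias read in the diff.
 * memcpy avoids the undefined behaviour of dereferencing a float* that
 * points at int32_t storage. */
static float get_float_param(const int32_t *params, int slot) {
    float value;
    memcpy(&value, params + slot, sizeof(float));
    return value;
}
```

Writing a value with `set_float_param(params, 2, 8.0f)` and reading slot 2 with `get_float_param` returns `8.0f`, while the raw integer in `params[2]` holds the float's bit pattern, not the number 8.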