feat: save and restore a context sequence state #460

Merged 38 commits on May 17, 2025
Commits
11b5404
fix: adapt to breaking `llama.cpp` changes
giladgd May 11, 2025
8b98cf0
fix: improve GPU backend loading error description
giladgd May 11, 2025
1e8111c
chore: update template dependencies
giladgd May 11, 2025
2f9858a
test: Qwen 3 template
giladgd May 11, 2025
4c6e2b1
feat: configure Hugging Face remote endpoint for resolving URIs
giladgd May 11, 2025
d39d261
fix: race condition when reading extremely long gguf metadata
giladgd May 11, 2025
e740078
docs: typo
giladgd May 11, 2025
d6e852e
fix: update gguf types
giladgd May 11, 2025
9ab3c6d
fix: capture multi-token segment separators
giladgd May 11, 2025
656f2be
docs: solutions to more CUDA issues
giladgd May 11, 2025
6926425
feat: stream function call parameters
giladgd May 11, 2025
b369eaf
docs: update the awesome list
giladgd May 11, 2025
72c30dc
chore: update modules
giladgd May 11, 2025
df05d70
docs: more clear default values for custom cmake options
giladgd May 11, 2025
b3d510e
chore: reorder Vitepress config keys
giladgd May 11, 2025
3233603
fix: update gguf types
giladgd May 11, 2025
96c78da
docs: document new env vars
giladgd May 11, 2025
f7063d8
chore: module versions
giladgd May 12, 2025
123e524
chore: update GitHub issue templates
giladgd May 12, 2025
53a5206
test: check recommended model URIs
giladgd May 13, 2025
2e1a7ce
test: fix tests
giladgd May 14, 2025
9463ccc
feat(`QwenChatWrapper`): support discouraging the generation of thoughts
giladgd May 15, 2025
631a7e7
test: fix tests
giladgd May 15, 2025
a0cc198
feat: save and restore context sequence state
giladgd May 15, 2025
185b734
docs: save and restore context sequence state
giladgd May 15, 2025
d36670c
fix: adapt memory estimation to new added model architectures
giladgd May 15, 2025
a68590a
feat(`getLlama`): `dryRun` option
giladgd May 16, 2025
8c6134d
feat: `getLlamaGpuTypes` to get the list of available GPU types for t…
giladgd May 16, 2025
71babfa
fix: skip binary testing on certain problematic conditions
giladgd May 16, 2025
12cec69
docs: fix dead link
giladgd May 16, 2025
de3a360
fix: Paperspace tests setup script nodejs version
giladgd May 16, 2025
8eff306
fix: Windows build
giladgd May 17, 2025
f76e899
fix: types
giladgd May 17, 2025
0cbb572
test: fix tests
giladgd May 17, 2025
2c01084
fix: performance improvements
giladgd May 17, 2025
5d4c8c3
fix: remove unused files from the build dir
giladgd May 17, 2025
69d30cd
fix: remove unused line
giladgd May 17, 2025
62c8020
fix: performance improvements
giladgd May 17, 2025
fix: update gguf types
giladgd committed May 11, 2025
commit 3233603c223661f1271c31cc064379cf79eb0da4
14 changes: 14 additions & 0 deletions src/gguf/types/GgufMetadataTypes.ts
@@ -1,5 +1,7 @@
export const enum GgufArchitectureType {
llama = "llama",
llama4 = "llama4",
deci = "deci",
falcon = "falcon",
grok = "grok",
gpt2 = "gpt2",
@@ -11,14 +13,19 @@ export const enum GgufArchitectureType {
refact = "refact",
bert = "bert",
nomicBert = "nomic-bert",
nomicBertMoe = "nomic-bert-moe",
jinaBertV2 = "jina-bert-v2",
bloom = "bloom",
stablelm = "stablelm",
qwen = "qwen",
qwen2 = "qwen2",
qwen2moe = "qwen2moe",
qwen2vl = "qwen2vl",
qwen3 = "qwen3",
qwen3moe = "qwen3moe",
phi2 = "phi2",
phi3 = "phi3",
phimoe = "phimoe",
plamo = "plamo",
codeshell = "codeshell",
orion = "orion",
@@ -27,25 +34,32 @@ export const enum GgufArchitectureType {
minicpm3 = "minicpm3",
gemma = "gemma",
gemma2 = "gemma2",
gemma3 = "gemma3",
starcoder2 = "starcoder2",
mamba = "mamba",
xverse = "xverse",
commandR = "command-r",
cohere2 = "cohere2",
dbrx = "dbrx",
olmo = "olmo",
olmo2 = "olmo2",
olmoe = "olmoe",
openelm = "openelm",
arctic = "arctic",
deepseek = "deepseek",
deepseek2 = "deepseek2",
chatglm = "chatglm",
glm4 = "glm4",
bitnet = "bitnet",
t5 = "t5",
t5encoder = "t5encoder",
jais = "jais",
nemotron = "nemotron",
exaone = "exaone",
rwkv6 = "rwkv6",
rwkv6qwen2 = "rwkv6qwen2",
rwkv7 = "rwkv7",
arwkv7 = "arwkv7",
granite = "granite",
granitemoe = "granitemoe",
chameleon = "chameleon",
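
Note on using these values at runtime: `GgufArchitectureType` is declared as a `const enum`, so by default TypeScript inlines its members at compile time and emits no runtime object to enumerate. A minimal sketch of one way to check a raw architecture string read from GGUF metadata is shown below. The names `knownArchitectures` and `isKnownArchitecture` are hypothetical and not part of this PR, and the list is only a small subset of the values in the hunks above.

```typescript
// Hypothetical helper, not part of this PR: a runtime check for architecture strings.
// The list is a hand-picked subset of the values in the diff above.
const knownArchitectures = [
    "llama",
    "llama4",
    "nomic-bert-moe",
    "qwen3",
    "qwen3moe",
    "gemma3",
    "glm4"
] as const;

type KnownArchitecture = typeof knownArchitectures[number];

// A Set gives constant-time lookups and accepts arbitrary input strings.
const knownArchitectureSet: ReadonlySet<string> = new Set<string>(knownArchitectures);

// Type guard: narrows `value` to KnownArchitecture when the string is in the list.
function isKnownArchitecture(value: string): value is KnownArchitecture {
    return knownArchitectureSet.has(value);
}
```

A caller could use the guard to fall back to generic handling when a model reports an architecture that is not in the list, instead of assuming every value is covered.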