convert : support rope_scaling type and rope_type #13349

Merged
3 commits merged into master on May 8, 2025
Conversation

CISC (Collaborator) commented May 7, 2025

At some point transformers renamed rope_scaling type to rope_type, so support both.
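
A minimal sketch of the fallback this enables, assuming a plain dict of Hugging Face hparams; the helper name is hypothetical and not code from this PR:

```python
# Hypothetical helper: older transformers configs store the scaling variant
# under rope_scaling["type"], newer ones under rope_scaling["rope_type"],
# so the converter should accept either key.
def get_rope_scaling_type(hparams: dict) -> str | None:
    rope_scaling = hparams.get("rope_scaling") or {}
    return rope_scaling.get("rope_type", rope_scaling.get("type"))
```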

github-actions bot added the python (python script changes) label on May 7, 2025
CISC requested a review from ngxson on May 7, 2025 at 07:17
ngxson (Collaborator) commented May 7, 2025

The same code is copied in multiple places, so I think it's better to group it into a new function like self.set_rope_config()

Edit: or we can extend self.set_gguf_parameters() to support this
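
One possible shape for the suggested helper, sketched against the gguf-py writer API; the branches, defaults, and key names are assumptions based on common rope_scaling configs, not code from this PR:

```python
import gguf  # gguf-py writer API

# Sketch of the proposed helper as a method on the converter's base class
# (`self.hparams` and `self.gguf_writer` as in convert_hf_to_gguf.py).
def set_rope_config(self):
    # Read the scaling config, accepting both the old and new key names.
    rope_scaling = self.hparams.get("rope_scaling") or {}
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    if rope_type == "linear":
        self.gguf_writer.add_rope_scaling_type(gguf.RopeScalingType.LINEAR)
        self.gguf_writer.add_rope_scaling_factor(rope_scaling["factor"])
    elif rope_type == "yarn":
        self.gguf_writer.add_rope_scaling_type(gguf.RopeScalingType.YARN)
        self.gguf_writer.add_rope_scaling_factor(rope_scaling["factor"])
        self.gguf_writer.add_rope_scaling_orig_ctx_len(
            rope_scaling["original_max_position_embeddings"])
```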

Btw, which model(s) have you been testing with?

CISC (Collaborator, Author) commented May 7, 2025

The problem with extending the base set_gguf_parameters method is that you would have to merge all the special cases (e.g. DeepseekV2Model), which can get messy.

I tested with a few models I still had the original files for, not every single one I touched, but I'm fairly sure I didn't break anything. :)

ngxson (Collaborator) left a comment

I still think having this as a dedicated function like self.set_rope_config() will make it easier to maintain. We can optionally check if rope_scaling["factor"] has a good value (i.e. non-null), but it's up to you anyway.
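
The optional check could look like this inside such a helper; a sketch, reusing the rope_scaling dict from the sketch above:

```python
# Only emit the scaling factor when it is present and usable
# (the reviewer's suggested non-null check).
factor = rope_scaling.get("factor")
if factor is not None:
    self.gguf_writer.add_rope_scaling_factor(factor)
```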

CISC (Collaborator, Author) commented May 8, 2025

I absolutely agree, but I worry about the special cases. :)

I'll see what I can do...

CISC (Collaborator, Author) commented May 8, 2025

I will merge this as-is for now and make a new PR later.

Deduplication of the rope code requires careful thought (I think we can also deduplicate the llama3 rope_freqs calculations).
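
For reference, the llama3 rope_freqs calculation computes one scaling factor per rotary dimension from the Llama 3.1 rope_scaling parameters. The sketch below is an assumption about how that computation looks when pulled out of the per-model converters; the function name and signature are hypothetical:

```python
import math

def llama3_rope_factors(head_dim: int, base: float, factor: float,
                        low_freq_factor: float, high_freq_factor: float,
                        old_context_len: int) -> list[float]:
    # Wavelength thresholds separating the unscaled, smoothed, and fully
    # scaled frequency bands.
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    factors = []
    for i in range(0, head_dim, 2):
        freq = 1.0 / (base ** (i / head_dim))
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            factors.append(1.0)       # high-frequency dims stay unscaled
        elif wavelen > low_freq_wavelen:
            factors.append(factor)    # low-frequency dims get the full factor
        else:
            # Smooth interpolation between the two regimes.
            smooth = (old_context_len / wavelen - low_freq_factor) \
                / (high_freq_factor - low_freq_factor)
            factors.append(1.0 / ((1 - smooth) / factor + smooth))
    return factors
```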

CISC merged commit 1a844be into master on May 8, 2025
7 checks passed
CISC deleted the cisc/convert-rope-type branch on May 8, 2025 at 13:34
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request May 9, 2025
* origin/master: (39 commits)
server : vision support via libmtmd (ggml-org#12898)
sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (ggml-org#12858)
metal : optimize MoE for large batches (ggml-org#13388)
CUDA: FA support for Deepseek (Ampere or newer) (ggml-org#13306)
llama : do not crash if there is no CPU backend (ggml-org#13395)
CUDA: fix crash on large batch size for MoE models (ggml-org#13384)
imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (ggml-org#13389)
llama-run: add support for downloading models from ModelScope (ggml-org#13370)
mtmd : fix batch_view for m-rope (ggml-org#13397)
llama : one-off chat template fix for Mistral-Small-2503 (ggml-org#13398)
rpc : add rpc_msg_set_tensor_hash_req (ggml-org#13353)
vulkan: Allow up to 4096 elements for mul_mat_id row_ids (ggml-org#13326)
server : (webui) rename has_multimodal --> modalities (ggml-org#13393)
ci : limit write permission to only the release step + fixes (ggml-org#13392)
mtmd : Expose helper_decode_image_chunk (ggml-org#13366)
server : (webui) fix a very small misalignment (ggml-org#13387)
server : (webui) revamp the input area, plus many small UI improvements (ggml-org#13365)
convert : support rope_scaling type and rope_type (ggml-org#13349)
mtmd : fix the calculation of n_tokens for smolvlm (ggml-org#13381)
context : allow cache-less context for embeddings (ggml-org#13108)
...