convert : support rope_scaling type and rope_type #13349

CISC · 2025-05-07T07:16:40Z

At some point transformers renamed the "type" key in rope_scaling to "rope_type", so support both.
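A minimal sketch of the lookup this describes, assuming the model config is available as a plain dict (as loaded from config.json). The helper name get_rope_type is hypothetical and is not taken from the conversion script. It checks the newer key first and falls back to the legacy one:

```python
from typing import Any, Dict, Optional


def get_rope_type(hparams: Dict[str, Any]) -> Optional[str]:
    """Return the RoPE scaling type from a Hugging Face style config dict.

    Hypothetical helper: prefers the newer "rope_type" key and falls back
    to the legacy "type" key. Returns None if neither is present.
    """
    # "rope_scaling" may be missing or explicitly null in config.json
    rope_scaling = hparams.get("rope_scaling") or {}
    return rope_scaling.get("rope_type", rope_scaling.get("type"))


# Legacy key only
print(get_rope_type({"rope_scaling": {"type": "linear", "factor": 2.0}}))     # linear
# Newer key only
print(get_rope_type({"rope_scaling": {"rope_type": "yarn", "factor": 4.0}}))  # yarn
# No scaling configured
print(get_rope_type({"rope_scaling": None}))                                  # None
```

Checking "rope_type" first means a config that carries both keys resolves to the newer one.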

ngxson · 2025-05-07T09:18:22Z

The same code is copied in multiple places, so I think it's better to group it into a new function like self.set_rope_config().

Edit: or we can extend self.set_gguf_parameters() to support this.

Btw, which model(s) have you been testing with?

CISC · 2025-05-07T10:19:20Z

The problem with extending the base set_gguf_parameters method is that you would have to merge all the special cases, for example DeepseekV2Model, which can get messy.

I tested with a few models I still had the original files for, not every single one I touched, but I'm fairly sure I didn't break anything. :)

ngxson

I still think having this as a dedicated function like self.set_rope_config() will make it easier to maintain. We can optionally check if rope_scaling["factor"] has a good value (i.e. non-null), but it's up to you anyway.
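A rough sketch of what such a dedicated helper might look like, under stated assumptions: the function name rope_config_from_hparams and the returned dict layout are hypothetical, and the step that would hand these values to the GGUF writer is left out. It combines the two-key lookup with the suggested check that rope_scaling["factor"] is non-null:

```python
from typing import Any, Dict


def rope_config_from_hparams(hparams: Dict[str, Any]) -> Dict[str, Any]:
    """Collect RoPE scaling settings from a config dict.

    Hypothetical helper, not the converter's actual API. Returns an empty
    dict when no usable scaling configuration is present.
    """
    # "rope_scaling" may be missing or explicitly null in config.json
    rope_scaling = hparams.get("rope_scaling") or {}
    # Prefer the newer "rope_type" key, fall back to the legacy "type" key
    rope_type = rope_scaling.get("rope_type", rope_scaling.get("type"))
    factor = rope_scaling.get("factor")

    config: Dict[str, Any] = {}
    # Only emit settings when a type is given and the factor is non-null
    if rope_type is not None and factor is not None:
        config["type"] = rope_type
        config["factor"] = float(factor)
    return config


print(rope_config_from_hparams({"rope_scaling": {"rope_type": "linear", "factor": 8}}))
print(rope_config_from_hparams({"rope_scaling": {"type": "yarn", "factor": None}}))
```

Model classes with special cases could still override or post-process the result, which is the concern raised earlier in the thread.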

CISC · 2025-05-08T06:49:43Z

I absolutely agree, but I worry about the special cases. :)

I'll see what I can do...

CISC · 2025-05-08T13:33:50Z

I will merge this as-is for now and make a new PR later.

Deduplication of the rope code requires careful thought (I think we can also deduplicate the llama3 rope_freqs calculations).

* origin/master: (39 commits)
  server : vision support via libmtmd (ggml-org#12898)
  sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (ggml-org#12858)
  metal : optimize MoE for large batches (ggml-org#13388)
  CUDA: FA support for Deepseek (Ampere or newer) (ggml-org#13306)
  llama : do not crash if there is no CPU backend (ggml-org#13395)
  CUDA: fix crash on large batch size for MoE models (ggml-org#13384)
  imatrix : Add --parse-special for enabling parsing of special tokens in imatrix calculation (ggml-org#13389)
  llama-run: add support for downloading models from ModelScope (ggml-org#13370)
  mtmd : fix batch_view for m-rope (ggml-org#13397)
  llama : one-off chat template fix for Mistral-Small-2503 (ggml-org#13398)
  rpc : add rpc_msg_set_tensor_hash_req (ggml-org#13353)
  vulkan: Allow up to 4096 elements for mul_mat_id row_ids (ggml-org#13326)
  server : (webui) rename has_multimodal --> modalities (ggml-org#13393)
  ci : limit write permission to only the release step + fixes (ggml-org#13392)
  mtmd : Expose helper_decode_image_chunk (ggml-org#13366)
  server : (webui) fix a very small misalignment (ggml-org#13387)
  server : (webui) revamp the input area, plus many small UI improvements (ggml-org#13365)
  convert : support rope_scaling type and rope_type (ggml-org#13349)
  mtmd : fix the calculation of n_tokens for smolvlm (ggml-org#13381)
  context : allow cache-less context for embeddings (ggml-org#13108)
  ...

CISC added 2 commits May 7, 2025 09:10

support rope_scaling type and rope_type

25cd3b4

indent fix

d55c699

github-actions bot added the python python script changes label May 7, 2025

CISC requested a review from ngxson May 7, 2025 07:17

check for rope_type first

e68782e

ngxson approved these changes May 7, 2025

CISC merged commit 1a844be into master May 8, 2025
7 checks passed

CISC deleted the cisc/convert-rope-type branch May 8, 2025 13:34
