Insights: modelscope/ms-swift
Overview
13 Pull requests merged by 6 people
-
[web-ui]Modify open parameter for Accordion
#4859 merged
Jul 8, 2025 -
[dataset] fix dataset ddp write conflict
#4860 merged
Jul 7, 2025 -
Support Kwai-Keye/Keye-VL-8B-Preview
#4856 merged
Jul 7, 2025 -
[template] fix qwen3 remove '<think></think>'
#4857 merged
Jul 7, 2025 -
[grpo] update doc
#4853 merged
Jul 7, 2025 -
Fix test bug
#4851 merged
Jul 7, 2025 -
[grpo] fix offpolicy check
#4852 merged
Jul 7, 2025 -
[grpo]Fix bug when repeatedly call inputs_to_rolloutrequest
#4823 merged
Jul 7, 2025 -
[grpo] deprecated params for 3.6
#4848 merged
Jul 7, 2025 -
[megatron] fix eval_iters -1
#4847 merged
Jul 7, 2025 -
fix bug: grpo train error for deepseek model
#4833 merged
Jul 7, 2025 -
[megatron] Fix the display issue for train_type=lora
#4845 merged
Jul 7, 2025 -
update stream & fix bugs
#4842 merged
Jul 7, 2025
1 Pull request opened by 1 person
-
[grpo] entropy mask
#4850 opened
Jul 7, 2025
6 Issues closed by 4 people
-
error when finetuning qwen3 in modelscope notebook.
#4811 closed
Jul 8, 2025 -
FileNotFoundError in a DDP environment
#4840 closed
Jul 7, 2025 -
With ignore_empty_think enabled, the framework automatically removes <think>\n\n</think>\n\n, so the model does not think
#4854 closed
Jul 7, 2025 -
ALL_PARALLEL_STYLES argument of type 'NoneType' is not iterable
#4843 closed
Jul 7, 2025 -
Abnormal GRPO training results
#4800 closed
Jul 7, 2025 -
Does GenRMPlugin in the grpo + gen_rm pipeline run over the data twice?
#4846 closed
Jul 7, 2025
6 Issues opened by 6 people
-
Support for fine-tuning more multimodal embedding models (beyond GME)
#4861 opened
Jul 7, 2025 -
Code raises an error when per_device_train_batch_size is increased
#4858 opened
Jul 7, 2025 -
Evaluation doesn't run during training for custom dataset
#4855 opened
Jul 7, 2025 -
Does qwen2.5vl support 4-bit kv_cache quantization?
#4849 opened
Jul 7, 2025 -
After SFT with ms-swift, the model's config.json file changed, so I cannot deploy the model directly with vllm
#4844 opened
Jul 7, 2025 -
grpo + gen_rm padding index error
#4841 opened
Jul 7, 2025
8 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[Has anyone run into this?] Coordinate offset problem when fine-tuning an agent with qwen2.5vl
#4831 commented on
Jul 7, 2025 • 0 new comments -
Qwen2-Audio using flash attention: error occurs:RuntimeError: cu_seqlens_q must have shape (batch_size + 1)
#2542 commented on
Jul 7, 2025 • 0 new comments -
Trained Qwen 3 model seems to be broken.
#4835 commented on
Jul 7, 2025 • 0 new comments -
Question: during GRPO training, the model abnormally hits Max_length many times
#4758 commented on
Jul 7, 2025 • 0 new comments -
deepspeed AutoTP + ZeRO
#3797 commented on
Jul 8, 2025 • 0 new comments -
Problem evaluating custom datasets with ms-swift eval, which forces the use of evalscope for evaluation; please support the system field soon
#3792 commented on
Jul 8, 2025 • 0 new comments -
How to specify the split (train/validation) for the dataset in cli
#3789 commented on
Jul 8, 2025 • 0 new comments -
[WIP][megatron] support LoRA
#4812 commented on
Jul 7, 2025 • 0 new comments