InternVL3 LoRA training with ViT unfrozen and LLM frozen: eval_acc stays very low when training on a new scenario #3890

Closed
zhengxingmao opened this issue Apr 15, 2025 · 2 comments

Comments

@zhengxingmao

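The input is video data split into 5-second clips, and each sample is formatted as a multiple-choice question. The training command is as follows: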
CUDA_VISIBLE_DEVICES=0,1,2,3 \
VIDEO_MAX_PIXELS=50176 \
VIDEO_SEGMENTS=8 \
swift sft \
    --model internvl3 \
    --dataset dataset_internvl \
    --train_type lora \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 2e-6 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --freeze_llm true \
    --freeze_vit false \
    --gradient_accumulation_steps $(expr 16 / $nproc_per_node) \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 2048 \
    --output_dir output \
    --warmup_ratio 0.15 \
    --dataloader_num_workers 4 \
    --deepspeed zero3
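
The command above uses `$nproc_per_node` without showing where it is set. As a minimal sketch, assuming one training process per GPU listed in `CUDA_VISIBLE_DEVICES` (so `nproc_per_node=4`, which is an assumption and not stated in the post), the accumulation steps and the resulting effective batch size work out as follows:

```shell
#!/bin/sh
# Assumption: one training process per GPU in CUDA_VISIBLE_DEVICES=0,1,2,3.
nproc_per_node=4
per_device_train_batch_size=1

# Same expression as in the training command: 16 / 4.
gradient_accumulation_steps=$(expr 16 / $nproc_per_node)

# Samples that contribute to each optimizer step across all processes.
effective_batch_size=$((per_device_train_batch_size * nproc_per_node * gradient_accumulation_steps))

echo "gradient_accumulation_steps=$gradient_accumulation_steps"
echo "effective_batch_size=$effective_batch_size"
```

Under that assumption each optimizer step sees 16 samples, so `--eval_steps 50` would evaluate after every 800 training samples.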

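@zhengxingmao zhengxingmao changed the title from "InternVL3 LoRA with ViT unfrozen and LLM frozen: eval_acc stays very low when training on a new scenario" to "InternVL3 LoRA training with ViT unfrozen and LLM frozen: eval_acc stays very low when training on a new scenario" Apr 15, 2025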
@Jintao-Huang
Collaborator

I suggest setting freeze_llm false.

@zhengxingmao
Author

Thanks for your reply. What is the significance of freeze_llm false, especially for a new scenario?
