Error when fine-tuning InternVL3-8B #3959

Closed
sylvia-ljq opened this issue Apr 22, 2025 · 1 comment

@sylvia-ljq

Script I ran:
NPROC_PER_NODE=4 \
CUDA_VISIBLE_DEVICES=3,5,7,8 \
MAX_PIXELS=1003520 \
swift sft \
--model /pre_llms/InternVL3-8B \
--model_type internvl3 \
--dataset /ms-swift-250417/data_train.jsonl \
--train_type custom \
--optimizer custom \
--external_plugins /examples/train/multimodal/lora_llm_full_vit/custom_plugin.py \
--torch_dtype bfloat16 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--learning_rate 2e-5 \
--lora_rank 16 \
--lora_alpha 32 \
--gradient_accumulation_steps 8 \
--eval_steps 100 \
--save_steps 100 \
--save_total_limit 2 \
--logging_steps 5 \
--max_length 8192 \
--output_dir output/internvl3 \
--warmup_ratio 0.05 \
--dataloader_num_workers 4 \
--dataset_num_proc 4 \
--deepspeed zero2 \
--save_only_model true

Error:
ValueError: Target modules ^model.*.(o_proj|down_proj|gate_proj|v_proj|q_proj|k_proj|up_proj)$ not found in the base model. Please check the target modules and try again.
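
For readers hitting the same message, here is a minimal sketch of how a regex-style target-module pattern can fail to select anything. It assumes the pattern is matched against each full module name (to my understanding PEFT does this for string `target_modules`, but treat that as an assumption). The module names below are hypothetical and only illustrate the mechanism; they are not taken from InternVL3-8B, and this is not a diagnosis of the actual bug.

```python
import re

# Pattern copied verbatim from the error message above.
pattern = r"^model.*.(o_proj|down_proj|gate_proj|v_proj|q_proj|k_proj|up_proj)$"

# Hypothetical module names, for illustration only. On a real checkpoint the
# names would come from iterating model.named_modules().
hypothetical_names = [
    "language_model.model.layers.0.self_attn.q_proj",
    "language_model.model.layers.0.mlp.down_proj",
    "vision_model.encoder.layers.0.attn.qkv",
]


def matching_modules(pattern, names):
    """Return the names that the regex matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


# None of the hypothetical names starts with "model", so nothing is selected.
no_hits = matching_modules(pattern, hypothetical_names)
print(no_hits)

# A name that does start with "model" is selected by the same pattern.
hits = matching_modules(pattern, ["model.layers.0.self_attn.q_proj"])
print(hits)
```

If the pattern selects no module at all, a "not found in the base model" error of this kind is the expected outcome, so printing the real module names and comparing them with the pattern is a quick first check.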

@Jintao-Huang
Collaborator

This has been fixed on the main branch.
