How are steps calculated? #3954
Comments
Packing is enabled.
Or check whether NPROC_PER_NODE is set correctly.
nproc_per_node=4
I trained with the script below. The dataset has about 33,000 samples, with per_device_train_batch_size=16, gradient_accumulation_steps=32, epochs=3, on 4 GPUs.
nproc_per_node=4
NPROC_PER_NODE=$nproc_per_node
CUDA_VISIBLE_DEVICES=0,1,2,3
swift pt
--model Qwen/Qwen2.5-7B
--train_type full
--dataset $CUSTOM_DATASET
--torch_dtype bfloat16
--num_train_epochs 3
--per_device_train_batch_size 16
--per_device_eval_batch_size 1
--learning_rate 1e-5
--gradient_accumulation_steps $(expr 128 / $nproc_per_node)
--packing true
--eval_steps 10
--save_steps 50
--save_total_limit 2
--logging_steps 5
--deepspeed zero3
--max_length 8192
--warmup_ratio 0.05
--save_only_model true
--output_dir XXXXX
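By the usual calculation the step count should be 33000*3/16/32/4 = 48, but the progress bar shows 193 steps. How does ms-swift compute the number of steps automatically?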
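For reference, the arithmetic above can be written as a small sketch. This is only the back-of-the-envelope estimate used in the question, not ms-swift's actual implementation; the function name `estimate_optimizer_steps` is hypothetical, and the estimate ignores packing, which changes the number of training samples.

```python
def estimate_optimizer_steps(num_samples, epochs, per_device_batch_size,
                             grad_accum_steps, world_size):
    """Rough estimate of optimizer steps (hypothetical helper, not ms-swift code).

    Assumes every sample is consumed once per epoch and that packing is off.
    """
    # Samples consumed by one optimizer step across all processes.
    samples_per_step = per_device_batch_size * grad_accum_steps * world_size
    return (num_samples * epochs) // samples_per_step


# Values from the question: 4 processes.
print(estimate_optimizer_steps(33000, 3, 16, 32, 4))  # 48
# Same values with a single process.
print(estimate_optimizer_steps(33000, 3, 16, 32, 1))  # 193
```

The second call shows that 193 is what the same formula gives when the world size is 1, which may be why checking NPROC_PER_NODE was suggested. Since packing is enabled and alters the sample count, the match could also be a coincidence.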