[rank2]: batch_encoded_inputs = self._prepare_batch_inputs(inputs, total_rewards)
[rank2]: File "/opt/conda/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/grpo_trainer.py", line 1026, in _prepare_batch_inputs
[rank2]: assert len(inputs) == bs * gas, f'Expected {bs * gas} inputs, got {len(inputs)}'
[rank2]: AssertionError: Expected 32 inputs, got 4

I hit this error while training a model with GRPO. Here is the training script:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
NPROC_PER_NODE=6 \
swift rlhf \
    --rlhf_type grpo \
    --model ${model_path} \
    --external_plugins ${external_plugins_name} \
    --reward_funcs custom_acc custom_format \
    --use_vllm true \
    --vllm_device auto \
    --vllm_gpu_memory_utilization 0.9 \
    --vllm_max_model_len 5192 \
    --num_infer_workers 2 \
    --num_generations 24 \
    --train_type lora \
    --lora_rank 64 \
    --lora_alpha 256 \
    --torch_dtype bfloat16 \
    --dataset ${data_path} \
    --max_completion_length 3072 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --learning_rate 1e-6 \
    --gradient_accumulation_steps 8 \
    --save_steps 1 \
    --save_total_limit 20 \
    --logging_steps 1 \
    --max_length 5192 \
    --output_dir ${model_output} \
    --warmup_ratio 0.01 \
    --dataloader_num_workers 4 \
    --dataset_num_proc 4 \
    --temperature 0.7 \
    --top_p 0.95 \
    --top_k 20 \
    --deepspeed zero3 \
    --log_completions true
    # --eval_steps 200

I have a couple of questions. I followed the official GRPO.md example script, which does not explicitly specify eval_datasets. Should it be specified explicitly here, and if it is not, how is the dataset split by default?
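For reference, the numbers in the assertion line up with the flags in the script. The sketch below is my reading of the check, not taken from grpo_trainer.py itself, so treat the per-rank accounting as an assumption:

# Sketch: where 'Expected 32 inputs, got 4' plausibly comes from (assumption:
# the assert counts per-rank samples per optimizer step).
bs=4     # --per_device_train_batch_size
gas=8    # --gradient_accumulation_steps
echo "expected inputs: $((bs * gas))"   # prints 32, matching the assertion
# 'got 4' equals a single micro-batch (bs=4), i.e. one step's worth of inputs
# instead of a full accumulation window. Also note the device split: of the
# 8 visible GPUs, 6 are training ranks (NPROC_PER_NODE=6) and 2 serve vLLM
# (--num_infer_workers 2); 6 ranks x bs 4 = 24 = --num_generations.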
@Jintao-Huang Could you please take a look?
The shape issue has been resolved in the main branch.
By default, the eval_dataset is split off from the train_dataset at a ratio of 0.01 (1%), as determined by the split_dataset_ratio parameter.
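If you want to control the split explicitly, two options are sketched below. split_dataset_ratio is the parameter named above; --val_dataset as the flag for passing a dedicated eval file is my assumption from the swift CLI and worth verifying against your version:

# Option 1: widen the automatic split (default 0.01, i.e. 1% of train):
swift rlhf \
    --rlhf_type grpo \
    --dataset ${data_path} \
    --split_dataset_ratio 0.05
# Option 2 (assumed flag name): pass a dedicated eval set instead of splitting:
#   --val_dataset ${eval_data_path}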