Training runs normally, but eval raises an assert error #4081

Open
shwangshoudao opened this issue May 5, 2025 · 2 comments

Comments

@shwangshoudao

[rank2]: batch_encoded_inputs = self._prepare_batch_inputs(inputs, total_rewards)
[rank2]: File "/opt/conda/lib/python3.10/site-packages/swift/trainers/rlhf_trainer/grpo_trainer.py", line 1026, in _prepare_batch_inputs
[rank2]: assert len(inputs) == bs * gas, f'Expected {bs * gas} inputs, got {len(inputs)}'
[rank2]: AssertionError: Expected 32 inputs, got 4

I got this error while training a model with GRPO. The corresponding training script is:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
NPROC_PER_NODE=6 \
swift rlhf \
    --rlhf_type grpo \
    --model ${model_path} \
    --external_plugins ${external_plugins_name} \
    --reward_funcs custom_acc custom_format \
    --use_vllm true \
    --vllm_device auto \
    --vllm_gpu_memory_utilization 0.9 \
    --vllm_max_model_len 5192 \
    --num_infer_workers 2 \
    --num_generations 24 \
    --train_type lora \
    --lora_rank 64 \
    --lora_alpha 256 \
    --torch_dtype bfloat16 \
    --dataset ${data_path} \
    --max_completion_length 3072 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --learning_rate 1e-6 \
    --gradient_accumulation_steps 8 \
    --save_steps 1 \
    --save_total_limit 20 \
    --logging_steps 1 \
    --max_length 5192 \
    --output_dir ${model_output} \
    --warmup_ratio 0.01 \
    --dataloader_num_workers 4 \
    --dataset_num_proc 4 \
    --temperature 0.7 \
    --top_p 0.95 \
    --top_k 20 \
    --deepspeed zero3 \
    --log_completions true
# --eval_steps 200
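I have a few questions. I followed the official GRPO.md script and did not explicitly specify eval_datasets. Should it be specified explicitly? If it is not, how is the dataset split by default?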
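
For reference, the numbers in the assertion can be reproduced from the script's settings. The sketch below is only an illustration: `check_batch` is a hypothetical stand-in that copies the assert line from the traceback, not the trainer's actual method. It assumes `bs` is `per_device_train_batch_size` (4) and `gas` is `gradient_accumulation_steps` (8), so the check expects 4 * 8 = 32 inputs. A batch of 4 would match `per_device_eval_batch_size`, which is one reading consistent with the error appearing only during eval.

```python
def check_batch(inputs, bs, gas):
    """Hypothetical stand-in for the assertion shown in the traceback."""
    assert len(inputs) == bs * gas, f'Expected {bs * gas} inputs, got {len(inputs)}'


# Values taken from the training script in this issue.
bs = 4   # --per_device_train_batch_size
gas = 8  # --gradient_accumulation_steps

train_inputs = list(range(bs * gas))  # 32 items, as the check expects
eval_inputs = list(range(4))          # 4 items, e.g. one eval batch

check_batch(train_inputs, bs, gas)    # passes silently

try:
    check_batch(eval_inputs, bs, gas)
    message = None
except AssertionError as err:
    message = str(err)

print(message)
```

Under these assumptions the printed message matches the one in the traceback.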

@shwangshoudao
Author

@Jintao-Huang could you please take a look?

@hjh0119
Collaborator

hjh0119 commented May 6, 2025

The shape issue has been resolved in the main branch.

By default, the eval_dataset takes 0.01 of the train_dataset (as determined by the split_dataset_ratio parameter).
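
As a rough illustration of what a 0.01 split means in practice, the sketch below computes the resulting sizes for a hypothetical dataset of 10,000 rows. The helper `split_sizes` and the use of simple rounding are assumptions made for this example; the library's own rounding and shuffling may differ.

```python
def split_sizes(n_rows, split_dataset_ratio=0.01):
    """Return (train_size, eval_size) for a simple ratio-based split.

    Illustrative only: assumes the eval size is the rounded product of
    the row count and the ratio.
    """
    n_eval = round(n_rows * split_dataset_ratio)
    return n_rows - n_eval, n_eval


# Hypothetical dataset size, not taken from the issue.
train_size, eval_size = split_sizes(10_000)
print(train_size, eval_size)
```

With a small eval set like this, the eval loop may see far fewer samples per step than the training loop, which makes shape checks like the one in the traceback easier to trip.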
