I finished my GRPO training, but on the final epoch it gave me this message:
Train: 100%|█████████████████████████████████████| 811/812 [53:11:32<03:56, 236.12s/it]
[WARNING:swift] No training was carried out, which may be due to the dataset being too small or incorrect usage of resume_from_checkpoint.
[INFO:swift] End time of running main: 2025-04-13 06:05:03.922258
So I did not get any trained checkpoints after such a long training time. I have been able to get checkpoints before; this run only changed some hyperparameters (the data is the same 16k samples) and I do not use resume_from_checkpoint, so I am confused about why this happened and how to fix it.
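As a rough sanity check of the step count (this is my own back-of-the-envelope arithmetic; I am assuming GRPO expands each prompt into num_generations completions and that the effective batch is per_device_train_batch_size × NPROC_PER_NODE × gradient_accumulation_steps, which may not match swift's internals), the numbers line up with the 812 steps in the progress bar, and save_steps is larger than the total number of steps:

# Hypothetical step-count estimate; the 16k dataset size is my data,
# the batching assumption is mine and may not match swift exactly.
DATASET_SIZE=16000
NUM_GENERATIONS=4
PER_DEVICE_BATCH=4
NPROC=5
GRAD_ACCUM=4
SAVE_STEPS=2000
STEPS=$(( DATASET_SIZE * NUM_GENERATIONS / (PER_DEVICE_BATCH * NPROC * GRAD_ACCUM) ))
echo "estimated optimizer steps: $STEPS"                      # ~800, close to the 812 shown above
echo "step-based checkpoints:    $(( STEPS / SAVE_STEPS ))"   # 0, since save_steps > total steps

So even with a normal run, no intermediate checkpoint would be written before the end; I still expected a final save, which is why the warning surprises me.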
My bash script is the following, which I run with bash train_GRPO.sh:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6 \
NPROC_PER_NODE=5 \
swift rlhf \
--rlhf_type grpo \
--model /vault/ultraz/open-r1/data/llama3-3b-lora-checkpoint_1023 \
--model_type llama3_2 \
--train_type lora \
--dataset /vault/ultraz/unsloth_grpo/skythought_no_tokens.csv \
--torch_dtype bfloat16 \
--num_train_epochs 1 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 4 \
--eval_steps 2000 \
--save_steps 2000 \
--learning_rate 1e-6 \
--save_total_limit 2 \
--logging_steps 1 \
--output_dir output \
--warmup_ratio 0.05 \
--dataloader_num_workers 8 \
--reward_funcs accuracy format cosine repetition \
--num_generations 4 \
--use_vllm true \
--vllm_gpu_memory_utilization 0.9 \
--vllm_max_model_len 5000 \
--max_completion_length 5000 \
--max_length 7000 \
--num_infer_workers 2 \
--deepspeed zero3 \
--temperature 1.0 \
--system examples/train/grpo/prompt.txt \
--deepspeed zero2 \
    --log_completions true
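For completeness, this is how I check whether anything at all was written under --output_dir (the checkpoint-<step> directory naming is the usual Transformers/swift convention; the exact nesting under output/ may differ, so the search is deliberately broad):

# Look for any checkpoint directory under the configured --output_dir.
matches=$(find output -type d -name 'checkpoint-*')
if [ -z "$matches" ]; then
  echo "no checkpoints found under output/"
else
  echo "$matches"
fi

In my case nothing is found, which matches the warning above.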
My library versions are:
vllm==0.8.3
ms_swift==3.3.0.dev0