Why is there no loss? #4652

Open
@AHFJE

Description

I am fine-tuning the Qwen3-Reranker-0.6B model with the following script:
NPROC_PER_NODE=2 \
swift sft \
    --model /data0/dikaer/jinyangyi/models/Qwen3-Reranker-0.6B \
    --dataset data/train_rerank.jsonl \
    --model_type qwen3 \
    --torch_dtype bfloat16 \
    --train_type full \
    --max_length 2048 \
    --split_dataset_ratio 0 \
    --output_dir output_qwen0.6B_rerank \
    --save_total_limit 1 \
    --num_train_epochs 3 \
    --logging_steps 100 \
    --save_steps 500 \
    --per_device_train_batch_size 32 \
    --gradient_accumulation_steps 8 \
    --learning_rate 8e-5 \
    --loss_scale ignore_empty_think \
    --warmup_ratio 0.05

This is the format of each line in my dataset JSONL file:
{"system": "Judge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".", "query": "3pcs Fierce Muscle Brown Bear Vinyl Decals - Durable Car Window Stickers for Glass & Metal Surfaces, Self-Adhesive Auto Accessories", "document": "Automotive-Exterior Accessories-Bumper Stickers, Decals & Magnets-Decals", "label": "yes"}
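
One thing that is cheap to rule out is a malformed record: for a line to parse as JSON, the inner double quotes around yes and no in the system string have to be escaped as \". The following is a minimal sketch for checking lines, assuming the four keys shown in the sample are the ones the file is meant to carry. `check_record` and `EXPECTED_KEYS` are hypothetical names for illustration, not part of ms-swift, and passing this check says nothing about whether ms-swift accepts this schema for reranker training.

```python
import json

# Keys taken from the sample record; treating them as required is an assumption.
EXPECTED_KEYS = {"system", "query", "document", "label"}

def check_record(line: str) -> list:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at column {exc.colno}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    problems = []
    missing = EXPECTED_KEYS - record.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if record.get("label") not in ("yes", "no"):
        problems.append(f"unexpected label: {record.get('label')!r}")
    return problems

# Shortened stand-ins for a real line: one with escaped inner quotes, one without.
good = '{"system": "Answer \\"yes\\" or \\"no\\".", "query": "q", "document": "d", "label": "yes"}'
bad = '{"system": "Answer "yes" or "no".", "query": "q", "document": "d", "label": "yes"}'

print(check_record(good))  # no problems reported
print(check_record(bad))   # reports a JSON parsing problem
```

Running `check_record` over every line of data/train_rerank.jsonl would show whether any record fails to parse or carries a label other than yes or no.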

Then I start training, and the log shows:
{'loss': 0.0, 'grad_norm': 0.0, 'learning_rate': 1.9e-07, 'memory(GiB)': 34.06, 'train_speed(iter/s)': 0.226097, 'epoch': 0.0, 'global_step/max_steps': '1/8397', 'percentage': '0.01%', 'elapsed_time': '4s', 'remaining_time': '9h 23m 1s'}
The loss is 0 (and so is grad_norm).

Could someone tell me what is going wrong?
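
As a sanity check, the other numbers in the log line can be reproduced from the command. This is a rough sketch under two assumptions that are not confirmed by the post: that NPROC_PER_NODE=2 means two data-parallel processes, and that the scheduler rounds the warmup length up and ramps the learning rate linearly from 0 during warmup.

```python
import math

# Values copied from the command and the log line.
learning_rate = 8e-5
warmup_ratio = 0.05
max_steps = 8397
num_procs, per_device_batch, grad_accum, epochs = 2, 32, 8, 3

# Assumption: warmup length is rounded up and the rate ramps linearly from 0.
warmup_steps = math.ceil(max_steps * warmup_ratio)
lr_at_step_1 = learning_rate * 1 / warmup_steps

# Samples consumed per optimizer step across both processes.
effective_batch = num_procs * per_device_batch * grad_accum
steps_per_epoch = max_steps // epochs

print(warmup_steps)           # 420
print(f"{lr_at_step_1:.2e}")  # 1.90e-07
print(effective_batch)        # 512
print(steps_per_epoch)        # 2799
```

Under those assumptions the logged learning_rate of 1.9e-07 is simply the first warmup step, so the small learning rate on its own does not point to a problem. The zero loss and zero grad_norm are not explained by this calculation.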
