grpo + gen_rm padding index error #4841

Open
@fffffurina

Description

Describe the bug
model: deepseek distill qwen3 8b (already fine-tuned with SFT on data)
rm model: qwq (already fine-tuned with SFT)

Image

Metadata

    Labels

    needs more info: Additional information or clarification is required to proceed
