feat: rlhf generation samples log to swanlab #4907

Merged
merged 5 commits into from
Jul 11, 2025

Zeyi-Lin
Contributor

@Zeyi-Lin Zeyi-Lin commented Jul 10, 2025

PR type

  • Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

This PR adds support for logging intermediate results to SwanLab during GRPO and RM reinforcement learning training in ms-swift. The results can be seen at the following link:

:( I am unable to upload images to the GitHub issue normally.

Demo:

@hjh0119
Collaborator

hjh0119 commented Jul 10, 2025

Thanks! It's really needed.

Could you also patch the TRL profiling function to upload profiling_metrics to SwanLab?

Related code:
trl: https://github.com/huggingface/trl/blob/v0.19.1/trl/extras/profiling.py#L31-L68
swift: https://github.com/modelscope/ms-swift/blob/main/swift/trainers/rlhf_trainer/grpo_trainer.py#L30C1
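
For readers unfamiliar with the request, the idea of a timing context manager that forwards its metrics to an additional tracker can be sketched as below. This is a minimal illustration, not the code from TRL or from this PR: `timed_block`, `MetricSink`, the `sinks` parameter, and the metric key format are all hypothetical names chosen for the example, and an in-memory list stands in for the tracker so the snippet runs without SwanLab installed.

```python
import contextlib
import time
from typing import Callable, Dict, Iterator, List

# Hypothetical type for anything that accepts a dict of metrics.
MetricSink = Callable[[Dict[str, float]], None]


@contextlib.contextmanager
def timed_block(owner_name: str, name: str, sinks: List[MetricSink]) -> Iterator[None]:
    """Time the wrapped block and forward the duration to every sink."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration = time.perf_counter() - start
        # Illustrative key format; the real key is defined by the patched library.
        metrics = {f"profiling/Time taken: {owner_name}.{name}": duration}
        for sink in sinks:
            sink(metrics)


# Stand-in sink that records metrics in memory. With a tracker installed and a
# run initialised, that tracker's logging function could be passed instead
# (an assumption, not something shown in this thread).
recorded: List[Dict[str, float]] = []

with timed_block("GRPOTrainer", "generate", [recorded.append]):
    sum(range(1000))

print(recorded)
```

Keeping the sinks as plain callables means a new tracker can be added without touching the timing logic.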

@Zeyi-Lin
Contributor Author

@hjh0119 Great suggestion, I have implemented the patch function. Please check if there are other areas that need improvement.

Demo: https://swanlab.cn/@ZeyiLin/ms-swift-rlhf/runs/40xdzwqpxpnf50edof9zg/chart

@hjh0119
Collaborator

hjh0119 commented Jul 11, 2025

LGTM

Nit: maybe patch profiling_context as well?

@Zeyi-Lin
Contributor Author

LGTM

Nit: maybe patch profiling_context as well?

Thanks. I guess you mean profiling_decorator? In the latest commit I have added patch_profiling_decorator.
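
As a rough illustration of what patching a profiling decorator involves, the sketch below rebinds a decorator on a stand-in module object. It is not the `patch_profiling_decorator` from this PR: `profiling_decorator_with_sink`, `fake_profiling_module`, `DemoTrainer`, and the metric key format are hypothetical, and a list stands in for the experiment tracker.

```python
import functools
import time
import types
from typing import Any, Callable, Dict, List

logged: List[Dict[str, float]] = []  # stand-in for an experiment tracker


def profiling_decorator_with_sink(func: Callable[..., Any]) -> Callable[..., Any]:
    """Time a method call and send the duration to the stand-in sink."""

    @functools.wraps(func)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        try:
            return func(self, *args, **kwargs)
        finally:
            # Illustrative key format only.
            key = f"profiling/Time taken: {type(self).__name__}.{func.__name__}"
            logged.append({key: time.perf_counter() - start})

    return wrapper


# Stand-in for the module that owns the original (here: no-op) decorator.
fake_profiling_module = types.SimpleNamespace(profiling_decorator=lambda f: f)

# The patch itself: rebind the name before any class body uses it.
fake_profiling_module.profiling_decorator = profiling_decorator_with_sink


class DemoTrainer:
    @fake_profiling_module.profiling_decorator
    def compute_loss(self, x: int) -> int:
        return x * 2


result = DemoTrainer().compute_loss(21)
print(result, logged)
```

Because decorators run when the class body is executed, a patch of this kind only takes effect if it is applied before the decorated class is defined or imported.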

:( Unit test errors do not seem to be caused by my changes; it seems to be an error caused by a Docker container.

@hjh0119 hjh0119 merged commit 1e8727f into modelscope:main Jul 11, 2025
1 of 2 checks passed