Pulse · huggingface/open-r1 · GitHub

May 7, 2025 – May 10, 2025

Overview

6 Active pull requests
- 5 Merged pull requests
- 1 Open pull request

1 Active issue
- 1 Closed issue
- 0 New issues

5 Pull requests merged by 3 people

Add time to Slurm
#639 merged May 9, 2025
Use pass@1 for all evals
#633 merged May 9, 2025
soft_overlong_punishment from DAPO paper
#638 merged May 9, 2025
Fix style again :)
#636 merged May 8, 2025
Code Execution using Morph Cloud
#614 merged May 8, 2025

1 Pull request opened by 1 person

Add dataset filtering script
#637 opened May 9, 2025

1 Issue closed by 1 person

Release 32B math-220k supervised fine-tuned weights
#634 closed May 8, 2025

6 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Sequence length problem
#579 commented on May 8, 2025 • 0 new comments
Reproducing GRPO based on Qwen2.5-1.5B-Instruct and using Math-220K dataset Yields Unexpected Results
#538 commented on May 9, 2025 • 0 new comments
The kl divergence collapses but the format reward becomes larger
#373 commented on May 9, 2025 • 0 new comments
OpenR1-Qwen-7B achieves 47.40 on AIME24, better than reported!
#622 commented on May 9, 2025 • 0 new comments
When I run the GRPO demo, I find that format_reward is always 0！！！
#235 commented on May 9, 2025 • 0 new comments
[WIP] R1-Zero-like experiments
#569 commented on May 9, 2025 • 0 new comments