Insights: huggingface/open-r1
Overview
- 5 Merged pull requests
- 1 Open pull request
- 1 Closed issue
- 0 New issues
5 Pull requests merged by 3 people
- Add time to Slurm (#639), merged May 9, 2025
- Use pass@1 for all evals (#633), merged May 9, 2025
- soft_overlong_punishment from DAPO paper (#638), merged May 9, 2025
- Fix style again :) (#636), merged May 8, 2025
- Code Execution using Morph Cloud (#614), merged May 8, 2025
1 Pull request opened by 1 person
- Add dataset filtering script (#637), opened May 9, 2025
1 Issue closed by 1 person
- Release 32B math-220k supervised fine-tuned weights (#634), closed May 8, 2025
6 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Sequence length problem (#579), commented on May 8, 2025 • 0 new comments
- Reproducing GRPO based on Qwen2.5-1.5B-Instruct and using Math-220K dataset Yields Unexpected Results (#538), commented on May 9, 2025 • 0 new comments
- The kl divergence collapses but the format reward becomes larger (#373), commented on May 9, 2025 • 0 new comments
- OpenR1-Qwen-7B achieves 47.40 on AIME24, better than reported! (#622), commented on May 9, 2025 • 0 new comments
- When I run the GRPO demo, I find that format_reward is always 0!!! (#235), commented on May 9, 2025 • 0 new comments
- [WIP] R1-Zero-like experiments (#569), commented on May 9, 2025 • 0 new comments