@PRIME-RL

PRIME-RL

Researching scalable reinforcement learning (RL) methods for language models.

Pinned

  1. Entropy-Mechanism-of-RL (Public)

    The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.

    Python, 185 stars, 6 forks

  2. SimpleVLA-RL (Public)

    Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory

    Python, 230 stars, 6 forks

  3. PRIME (Public)

    Scalable RL solution for advanced reasoning of language models

    Python, 1.6k stars, 96 forks

  4. TTRL (Public)

    TTRL: Test-Time Reinforcement Learning

    Python, 644 stars, 46 forks

  5. ImplicitPRM (Public)

    Repository for the paper "Free Process Rewards without Process Labels"

    Python, 152 stars, 10 forks

Repositories

Showing 5 of 5 repositories

Top languages

Python
