Trinity-RFT

Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).

Built with a decoupled design, seamless integration for agentic workflows, and systematic data processing pipelines, Trinity-RFT can be easily adapted for diverse application scenarios, and serve as a platform for exploring advanced reinforcement learning (RL) paradigms.

Vision of this project

Current RFT approaches, such as RLHF (Reinforcement Learning from Human Feedback) with proxy reward models or training long-CoT reasoning models with rule-based rewards, are limited in their ability to handle dynamic, real-world learning.

Trinity-RFT envisions a future where AI agents learn by interacting directly with environments, collecting delayed or complex reward signals, and continuously refining their behavior through RL.

For example, imagine an AI scientist that designs an experiment, executes it, waits for feedback (while working on other tasks concurrently), and iteratively updates itself based on true environmental rewards when the experiment is finally finished.

Trinity-RFT offers a path into this future by addressing critical gaps in existing solutions.

Key features

Unified RFT modes & algorithm support. Trinity-RFT unifies and generalizes existing RFT methodologies into a flexible and configurable framework, supporting synchronous/asynchronous and on-policy/off-policy/offline training, as well as hybrid modes that combine them seamlessly into a single learning process.
Agent-environment interaction as a first-class citizen. Trinity-RFT allows delayed rewards in multi-step/time-lagged feedback loops, handles long-tailed latencies and environment/agent failures gracefully, and supports distributed deployment where explorers and trainers can operate across separate devices and scale up independently.
Data processing pipelines optimized for RFT with diverse/messy data. These include converting raw datasets to prompt/task sets for RL, cleaning/filtering/prioritizing experiences stored in the replay buffer, synthesizing data for tasks and experiences, and offering user interfaces for human-in-the-loop workflows.
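
As a rough illustration of the "cleaning/filtering/prioritizing experiences" idea, here is a minimal Python sketch. It is not Trinity-RFT's actual pipeline: the `Experience` class, the `clean_and_prioritize` function, and the choice to rank by raw reward are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """Hypothetical record of one rollout (not a Trinity-RFT type)."""
    prompt: str
    response: str
    reward: float


def clean_and_prioritize(experiences, top_k):
    """Drop empty and duplicate experiences, then keep the top_k by reward."""
    seen = set()
    kept = []
    for exp in experiences:
        key = (exp.prompt, exp.response)
        # Cleaning: skip blank responses and exact duplicates.
        if not exp.response.strip() or key in seen:
            continue
        seen.add(key)
        kept.append(exp)
    # Prioritizing: highest reward first (assumed priority rule).
    kept.sort(key=lambda e: e.reward, reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    raw = [
        Experience("2+2?", "4", 1.0),
        Experience("2+2?", "4", 1.0),   # duplicate
        Experience("3+3?", "   ", 0.0), # empty response
        Experience("5+5?", "11", 0.1),
    ]
    for exp in clean_and_prioritize(raw, top_k=2):
        print(exp)
```

A real pipeline would likely use richer priority signals (for example, staleness or advantage magnitude); raw reward is used here only to keep the sketch short.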

The design of Trinity-RFT

The overall design of Trinity-RFT exhibits a trinity:

RFT-core;
agent-environment interaction;
data processing pipelines tailored to RFT;

and the design of RFT-core also exhibits a trinity:

explorer;
trainer;
manager & buffer.

The explorer, powered by the rollout model, interacts with the environment and generates rollout trajectories to be stored in the experience buffer.

The trainer, powered by the policy model, samples batches of experiences from the buffer and updates the policy via RL algorithms.

These two can be completely decoupled and act asynchronously, except that they share the same experience buffer and their model weights are synchronized periodically. Such a decoupled design is crucial for making the aforementioned features of Trinity-RFT possible.
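
The explorer/trainer relationship described above can be sketched with two threads and a shared queue. This is a toy model under stated assumptions, not Trinity-RFT code: `ToyPolicy`, `run_decoupled`, and the `sync_interval` parameter are hypothetical names, and "weights" are reduced to an integer version counter.

```python
import queue
import threading


class ToyPolicy:
    """Stand-in for model weights: just a version counter (hypothetical)."""

    def __init__(self):
        self.version = 0


def run_decoupled(num_rollouts=8, batch_size=2, sync_interval=2):
    buffer = queue.Queue()        # shared experience buffer
    trainer_policy = ToyPolicy()  # updated by the trainer
    rollout_policy = ToyPolicy()  # possibly stale copy used by the explorer
    lock = threading.Lock()       # guards reads/writes of the rollout copy
    batches = []

    def explorer():
        for task_id in range(num_rollouts):
            with lock:
                version = rollout_policy.version
            # A real explorer would interact with an environment here.
            buffer.put({"task": task_id, "policy_version": version})
        buffer.put(None)  # sentinel: no more experiences

    def trainer():
        batch, steps = [], 0
        while True:
            item = buffer.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) == batch_size:
                trainer_policy.version += 1  # stands in for one RL update
                steps += 1
                batches.append([e["task"] for e in batch])
                batch = []
                # Periodic weight synchronization to the rollout model.
                if steps % sync_interval == 0:
                    with lock:
                        rollout_policy.version = trainer_policy.version

    threads = [threading.Thread(target=explorer), threading.Thread(target=trainer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return batches, trainer_policy.version, rollout_policy.version


if __name__ == "__main__":
    print(run_decoupled())
```

The only coupling points are the buffer and the periodic version copy, which is why the two roles can run at different speeds or, in a distributed setting, on different devices.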

Meanwhile, Trinity-RFT takes care of the low-level engineering needed for high efficiency in every component of the framework, e.g., using NCCL (when feasible) for model weight synchronization, sequence concatenation with proper masking for multi-turn conversations and ReAct-style workflows, and pipeline parallelism for the synchronous RFT mode, among many others.
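
To make "sequence concatenation with proper masking" concrete, the sketch below packs a multi-turn conversation into one token sequence and builds a loss mask that is 1 only on assistant tokens. The function name `concat_with_mask`, the role labels, and the toy token IDs are assumptions for illustration; the framework's actual implementation may differ.

```python
def concat_with_mask(turns):
    """Concatenate turns into one sequence with a per-token loss mask.

    turns: list of (role, token_ids) pairs, in conversation order.
    Returns (input_ids, loss_mask); loss_mask[i] is 1 only for tokens
    produced by the model ("assistant"), so user and tool tokens serve
    as context without contributing to the training loss.
    """
    input_ids, loss_mask = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        trainable = 1 if role == "assistant" else 0
        loss_mask.extend([trainable] * len(token_ids))
    return input_ids, loss_mask


if __name__ == "__main__":
    conversation = [
        ("user", [11, 12]),
        ("assistant", [21, 22, 23]),  # e.g. a ReAct-style thought/action
        ("tool", [31]),               # environment observation
        ("assistant", [41, 42]),
    ]
    ids, mask = concat_with_mask(conversation)
    print(ids)
    print(mask)
```

Packing the whole conversation into one sequence lets a single forward pass cover every turn, instead of re-encoding the growing prefix once per assistant reply.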

Getting started

Note

This project is currently under active development. Comments and suggestions are welcome!

Step 1: preparations

Installation from source (recommended):

# Pull the source code from GitHub
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# Create a new environment using Conda or venv
# Option 1: Conda
conda create -n trinity python=3.10
conda activate trinity

# Option 2: venv
python3.10 -m venv .venv
source .venv/bin/activate

# Install the package in editable mode
# for bash
pip install -e .[dev]
# for zsh
pip install -e .\[dev\]

# Install flash-attn after all dependencies are installed
# Note: flash-attn will take a long time to compile, please be patient.
pip install flash-attn -v
# Try the following command if you encounter errors during installation
# pip install flash-attn -v --no-build-isolation

Installation from Docker:

We provide a Dockerfile for Trinity-RFT.

git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

# build the docker image
# Note: you can edit the dockerfile to customize the environment
# e.g., use pip mirrors or set api key
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

# run the docker image
docker run -it --gpus all --shm-size="64g" --rm -v $PWD:/workspace -v <root_path_of_data_and_checkpoints>:/data trinity-rft:latest

Step 2: prepare dataset and model

Trinity-RFT supports most datasets and models from Hugging Face and ModelScope.

Prepare the model in the local directory $MODEL_PATH/{model_name}:

# Using Hugging Face
huggingface-cli download {model_name} --local-dir $MODEL_PATH/{model_name}

# Using Modelscope
modelscope download {model_name} --local_dir $MODEL_PATH/{model_name}

For more details about model downloading, please refer to the Hugging Face or ModelScope documentation.

Prepare the dataset in the local directory $DATASET_PATH/{dataset_name}:

# Using Hugging Face
huggingface-cli download {dataset_name} --repo-type dataset --local-dir $DATASET_PATH/{dataset_name}

# Using Modelscope
modelscope download --dataset {dataset_name} --local_dir $DATASET_PATH/{dataset_name}

For more details about dataset downloading, please refer to the Hugging Face or ModelScope documentation.
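
If you script these downloads, a small helper can assemble the same commands shown in this step. The sketch below only builds argument lists that mirror those commands; the function name `build_download_command` and its parameters are hypothetical, and nothing is executed unless you pass the result to `subprocess.run` yourself.

```python
import os


def build_download_command(hub, name, root, is_dataset=False):
    """Return the CLI argument list for downloading `name` into root/name.

    hub: "huggingface" or "modelscope" (mirrors the commands in this step).
    """
    local_dir = os.path.join(root, name)
    if hub == "huggingface":
        cmd = ["huggingface-cli", "download", name]
        if is_dataset:
            cmd += ["--repo-type", "dataset"]
        cmd += ["--local-dir", local_dir]
    elif hub == "modelscope":
        cmd = ["modelscope", "download"]
        cmd += ["--dataset", name] if is_dataset else [name]
        cmd += ["--local_dir", local_dir]
    else:
        raise ValueError(f"unknown hub: {hub!r}")
    return cmd


if __name__ == "__main__":
    # Placeholder names; substitute your own model and dataset.
    print(build_download_command("huggingface", "some-org/some-model", "/data/models"))
    print(build_download_command("modelscope", "some-org/some-dataset", "/data/datasets",
                                 is_dataset=True))
    # To actually run a command: subprocess.run(cmd, check=True)
```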

Step 3: configurations

For convenience, Trinity-RFT provides a web interface for configuring your RFT process.

Note

This is an experimental feature. We will continue to improve it and make it more user-friendly.

trinity studio --port 8080

Then you can configure your RFT process in the web page and generate a config file. You can save the config for later use or run it directly as described in the following section.

For advanced users, you can also manually configure your RFT process by editing the config file. We provide a set of example config files in examples.

Step 4: run the RFT process

First, start a Ray cluster with the following commands:

# On master node
ray start --head

# On worker nodes
ray start --address=<master_address>

Optionally, you can log in to wandb to better monitor the RFT process:

export WANDB_API_KEY=<your_api_key>
wandb login

Then, for command-line users, run the RFT process with the following command:

trinity run --config <config_path>

For example, below is the command for fine-tuning Qwen-2.5-1.5B-Instruct on the GSM8k dataset using the GRPO algorithm:

trinity run --config examples/grpo_gsm8k/gsm8k.yaml
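
For readers unfamiliar with GRPO, its central idea is to score each sampled response relative to the other responses for the same prompt, rather than against a learned value function. The sketch below shows that group-relative normalization only. It is a simplified illustration, not Trinity-RFT's implementation: the function name, the use of the population standard deviation, and the `eps` constant are assumptions.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so
    responses that beat the group average get positive advantages.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std (an assumption)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled answers to one math problem: 1.0 = correct, 0.0 = wrong.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
    # If every answer gets the same reward, no response is preferred.
    print(group_relative_advantages([1.0, 1.0, 1.0, 1.0]))
```

With rule-based rewards such as exact-match checking on GSM8k answers, this normalization gives a usable learning signal without training a separate reward or value model.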

For studio users, just click the "Run" button in the web page.

For more detailed examples about how to use Trinity-RFT, please refer to the following tutorials:

Advanced usage and full configurations

Please refer to this document.

Programming guide for developers

Please refer to this document.

Contribution guide

This project is currently under active development, and we welcome contributions from the community!

Code style check:

pre-commit run --all-files

Unit tests:

python -m pytest tests

Acknowledgements

This project is built upon many excellent open-source projects, including:

verl and PyTorch's FSDP for LLM training;
vLLM for LLM inference;
Data-Juicer for data processing pipelines;
AgentScope for agentic workflow;
Ray for distributed systems;
We have also drawn inspiration from RL frameworks like OpenRLHF, TRL and ChatLearn;
......

Citation

@misc{Trinity-RFT,
  title={Trinity-RFT},
  author={{Trinity-RFT Team}},
  url={https://github.com/modelscope/trinity-rft},
  year={2025},
}
