- ShanghaiTech University
- Foshan, Guangdong
- https://www.zhihu.com/people/chen-zhen-bin-88
- https://scholar.google.com/citations?user=B-9QFwIAAAAJ&hl=zh-CN
Starred repositories
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
Awesome-LLM: a curated list of Large Language Models
Policy Optimization is awesome, let’s put a tree on it! 🌲🌟
A high-throughput and memory-efficient inference and serving engine for LLMs
This is the official code of DeepSearch paper!
[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"
Tree Search for LLM Agent Reinforcement Learning
A simple, open-source bilingual translation extension & Greasemonkey script
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Chinese NLP solutions (large models, data, models, training, inference)
Repo of paper "Free Process Rewards without Process Labels"
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)
STREET: a multi-task and multi-step reasoning dataset
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A full guide to Claude tips and tricks and how to optimise your Claude Code, striving to find every possible command, even hidden ones!
Scalable RL solution for advanced reasoning of language models
Recipes to train reward models for RLHF.
