ZhenbinChan

🏠

Working

Zhenbin Chen ZhenbinChan

I am Zhenbin Chen, a PhD student at ShanghaiTech University. My research interests are LLMs and RL.

13 followers · 12 following

ShanghaiTech University
Foshan, Guangdong
https://www.zhihu.com/people/chen-zhen-bin-88
https://scholar.google.com/citations?user=B-9QFwIAAAAJ&hl=zh-CN

Achievements

Highlights

Pro

Lists (1)

Speculative Decoding

Starred repositories

pengsida / learning_research

My research experience

9,293 stars · 506 forks · Updated Dec 12, 2025

THUDM / TreeRL

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python · 84 stars · 6 forks · Updated Jun 16, 2025

More2Search / Awesome-Search-LLM

18 stars · 1 fork · Updated Oct 17, 2025

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Model

25,859 stars · 2,226 forks · Updated Jul 31, 2025

1ring2rta / MCTS-GRPO

Policy Optimization is awesome, let’s put a tree on it! 🌲🌟

Python · 22 stars · 1 fork · Updated Jul 4, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 66,131 stars · 12,176 forks · Updated Dec 25, 2025

smiles724 / DeepSearch

This is the official code of DeepSearch paper!

Python · 15 stars · Updated Oct 22, 2025

princeton-pli / RLMT

[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"

Python · 122 stars · 7 forks · Updated Oct 27, 2025

AMAP-ML / Tree-GRPO

Tree Search for LLM Agent Reinforcement Learning

Python · 256 stars · 23 forks · Updated Sep 29, 2025

fishjar / kiss-translator

A simple, open source bilingual translation extension & Greasemonkey script

JavaScript · 8,401 stars · 348 forks · Updated Dec 25, 2025

agentica-project / rllm

Jupyter Notebook · 303 stars · 27 forks · Updated Sep 17, 2025

Chen-GX / SEER

Python · 14 stars · 3 forks · Updated Feb 10, 2025

yuanzhoulvpi2017 / vscode_debug_transformers

Python · 405 stars · 36 forks · Updated Feb 10, 2025

stanford-cs336 / spring2025-lectures

Python · 2,277 stars · 492 forks · Updated Dec 3, 2025

mukhal / ThinkPRM

Process Reward Models That Think

Python · 67 stars · 5 forks · Updated Nov 29, 2025

ZhenbinChan / verl

VERL visualization, PRM, LLM-as-a-Judge

Python · 9 stars · Updated Dec 11, 2025

openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python · 1,832 stars · 135 forks · Updated Jan 17, 2025

shy-robin / shy-nvim

Personal Neovim configuration (based on LazyVim)

Lua · 8 stars · 4 forks · Updated Dec 11, 2025

yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)

Jupyter Notebook · 3,737 stars · 444 forks · Updated Aug 5, 2025

PRIME-RL / ImplicitPRM

Repo of paper "Free Process Rewards without Process Labels"

Python · 168 stars · 11 forks · Updated Mar 14, 2025

Gen-Verse / ReasonFlux

[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)

Python · 511 stars · 36 forks · Updated Sep 27, 2025

amazon-science / street-reasoning

STREET: a multi-task and multi-step reasoning dataset

Python · 24 stars · 4 forks · Updated Feb 28, 2024

lgw863 / LogiQA-dataset

138 stars · 13 forks · Updated Nov 25, 2020

musistudio / claude-code-router

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript · 23,974 stars · 1,891 forks · Updated Dec 18, 2025

anthropics / claude-code

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell · 48,418 stars · 3,409 forks · Updated Dec 20, 2025

zebbern / claude-code-guide

Full guide on claude tips and tricks and how you can optimise your claude code the best & strive to find every command possible even hidden ones!

2,796 stars · 246 forks · Updated Dec 25, 2025

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python · 1,785 stars · 99 forks · Updated Mar 18, 2025

LazyVim / LazyVim

Neovim config for the lazy

Lua · 24,352 stars · 1,694 forks · Updated Nov 11, 2025

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python · 1,490 stars · 108 forks · Updated Apr 24, 2025

LazyVim / starter

Starter template for LazyVim

Lua · 1,674 stars · 1,348 forks · Updated Dec 11, 2024

Starred topics

Natural language processing