ZhenbinChan
🏠
Working

Highlights

  • Pro

Starred repositories

My research experience

9,293 506 Updated Dec 12, 2025

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25

Python 84 6 Updated Jun 16, 2025

Awesome-LLM: a curated list of Large Language Model

25,859 2,226 Updated Jul 31, 2025

Policy Optimization is awesome, let’s put a tree on it! 🌲🌟

Python 22 1 Updated Jul 4, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,131 12,176 Updated Dec 25, 2025

This is the official code of DeepSearch paper!

Python 15 Updated Oct 22, 2025

[R]einforcement [L]earning from [M]odel-rewarded [T]hinking - code for the paper "Language Models That Think, Chat Better"

Python 122 7 Updated Oct 27, 2025

Tree Search for LLM Agent Reinforcement Learning

Python 256 23 Updated Sep 29, 2025

A simple, open-source bilingual translation extension & Greasemonkey script

JavaScript 8,401 348 Updated Dec 25, 2025

Jupyter Notebook 303 27 Updated Sep 17, 2025

Python 14 3 Updated Feb 10, 2025

Process Reward Models That Think

Python 67 5 Updated Nov 29, 2025

VERL visualization, PRM, LLM-as-a-Judge

Python 9 Updated Dec 11, 2025

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,832 135 Updated Jan 17, 2025

Personal Neovim configuration (based on LazyVim)

Lua 8 4 Updated Dec 11, 2025

Chinese NLP solutions (large language models, data, models, training, inference)

Jupyter Notebook 3,737 444 Updated Aug 5, 2025

Repo of paper "Free Process Rewards without Process Labels"

Python 168 11 Updated Mar 14, 2025

[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)

Python 511 36 Updated Sep 27, 2025

STREET: a multi-task and multi-step reasoning dataset

Python 24 4 Updated Feb 28, 2024

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 23,974 1,891 Updated Dec 18, 2025

Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…

Shell 48,418 3,409 Updated Dec 20, 2025

A full guide to Claude tips and tricks: how to get the most out of Claude Code and find every available command, including hidden ones!

2,796 246 Updated Dec 25, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,785 99 Updated Mar 18, 2025

Neovim config for the lazy

Lua 24,352 1,694 Updated Nov 11, 2025

Recipes to train reward model for RLHF.

Python 1,490 108 Updated Apr 24, 2025

Starter template for LazyVim

Lua 1,674 1,348 Updated Dec 11, 2024