jianzhu

🎯

Focusing

steve jianzhu

87 followers · 20 following

Beijing, China

Stars

karpathy / nanochat

The best ChatGPT that $100 can buy.

Python · 37,671 stars · 4,621 forks · Updated Nov 17, 2025

Continual-Intelligence / SEAL

Self-Adapting Language Models

Python · 1,556 stars · 275 forks · Updated Aug 1, 2025

wizard-III / Archer2.0

Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature convergence and unlock greater RL potential.

Python · 22 stars · 2 forks · Updated Oct 10, 2025

AlmondGod / tinyworlds

A minimal implementation of DeepMind's Genie world model

Python · 1,040 stars · 77 forks · Updated Nov 22, 2025

X-PLUG / MobileAgent

Mobile-Agent: The Powerful GUI Agent Family

Python · 6,430 stars · 651 forks · Updated Nov 26, 2025

anordin95 / a-conceptual-overview-of-asyncio

Python · 220 stars · 6 forks · Updated Oct 17, 2025

MikeWangWZHL / PAPO

Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"

Python · 101 stars · 5 forks · Updated Aug 26, 2025

dvlab-research / ARPO

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python · 139 stars · 8 forks · Updated May 29, 2025

yfzhang114 / r1_reward

✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python · 270 stars · 22 forks · Updated May 9, 2025

MoonshotAI / Kimi-VL

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

1,119 stars · 60 forks · Updated Jul 15, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python · 16,774 stars · 2,668 forks · Updated Nov 27, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python · 25,689 stars · 2,402 forks · Updated Nov 24, 2025

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python · 1,160 stars · 54 forks · Updated Aug 27, 2025

lukDev / awr_pytorch

PyTorch implementation of AWR.

Python · 4 stars · 1 fork · Updated Apr 29, 2020

NVlabs / COAT

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python · 250 stars · 22 forks · Updated Aug 9, 2025

DigiRL-agent / digiq

Python · 117 stars · 8 forks · Updated Apr 8, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,932 stars · 286 forks · Updated May 15, 2025

simplescaling / s1

s1: Simple test-time scaling

Python · 6,607 stars · 763 forks · Updated Jun 25, 2025

deepseek-ai / DeepSeek-R1

91,519 stars · 11,773 forks · Updated Jun 27, 2025

LeslieTrue / SFTvsRL

Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Python · 310 stars · 17 forks · Updated Apr 28, 2025

deepseek-ai / DeepSeek-V3

Python · 100,417 stars · 16,368 forks · Updated Aug 28, 2025

princeton-nlp / LESS

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook · 506 stars · 44 forks · Updated Oct 20, 2024

karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python · 50,222 stars · 8,397 forks · Updated Nov 12, 2025

nikhilvyas / SOAP

Python · 225 stars · 16 forks · Updated Dec 2, 2024

MARIO-Math-Reasoning / Super_MARIO

Python · 341 stars · 21 forks · Updated Jun 5, 2025

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,852 stars · 371 forks · Updated Oct 17, 2025

sangmichaelxie / doremi

PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets

HTML · 347 stars · 35 forks · Updated Dec 26, 2023

zhentingqi / rStar

Python · 966 stars · 111 forks · Updated Jan 23, 2025

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python · 3,480 stars · 299 forks · Updated Nov 5, 2024

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook · 22,718 stars · 2,503 forks · Updated Mar 13, 2025
