The best ChatGPT that $100 can buy.

Python 36,982 4,472 Updated Nov 17, 2025

Self-Adapting Language Models

Python 1,523 262 Updated Aug 1, 2025

Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature convergence and unlock greater RL potential.

Python 20 1 Updated Oct 10, 2025

A minimal implementation of DeepMind's Genie world model

Python 1,029 77 Updated Nov 8, 2025

Mobile-Agent: The Powerful GUI Agent Family

Python 6,267 630 Updated Nov 14, 2025

Official repo for "PAPO: Perception-Aware Policy Optimization for Multimodal Reasoning"

Python 96 5 Updated Aug 26, 2025

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 137 8 Updated May 29, 2025

✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Python 270 22 Updated May 9, 2025

Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities

1,106 65 Updated Jul 15, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 15,997 2,571 Updated Nov 18, 2025

Fully open reproduction of DeepSeek-R1

Python 25,648 2,402 Updated Sep 8, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 1,156 54 Updated Aug 27, 2025

PyTorch implementation of AWR.

Python 4 1 Updated Apr 29, 2020

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python 246 22 Updated Aug 9, 2025
Python 116 8 Updated Apr 8, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,926 287 Updated May 15, 2025

s1: Simple test-time scaling

Python 6,597 762 Updated Jun 25, 2025

Official implementation of paper: SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Python 310 17 Updated Apr 28, 2025

[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning

Jupyter Notebook 503 44 Updated Oct 20, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 49,704 8,330 Updated Nov 12, 2025
Python 223 16 Updated Dec 2, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,843 371 Updated Oct 17, 2025

PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets

HTML 347 35 Updated Dec 26, 2023
Python 963 110 Updated Jan 23, 2025

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output.

Python 3,430 294 Updated Nov 5, 2024

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 22,680 2,498 Updated Mar 13, 2025