
Starred repositories
The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in language modeling.
We perform functional grounding of LLMs' knowledge in BabyAI-Text
MiniCPM4: Ultra-Efficient LLMs on End Devices, achieving a 5x+ speedup on typical end-side chips
[COLM 2024] OpenAgents: An Open Platform for Language Agents in the Wild
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Simple MPI implementation for prototyping or learning
Handwriting Synthesis with RNNs ✏️
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
Minimalistic large language model 3D-parallelism training
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
The simplest, fastest repository for training/finetuning small-sized VLMs.
A nascent multi-agent tool for learning anything the Feynman way (Microsoft AI Agent Hackathon Submission)
Minimalistic 4D-parallelism distributed training framework for educational purposes
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
A TTS model capable of generating ultra-realistic dialogue in one pass.
Implementing DeepSeek R1's GRPO algorithm from scratch
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Official PyTorch implementation of One-Minute Video Generation with Test-Time Training
Understanding R1-Zero-Like Training: A Critical Perspective
SpatialLM: Training Large Language Models for Structured Indoor Modeling
This package contains the original 2012 AlexNet code.
Solve Visual Understanding with Reinforced VLMs
Repository for creating traveling waves that integrate spatial information through time
Official Repo for "TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding" [ACL 2025 oral]