Starred repositories
Distributed Compiler based on Triton for Parallel Systems
🗂 The perfect Front-End Checklist for modern websites and meticulous developers
slime is an LLM post-training framework for RL Scaling.
The AI coding agent built for the terminal.
MiroRL is an MCP-first reinforcement learning framework for deep research agents.
Democratizing Reinforcement Learning for LLMs
A curated list of awesome commands, files, and workflows for Claude Code
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
SkyRL: A Modular Full-stack RL Library for LLMs
Accelerate LLM preference tuning via prefix sharing with a single line of code
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
LLM training code for Databricks foundation models
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
ByteCheckpoint: A Unified Checkpointing Library for LFMs
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
12 Lessons to Get Started Building AI Agents
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
A simple and well styled PPO implementation. Based on my Medium series: https://medium.com/@eyyu/coding-ppo-from-scratch-with-pytorch-part-1-4-613dfc1b14c8.
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
FlashMLA: Efficient Multi-head Latent Attention Kernels
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation