Stars
Universal LLM Deployment Engine with ML Compilation
Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini
Ongoing research training transformer models at scale
Learning Deep Representations of Data Distributions
Notebooks using the Hugging Face libraries 🤗
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Minimalistic large language model 3D-parallelism training
Minimalistic 4D-parallelism distributed training framework for educational purposes
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
SGLang is a fast serving framework for large language models and vision language models.
A high-throughput and memory-efficient inference and serving engine for LLMs
Utilities intended for use with Llama models.
Home for "How To Scale Your Model", a short blog-style textbook about scaling LLMs on TPUs
Development repo for Think Python 3rd edition
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
A PyTorch reimplementation of Andrej Karpathy's RL agent that learns to play Pong through trial and error
Awesome Reasoning LLM Tutorial/Survey/Guide