Stars
🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Processed / Cleaned Data for Paper Copilot
Storing long contexts in tiny caches with self-study
A minimalistic framework for transparently training language models and storing comprehensive checkpoints for in-depth learning dynamics research.
Model Context Protocol Servers
Democratizing Reinforcement Learning for LLMs
Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
Aioli: A unified optimization framework for language model data mixing
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
[ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
[NeurIPS 2024] Simple and Effective Masked Diffusion Language Model
(NeurIPS 2024) AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients. Published in Nature.
Tile primitives for speedy kernels
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).
Triton-based implementation of Sparse Mixture of Experts.
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
This repo contains data and code for the paper "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes"
🚀 Efficient implementations of state-of-the-art linear attention models
Understand and test language model architectures on synthetic tasks.
Building blocks for foundation models.
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models