Stars
High performance build system for Windows, OSX and Linux. Supporting caching, network distribution and more.
Official Implementation of Dynamic erf (Derf).
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.
Hierarchical Reasoning Model Official Release
On-device AI across mobile, embedded and edge for PyTorch
A simple guide that explains the steps from training a simple PyTorch image classifier to converting the generated neural network into a CoreML model ready for production
A C++ Compute/Graphics Library and Toolchain enabling same-source CUDA/Host/Metal/OpenCL/Vulkan C++ programming and execution.
Noise suppression using deep filtering
Fast and memory-efficient exact attention
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
A repository of AI-enhanced audio plugins written using C++, JUCE, libtorch, RTNeural and other libraries. Supports my book
AlphaFold Meets Flow Matching for Generating Protein Ensembles
PyTorch implementation of MeanFlow on ImageNet and CIFAR10
🤗 smolagents: a barebones library for agents that think in code.
Warped cosine-modulated filter bank design
Windows, Linux and Mac. Fully accelerated on CUDA and MPS
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Encode and decode audio samples to/from continuous and discrete compressed representations!
Code of WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool

