Starred repositories
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
My learning notes and code for ML systems (MLSys).
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
A Redis server and distributed cluster implemented in Go.
SGLang is a fast serving framework for large language models and vision language models.
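Hand-written interview questions (not LeetCode) for AI algorithms, focusing on LLMs and also covering search, advertising, and recommendation, for example Self-Attention and AUC. These generally test a candidate's overall ability more than LeetCode does, and sit closer to business practice and fundamentals.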
A high-throughput and memory-efficient inference and serving engine for LLMs
A framework for few-shot evaluation of language models.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
High-speed Large Language Model Serving for Local Deployment
Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.
A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
How to optimize some algorithms in CUDA.
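"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version in translation.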
[AAAI 2024] DiffusionTrack: Diffusion Model For Multi-Object Tracking. DiffusionTrack is the first work to employ the diffusion model for multi-object tracking by formulating it as a generative noi…
The road to hack SysML and become a system expert
👩🏿💻👨🏾💻👩🏼💻👨🏽💻👩🏻💻 A list of projects by independent developers in China -- sharing what everyone is working on
Online CUDA Occupancy Calculator
🦜🔗 Build context-aware reasoning applications
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head