Starred repositories

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 3,486 292 Updated Jul 2, 2025

My learning notes/codes for ML SYS.

Python 2,720 169 Updated Jul 1, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,910 1,540 Updated Jul 2, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,791 2,213 Updated Jun 26, 2025

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

C++ 17,070 3,307 Updated Jul 2, 2025

A Redis server and distributed cluster implemented in Go.

Go 3,709 593 Updated Jun 1, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 15,621 2,235 Updated Jul 2, 2025

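Hand-written interview questions for LLMs (the main focus) and for search, advertising, and recommendation AI algorithms (not LeetCode), such as Self-Attention and AUC. These generally test a candidate's overall ability more than LeetCode does, and sit closer to business needs and fundamentals.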

Jupyter Notebook 301 16 Updated Dec 29, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 51,186 8,439 Updated Jul 2, 2025

A framework for few-shot evaluation of language models.

Python 9,422 2,496 Updated Jun 30, 2025

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 6,004 558 Updated Apr 11, 2025

Learning about CUDA by writing PTX code.

Python 133 4 Updated Feb 27, 2024

High-speed Large Language Model Serving for Local Deployment

C++ 8,226 434 Updated Feb 19, 2025

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

Cuda 434 80 Updated Sep 8, 2024

A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.

1,107 84 Updated Dec 22, 2024

How to optimize some algorithms in CUDA.

Cuda 2,292 209 Updated Jun 27, 2025

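"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version in translation.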

Java 113,939 14,127 Updated Jun 12, 2025

[AAAI 2024] DiffusionTrack: Diffusion Model For Multi-Object Tracking. DiffusionTrack is the first work to employ the diffusion model for multi-object tracking by formulating it as a generative noi…

Python 189 10 Updated Jul 17, 2024

The road to hack SysML and become a system expert

Emacs Lisp 489 61 Updated Sep 25, 2024

👩🏿‍💻👨🏾‍💻👩🏼‍💻👨🏽‍💻👩🏻‍💻 A list of projects by independent developers in China -- sharing what everyone is working on

39,518 3,267 Updated Jul 1, 2025

Inference Llama 2 in one file of pure C

C 18,510 2,289 Updated Aug 6, 2024

Online CUDA Occupancy Calculator

CoffeeScript 77 12 Updated Oct 12, 2021

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 110,549 17,975 Updated Jul 1, 2025

Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 24,417 1,676 Updated Jul 1, 2025
Jupyter Notebook 1,030 99 Updated May 29, 2023

Compiler for Dynamic Neural Networks

Python 46 2 Updated Nov 13, 2023

The Modular Platform (includes MAX & Mojo)

Mojo 24,422 2,646 Updated Jul 2, 2025

A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)

9,964 768 Updated May 31, 2024

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python 10,171 863 Updated Jul 6, 2024

Fantastic toolkit for CTFers and everyone.

Vue 887 71 Updated Jan 1, 2025