Stars
A course on LLM inference serving on Apple Silicon for systems engineers.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Your ultimate Go microservices framework for the cloud-native era.
We are committed to open-sourcing quantitative knowledge and translating it into Chinese, aiming to bridge the information gap between the domestic and international quantitative finance industries…
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Cost-efficient and pluggable Infrastructure components for GenAI inference
Large Language Model (LLM) Systems Paper List
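《Hello 算法》 (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is in translation.
A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at lower cost, including base models, domain-specific fine-tuned models and applications, datasets, and tutorials.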
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Katalyst aims to provide a universal solution to help improve resource utilization and optimize the overall costs in the cloud. These are the core components of the Katalyst system, including multiple ag…
A unified scheduler for online and offline tasks
Explains complex systems using visuals and simple terms. Helps you prepare for system design interviews.
Easily and securely send things from one computer to another 🐊 📦
🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge.
🐜🐜🐜 ants is the most powerful and reliable pooling solution for Go.
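Provides a practical interactive interface for LLMs such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…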
Build and run Docker containers leveraging NVIDIA GPUs
得意黑 Smiley Sans: a Chinese sans-serif (Heiti) typeface that seeks a balance between humanist feel and geometric features.
A QoS-based scheduling system that brings optimal layout and status to workloads such as microservices, web services, big data jobs, AI jobs, etc.
Papers from the computer science community to read and discuss.
📺 Discover the latest machine learning / AI courses on YouTube.