
- Qualcomm
- San Diego, CA, USA
- (UTC -07:00)
- in/hongqiang
Starred repositories
Distributed MoE in a Single Kernel [NeurIPS '25]
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
CodeLinaro / llama.cpp
Forked from ggml-org/llama.cpp. LLM inference in C/C++
Real-time webcam demo with SmolVLM and llama.cpp server
Beignet is an open-source implementation of the OpenCL specification - a generic compute-oriented API. Here is the Beignet source code mirror on GitHub. This is a publish-only repository and all pull r…
Compute Benchmarks for oneAPI Level Zero and OpenCL™ Driver
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
Print all known information about all available OpenCL platforms and devices in the system
A comprehensive 10-page probability cheatsheet that covers a semester's worth of introduction to probability.
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
lightweight, standalone C++ inference engine for Google's Gemma models.
Notes about "Attention is all you need" video (https://www.youtube.com/watch?v=bCz4OMemCcA)
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Llama Chinese community: continuously aggregating the latest Llama learning resources and building the best open-source Chinese Llama LLM ecosystem; fully open source and free for commercial use
LlamaIndex is the leading framework for building LLM-powered agents over your data.
A curated list of awesome computer vision resources
A profiler to disclose and quantify hardware features on GPUs.