GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning.

Python 558 11 Updated Jul 4, 2025

Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052

C++ 474 37 Updated Mar 15, 2024

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,769 208 Updated Jun 20, 2025

A free and open alternative to OpenAI Deep Research

Python 626 82 Updated Feb 18, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,924 1,554 Updated Jul 4, 2025

Transformer related optimization, including BERT, GPT

C++ 6,227 909 Updated Mar 27, 2024

Running BERT without Padding

C++ 472 54 Updated Mar 18, 2022

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 9,424 1,603 Updated Jul 3, 2025

FlashMLA: Efficient MLA decoding kernels

Cuda 11,640 873 Updated Apr 29, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,245 834 Updated Jul 4, 2025

Fast and memory-efficient exact attention

Python 18,185 1,779 Updated Jul 3, 2025

Fully open reproduction of DeepSeek-R1

Python 24,959 2,320 Updated Jul 3, 2025

Mixture-of-Experts for Large Vision-Language Models

Python 2,183 138 Updated Dec 3, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 38,807 4,724 Updated Jun 2, 2025

Puzzles for learning Triton

Jupyter Notebook 1,739 138 Updated Nov 18, 2024

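Similarities: a toolkit for similarity calculation and semantic search. Supports text-to-text, text-to-image, and image-to-image search over hundreds of millions of records; written in Python 3 and usable out of the box.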

Python 858 86 Updated Oct 29, 2024

CLIP as a service - Embed image and sentences, object recognition, visual reasoning, image classification and reverse image search

Jupyter Notebook 65 4 Updated Jan 15, 2024

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,888 522 Updated Apr 11, 2025

Large Language Model Text Generation Inference

Python 10,287 1,206 Updated Jul 3, 2025

Text-to-video and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,647 1,126 Updated Jun 17, 2025

LLM inference in C/C++

C++ 82,547 12,256 Updated Jul 4, 2025

Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 22,345 1,511 Updated Jun 26, 2025

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 6,659 564 Updated Jul 4, 2025

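V2rayU: a macOS client built on the v2ray core for circumventing network restrictions, written in Swift. Supports the trojan, vmess, shadowsocks, and socks5 protocols, subscriptions, import via QR code, clipboard, or manual configuration, and sharing via QR code.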

19,436 2,936 Updated Jun 20, 2025

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 58,915 5,849 Updated Jul 4, 2025

Fine-tuning of the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models on specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more

Python 2,749 313 Updated Dec 12, 2023

A fine-tuning approach based on ChatGLM-6B + LoRA

Python 3,763 445 Updated Nov 25, 2023

SuperCLUE: A Comprehensive Benchmark for General-Purpose Foundation Models in Chinese

3,218 108 Updated Apr 28, 2025