ModelTC repositories

LightLLM

Public

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

nlp deep-learning llama gpt model-serving llm openai-triton

Python

•

Apache License 2.0

•283•3.8k•81•34•Updated

Nov 26, 2025

LightX2V

Public

Light Video Generation Inference Framework

video-generation diffusion-models wan-video auto-regressive-diffusion-model

Python

•55•921•57•0•Updated

Nov 26, 2025

lightllm-blog

Public

SCSS

•

MIT License

•1•1•0•0•Updated

Nov 26, 2025

LightMem

Public

Apache License 2.0

•0•0•0•1•Updated

Nov 26, 2025

LightKernel

Public

HTML

•

Apache License 2.0

•0•3•0•0•Updated

Nov 26, 2025

ComfyUI-Lightx2vWrapper

Public

ComfyUI custom node for lightx2v

comfyui comfyui-nodes

Python

•

MIT License

•7•54•2•0•Updated

Nov 25, 2025

mtc-incremental-bpe

Public

Incremental BPE tokenization for all prefixes

Rust

•

Apache License 2.0

•0•0•0•0•Updated

Nov 25, 2025

general-sam

Public

A general suffix automaton implementation in Rust with Python bindings

Rust

•

Apache License 2.0

•0•8•0•0•Updated

Nov 25, 2025

mtc-token-healing

Public

Token healing implementation in Rust

Rust

•

Apache License 2.0

•0•4•0•0•Updated

Nov 25, 2025

general-sam-py

Public

Python bindings for general-sam and some utilities

Python

•

Apache License 2.0

•0•5•0•0•Updated

Nov 25, 2025

greedy-tokenizer

Public

Greedily tokenize strings with the longest tokens iteratively.

Python

•

Apache License 2.0

•0•0•0•3•Updated

Nov 24, 2025

slime

Public

slime is an LLM post-training framework for RL Scaling.

Python

•

Apache License 2.0

•282•0•0•0•Updated

Nov 20, 2025

modeltc.github.io

Public

HTML

•0•0•0•0•Updated

Nov 19, 2025

LightCompress

Public

[EMNLP 2024 & AAAI 2026] A powerful toolkit for compressing large models including LLM, VLM, and video generation models.

benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm

Python

•

Apache License 2.0

•62•627•41•0•Updated

Nov 19, 2025

lightx2v_examples

Public

•0•0•0•0•Updated

Nov 12, 2025

Wan2.2-Lightning

Public

Wan2.2-Lightning: Speed up wan2.2 model with distillation

Python

•

Apache License 2.0

•1.4k•226•17•0•Updated

Nov 7, 2025

LTX-Video-Q8-Kernels

Public

Python

•15•0•0•0•Updated

Nov 6, 2025

SageAttention-1104

Public

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda

•

Apache License 2.0

•270•0•0•0•Updated

Nov 6, 2025

FlashVSR

Public

Towards Real-Time Diffusion-Based Streaming Video Super-Resolution. An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional decoder.

Python

•

Apache License 2.0

•80•1•0•0•Updated

Nov 5, 2025

ComfyUI-LightVAE

Public

Python

•

Apache License 2.0

•7•33•13•0•Updated

Nov 3, 2025

Qwen-Image-Lightning

Public

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python

•

Apache License 2.0

•39•997•19•0•Updated

Oct 14, 2025

HBP

Public

[NeurIPS 2025] This is the official PyTorch implementation of "Hierarchical Balance Packing: Towards Efficient Supervised Fine-tuning for Long-Context LLM".

Python

•

Apache License 2.0

•0•4•0•0•Updated

Sep 30, 2025

TFMQ-DM

Public

[CVPR 2024 Highlight & TPAMI 2025] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

highlight quantization cvpr ldm diffusion-models tpami post-training-quantization ddim stable-diffusion cvpr2024

Jupyter Notebook

•

Apache License 2.0

•4•109•0•0•Updated

Sep 29, 2025

SageAttention

Public

Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.

Cuda

•

Apache License 2.0

•270•2•0•0•Updated

Aug 16, 2025

fa3

Public

Python

•

BSD 3-Clause "New" or "Revised" License

•1•0•0•0•Updated

Aug 7, 2025

flash-attn-3-build

Public

Dockerfile

•2•0•0•0•Updated

Jul 24, 2025

HarmoniCa

Public

[ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".

acceleration icml dit pixart diffusion-models diffusion-transformer pixart-sigma feature-caching icml-2025

Python

•

Apache License 2.0

•1•44•0•0•Updated

Jul 10, 2025

LightTTS

Public

Light-tts is a lightweight TTS inference framework optimized for CosyVoice2, enabling fast and scalable speech synthesis in Python.

Python

•

Apache License 2.0

•0•11•0•0•Updated

Jun 24, 2025

OmniBal

Public

[ICML 2025] This is the official PyTorch implementation of "OmniBal: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance".

vlm icml-2025

Python

•

Apache License 2.0

•3•26•3•0•Updated

Jun 16, 2025

lightx2v_comfyui_node

Public

•0•0•0•0•Updated

Apr 28, 2025

65 repositories
