Starred repositories
A latent text-to-image diffusion model
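"A Guide to Using Open-Source Large Models": a tutorial tailored for Chinese users on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment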
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍; the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!
CoTracker is a model for tracking any point (pixel) on a video.
GTSAM is a library of C++ classes that implement smoothing and mapping (SAM) in robotics and vision, using factor graphs and Bayes networks as the underlying computing paradigm rather than sparse m…
A Modular Framework for 3D Gaussian Splatting and Beyond
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", SIGGRAPH Asia 2024
[ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
[ECCV2024] This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…
Implementation of the paper "DeepLSD: Line Segment Detection and Refinement with Deep Image Gradients"
An inference and training framework for multiple image input in Flux Kontext dev
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
Python binding to Skia Graphics Library
Better, faster, peering over yonder. Improved large scale object detection in aerial/satellite imagery.
[ICCV2025] CAD-Recode: Reverse Engineering CAD Code from Point Clouds
Cloud-native vector similarity search and storage with efficient, serverless scale-out
Train a model to classify different fonts using self-generated training data

