Stars
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
TripoSR: Fast 3D Object Reconstruction from a Single Image
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Effortless AI-assisted data labeling with support from YOLO, Segment Anything (SAM and SAM2), and MobileSAM.
[CAAI AIR'24] Bilateral Reference for High-Resolution Dichotomous Image Segmentation
[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
Ray tracing and hybrid rasterization of Gaussian particles
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
A ComfyUI custom node designed for advanced image background removal and object, face, clothes, and fashion segmentation, utilizing multiple models including RMBG-2.0, INSPYRENET, BEN, BEN2, BiRefN…
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
This repo contains the projects 'Virtual Normal', 'DiverseDepth', and '3D Scene Shape'. They address monocular depth estimation and 3D scene reconstruction from a single image.
Universal Monocular Metric Depth Estimation
[ECCV 2024] PowerPaint, a versatile image inpainting model that supports text-guided object inpainting, object removal, image outpainting and shape-guided object inpainting with only a single model…
[ICCV2023] Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement
SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images
Official implementation of the paper "LangSplat: 3D Language Gaussian Splatting" [CVPR2024 Highlight]
[TVCG2024] PGSR: Planar-based Gaussian Splatting for Efficient and High-Fidelity Surface Reconstruction
LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨ (ICCV 2025 Highlight)