Stars
Stable Diffusion web UI
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
An open source implementation of CLIP.
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Style transfer, deep learning, feature transform
🐍 Geometric Computer Vision Library for Spatial AI
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Large World Model -- Modeling Text and Video with Millions Context
Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
[AAAI 2025] Official implementation of "OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on"
All-in-One Development Tool based on PaddlePaddle
Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network (ECCV 2018)
OpenMMLab Pre-training Toolbox and Benchmark
VMamba: Visual State Space Models; code is based on Mamba
One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more
GANimation: Anatomically-aware Facial Animation from a Single Image (ECCV'18 Oral) [PyTorch]
yolov5 + csl_label (Oriented Object Detection, Rotation Detection, Rotated BBox). Rotated object detection based on yolov5.
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
[CVPR'25 Oral] MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
