Stars
Multi-task YOLOv5 with detection and segmentation
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Official repository for WaterScenes dataset
triple-Mu / mmyolo
Forked from open-mmlab/mmyolo. OpenMMLab YOLO series toolbox and benchmark
OpenMMLab Detection Toolbox and Benchmark
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
An open source implementation of CLIP.
YOLO-POSE was used for keypoint detection, ByteTrack for tracking, and STGCN for fall and other behavior recognition
YOLOv7-POSE was used for keypoint detection, ByteTrack for tracking, and STGCN for fall and other behavior recognition
Fall Detection using OpenPifPaf's Human Pose Estimation model
A toolbox for skeleton-based action recognition.
The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679 and [ICCV2023] DOT: A Distillation-Oriented Trainer https://openaccess.thecvf.com/content…
nvnnghia / Online-Realtime-Action-Recognition-based-on-OpenPose
Forked from LZQthePlane/Online-Realtime-Action-Recognition-based-on-OpenPose. A skeleton-based real-time online action recognition project that classifies and recognizes actions from frame-wise joints; can be used for safety surveillance.
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
Official implementation for "CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification without Concrete Text Labels" (AAAI 2023)
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
[CVPR 2022] Official code for "RegionCLIP: Region-based Language-Image Pretraining"
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
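The entry above names CLIP's core mechanism: rank candidate captions against an image by cosine similarity of their embeddings. A minimal NumPy sketch of just that ranking step, with toy random embeddings standing in for the real encoder outputs (a real pipeline would produce them with CLIP's image/text encoders, e.g. via the open_clip package starred above; the ~100x logit scale mirrors CLIP's learned temperature):

```python
import numpy as np

def clip_rank(image_emb: np.ndarray, text_embs: np.ndarray) -> np.ndarray:
    """CLIP-style ranking: L2-normalize both sides, take cosine
    similarities, scale, and softmax into caption probabilities."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = 100.0 * (txt @ img)              # temperature-scaled cosine sims
    probs = np.exp(logits - logits.max())     # numerically stable softmax
    return probs / probs.sum()

# Toy demo: three fake caption embeddings, one nudged toward the image.
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(3, 512))
text_embs[1] = image_emb + 0.1 * rng.normal(size=512)  # make caption 1 match
probs = clip_rank(image_emb, text_embs)
print(probs.argmax())  # caption 1 is ranked most relevant
```

The softmax-over-cosine-similarities form is exactly the zero-shot classification recipe CLIP popularized; only the embeddings here are synthetic.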
Code release for our CVPR 2023 paper "Detecting Everything in the Open World: Towards Universal Object Detection".
Cross-Modal Implicit Relation Reasoning and Aligning for Text-to-Image Person Retrieval (CVPR 2023)
Code for Transferable Interactiveness Knowledge for Human-Object Interaction Detection. (CVPR'19, TPAMI'21)
[CVPR2023] MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors