Towards Real-Time Diffusion-Based Streaming Video Super-Resolution — An efficient one-step diffusion framework for streaming VSR with locality-constrained sparse attention and a tiny conditional de…

Python 1,042 84 Updated Nov 21, 2025

dvlab-research / DreamOmni2

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'

Python 2,427 203 Updated Oct 20, 2025

sczhou / ProPainter

[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting

Python 6,390 753 Updated Feb 19, 2025

TencentARC / VideoPainter

[SIGGRAPH2025] Official repo for paper "Any-length Video Inpainting and Editing with Plug-and-Play Context Control"

Python 526 33 Updated Apr 8, 2025

TIGER-AI-Lab / AnyV2V

Code and data for "AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks" [TMLR 2024]

Jupyter Notebook 640 48 Updated Oct 29, 2024

QwenLM / Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Jupyter Notebook 16,915 1,397 Updated Nov 28, 2025

yejy53 / Echo-4o

Echo-4o

Jupyter Notebook 314 14 Updated Oct 20, 2025

ignoww / RALU

[arXiv 2025] Upsample What Matters: Region-Adaptive Latent Sampling for Accelerated Diffusion Transformers

Python 49 4 Updated Aug 8, 2025

3587jjh / LSRNA

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models (CVPR 2025)

Python 40 Updated Jun 30, 2025

asgeirtj / system_prompts_leaks

Collection of extracted System Prompts from popular chatbots like ChatGPT, Claude & Gemini

Roff 24,054 3,680 Updated Nov 30, 2025

AIDC-AI / Awesome-Unified-Multimodal-Models

Awesome Unified Multimodal Models

921 28 Updated Aug 17, 2025

multimodal-reasoning-lab / Bagel-Zebra-CoT

https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT

Python 104 6 Updated Nov 1, 2025

AIDC-AI / Ovis-U1

A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.

Python 437 14 Updated Dec 2, 2025

stepfun-ai / NextStep-1

Python 574 16 Updated Nov 10, 2025

ByteDance-Seed / VeOmni

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python 1,375 113 Updated Dec 4, 2025

QwenLM / Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.

Jupyter Notebook 3,825 302 Updated Jun 12, 2025

VectorSpaceLab / OmniGen2

OmniGen2: Exploration to Advanced Multimodal Generation.

Jupyter Notebook 3,954 9 Updated Dec 2, 2025

Nanyang Wang nywang16

Stars

MS-Diffusion / MS-Diffusion

bytedance-fanqie-ai / MOSAIC

Kr1sJFU / iMontage

black-forest-labs / flux2

instantX-research / InstantStyle-Plus

HVision-NKU / StoryDiffusion

HumanAIGC / AnimateAnyone

zllrunning / face-parsing.PyTorch

yakhyo / face-parsing

kandinskylab / kandinsky-5

Doby-Xu / WithAnyone

bcmi / Awesome-Object-Placement

yjsunnn / Awesome-video-super-resolution-diffusion

OpenImagingLab / FlashVSR