Yujing Sun1,2, * | Lingchen Sun1,2, * | Shuaizheng Liu1,2 | Rongyuan Wu1,2 | Zhengqiang Zhang1,2 | Lei Zhang1,2
1The Hong Kong Polytechnic University, 2OPPO Research Institute
- 2025.10.16: We release an improved version of DLoRAL. Thanks @Feynman1999 for the bug fixes!
- 2025.09.18: DLoRAL is accepted to NeurIPS 2025 🎉
- 2025.07.14: Colab demo is available. ✨ No local GPU or setup needed - just upload and enhance!
- 2025.07.08: The inference code and pretrained weights are available.
- 2025.06.24: The project page is available, including a brief 2-minute explanation video, more visual results, and related research.
- 2025.06.17: The repo is released.
⭐ If DLoRAL is helpful to your videos or projects, please help star this repo. Thanks! 🤗
😊 You may also want to check out our related works:
- OSEDiff (NeurIPS 2024) Paper | Code
  A real-time image SR algorithm that has been applied to the OPPO Find X8 series.
- PiSA-SR (CVPR 2025) Paper | Code
  A pioneering exploration of the dual-LoRA paradigm in image SR.
- TVT-SR (ICCV 2025) Paper | Code
  A compact VAE and compute-efficient UNet able to handle fine-grained structures.
- Awesome Diffusion Models for Video Super-Resolution Repo
  A curated list of resources for Video Super-Resolution (VSR) using diffusion models.
- Release inference code.
- Colab demo for convenient testing.
- Release training code.
- Release training data.
Training: A dynamic dual-stage training scheme alternates between optimizing temporal coherence (consistency stage) and refining high-frequency spatial details (enhancement stage) with smooth loss interpolation to ensure stability.
Inference: Both C-LoRA and D-LoRA are merged into the frozen diffusion UNet, enabling one-step enhancement of low-quality inputs into high-quality outputs.
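Conceptually, merging a LoRA adapter folds its low-rank update into the base weight, so inference pays no extra cost. Below is a minimal sketch of that operation, assuming the standard LoRA parameterization; the function and tensor names are ours for illustration, not the repo's actual code:

```python
# Minimal sketch of folding LoRA adapters into a frozen base weight.
# Illustration only: this follows the standard LoRA merge recipe,
# not DLoRAL's actual implementation.
import torch

def merge_lora(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
               alpha: float, rank: int) -> torch.Tensor:
    # W: (out_dim, in_dim) frozen base weight
    # A: (rank, in_dim), B: (out_dim, rank) trained low-rank factors
    return W + (alpha / rank) * (B @ A)

# Folding C-LoRA and then D-LoRA into the same weight leaves a single
# plain UNet layer, so enhancement is a single forward pass:
# W_merged = merge_lora(merge_lora(W, A_c, B_c, alpha, r), A_d, B_d, alpha, r)
```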
- Clone repo

  ```bash
  git clone https://github.com/yjsunnn/DLoRAL.git
  cd DLoRAL
  ```

- Install dependent packages

  ```bash
  conda create -n DLoRAL python=3.10 -y
  conda activate DLoRAL
  pip install -r requirements.txt

  # mim install mmedit and mmcv
  pip install openmim
  mim install mmcv-full
  pip install mmedit
  ```

- Download Models
- RAM --> put into /path/to/DLoRAL/preset/models/ram_swin_large_14m.pth
- DAPE --> put into /path/to/DLoRAL/preset/models/DAPE.pth
- Pretrained Weights --> put into /path/to/DLoRAL/preset/models/checkpoints/model.pkl
- If your goal is to reproduce the results from the paper, we recommend using this version of the weights instead.
Each path can be changed to suit your own setup; apply the corresponding changes to the command line and the code as well.
For Real-World Video Super-Resolution:

```bash
python src/test_DLoRAL.py \
    --pretrained_model_path stabilityai/stable-diffusion-2-1-base \
    --ram_ft_path /path/to/DLoRAL/preset/models/DAPE.pth \
    --ram_path '/path/to/DLoRAL/preset/models/ram_swin_large_14m.pth' \
    --merge_and_unload_lora False \
    --process_size 512 \
    --pretrained_model_name_or_path stabilityai/stable-diffusion-2-1-base \
    --vae_encoder_tiled_size 4096 \
    --load_cfr \
    --pretrained_path /path/to/DLoRAL/preset/models/checkpoints/model.pkl \
    --stages 1 \
    -i /path/to/input_videos/ \
    -o /path/to/results
```
For Training:

```bash
bash train_scripts.sh
```
Some key parameters and their meanings:
| Param | Description | Example Value |
|---|---|---|
| `--quality_iter` | Number of steps before the initial switch from the consistency stage to the quality stage | 5000 |
| `--quality_iter_1_final` | Number of steps at which training switches from the quality stage back to the consistency stage | 13000 |
| `--quality_iter_2` | Relative number of steps after `quality_iter_1_final` at which training switches back to the quality stage (the actual switch happens at `quality_iter_1_final + quality_iter_2`) | 5000 |
| `--lsdir_txt_path` | Dataset path for the first stage | "/path/to/your/dataset" |
| `--pexel_txt_path` | Dataset path for the second stage | "/path/to/your/dataset" |
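Read together, these thresholds imply the alternating dual-stage schedule described above. Here is a minimal sketch of how they could map a global training step to the active stage; this is our interpretation for illustration, not the repo's actual training loop:

```python
# Minimal sketch of the dual-stage switching schedule implied by the
# parameters above; an assumption for illustration, not DLoRAL's code.
def active_stage(step: int,
                 quality_iter: int = 5000,
                 quality_iter_1_final: int = 13000,
                 quality_iter_2: int = 5000) -> str:
    if step < quality_iter:
        return "consistency"  # initial temporal-coherence stage
    if step < quality_iter_1_final:
        return "quality"      # first detail-enhancement stage
    if step < quality_iter_1_final + quality_iter_2:
        return "consistency"  # switch back to consistency
    return "quality"          # final quality stage

# e.g. active_stage(4999) -> "consistency"; active_stage(5000) -> "quality"
```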
If you have any problems (not only about DLoRAL, but also questions regarding burst/video super-resolution), please feel free to contact me at [email protected]
If our code helps your research or work, please consider citing our paper. The following is a BibTeX reference:
```bibtex
@misc{sun2025onestepdiffusiondetailrichtemporally,
      title={One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution},
      author={Yujing Sun and Lingchen Sun and Shuaizheng Liu and Rongyuan Wu and Zhengqiang Zhang and Lei Zhang},
      year={2025},
      eprint={2506.15591},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.15591},
}
```