
One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Yujing Sun1,2, * | Lingchen Sun1,2, * | Shuaizheng Liu1,2 | Rongyuan Wu1,2 | Zhengqiang Zhang1,2 | Lei Zhang1,2

1The Hong Kong Polytechnic University, 2OPPO Research Institute

📍 NeurIPS 2025

Visual Results

⏰ Update

  • 2025.10.16: We released an improved version of DLoRAL. Thanks to @Feynman1999 for the bug fixes!
  • 2025.09.18: DLoRAL was accepted to NeurIPS 2025 🎉
  • 2025.07.14: Colab demo is available. ✨ No local GPU or setup needed - just upload and enhance!
  • 2025.07.08: The inference code and pretrained weights are available.
  • 2025.06.24: The project page is available, including a brief 2-minute explanation video, more visual results, and related research.
  • 2025.06.17: The repo is released.

⭐ If DLoRAL is helpful to your videos or projects, please help star this repo. Thanks! 🤗

😊 You may also want to check our relevant works:

  1. OSEDiff (NeurIPS 2024) Paper | Code

    Real-time Image SR algorithm that has been applied to the OPPO Find X8 series.

  2. PiSA-SR (CVPR2025) Paper | Code

    Pioneering exploration of Dual-LoRA paradigm in Image SR.

  3. TVT-SR (ICCV2025) Paper | Code

    A compact VAE and compute-efficient UNet able to handle fine-grained structures.

  4. Awesome Diffusion Models for Video Super-Resolution Repo

    A curated list of resources for Video Super-Resolution (VSR) using Diffusion Models.

👀 TODO

  • Release inference code.
  • Colab demo for convenient test.
  • Release training code.
  • Release training data.

🌟 Overview Framework

DLoRAL Framework

Training: A dynamic dual-stage training scheme alternates between optimizing temporal coherence (consistency stage) and refining high-frequency spatial details (enhancement stage) with smooth loss interpolation to ensure stability.
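The smooth loss interpolation can be sketched as a linear ramp on the blend weight around each stage switch. This is a toy illustration of the idea only; the function and parameter names are assumptions, not the repo's actual API:

```python
def blended_loss(consistency_loss, enhancement_loss, step, switch_step, ramp_steps):
    """Blend the two stage losses with a weight that ramps linearly
    from 0 to 1 over `ramp_steps` steps after `switch_step`, so the
    training objective changes smoothly rather than abruptly."""
    t = min(max((step - switch_step) / ramp_steps, 0.0), 1.0)
    return (1.0 - t) * consistency_loss + t * enhancement_loss

# Before the switch only the consistency loss is active; halfway
# through the ramp the two losses contribute equally.
before = blended_loss(1.0, 0.0, step=0, switch_step=100, ramp_steps=50)
midway = blended_loss(1.0, 0.0, step=125, switch_step=100, ramp_steps=50)
```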

Inference: During inference, both C-LoRA and D-LoRA are merged into the frozen diffusion UNet, enabling one-step enhancement of low-quality inputs into high-quality outputs.
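The merge step can be illustrated with a minimal NumPy sketch: each LoRA branch is a low-rank update scale * (B @ A) folded into the frozen weight, after which a single matmul replaces base-plus-adapter branches. The `merge_lora` helper, ranks, and scales below are illustrative assumptions, not DLoRAL's actual implementation:

```python
import numpy as np

def merge_lora(W, lora_pairs):
    """Fold low-rank LoRA updates into a frozen weight matrix.

    Each (B, A, scale) pair contributes scale * (B @ A) to W.
    """
    W_merged = W.copy()
    for B, A, scale in lora_pairs:
        W_merged = W_merged + scale * (B @ A)
    return W_merged

# Frozen 4x4 weight with two rank-1 adapters standing in for
# C-LoRA and D-LoRA.
W = np.zeros((4, 4))
c_lora = (np.ones((4, 1)), np.ones((1, 4)), 0.5)
d_lora = (np.ones((4, 1)), np.ones((1, 4)), 0.5)
W_merged = merge_lora(W, [c_lora, d_lora])
```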

🔧 Dependencies and Installation

  1. Clone repo

    git clone https://github.com/yjsunnn/DLoRAL.git
    cd DLoRAL
  2. Install dependent packages

    conda create -n DLoRAL python=3.10 -y
    conda activate DLoRAL
    pip install -r requirements.txt
    # install mmcv-full via openmim, then mmedit via pip
    pip install openmim
    mim install mmcv-full
    pip install mmedit
  3. Download Models

Dependent Models

  • RAM --> put into /path/to/DLoRAL/preset/models/ram_swin_large_14m.pth
  • DAPE --> put into /path/to/DLoRAL/preset/models/DAPE.pth
  • Pretrained Weights --> put into /path/to/DLoRAL/preset/models/checkpoints/model.pkl
    • If your goal is to reproduce the results from the paper, we recommend using this version of the weights instead.

Each path can be changed to suit your setup; if you change one, apply the corresponding change to the command line and the code as well.

🖼️ Quick Inference

For Real-World Video Super-Resolution:

python src/test_DLoRAL.py \
    --pretrained_model_path stabilityai/stable-diffusion-2-1-base \
    --ram_ft_path /path/to/DLoRAL/preset/models/DAPE.pth \
    --ram_path '/path/to/DLoRAL/preset/models/ram_swin_large_14m.pth' \
    --merge_and_unload_lora False \
    --process_size 512 \
    --pretrained_model_name_or_path stabilityai/stable-diffusion-2-1-base \
    --vae_encoder_tiled_size 4096 \
    --load_cfr \
    --pretrained_path /path/to/DLoRAL/preset/models/checkpoints/model.pkl \
    --stages 1 \
    -i /path/to/input_videos/ \
    -o /path/to/results

⚙️ Training

For Real-World Video Super-Resolution:

bash train_scripts.sh

Key parameters and their meanings:

| Param | Description | Example Value |
| --- | --- | --- |
| `--quality_iter` | Step at which training first switches from the consistency stage to the quality stage | 5000 |
| `--quality_iter_1_final` | Step at which training switches from the quality stage back to the consistency stage | 13000 |
| `--quality_iter_2` | Number of steps after `quality_iter_1_final` before switching back to the quality stage (the switch happens at `quality_iter_1_final + quality_iter_2`) | 5000 |
| `--lsdir_txt_path` | Dataset path for the first stage | "/path/to/your/dataset" |
| `--pexel_txt_path` | Dataset path for the second stage | "/path/to/your/dataset" |
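Under the `quality_iter*` thresholds described above, the stage alternation can be sketched as follows (a hypothetical helper for illustration, not code from the repo):

```python
def active_stage(step, quality_iter, quality_iter_1_final, quality_iter_2):
    """Return which training stage is active at a given global step,
    following the quality_iter* thresholds."""
    if step < quality_iter:
        return "consistency"  # initial consistency stage
    if step < quality_iter_1_final:
        return "quality"      # first quality (detail-enhancement) stage
    if step < quality_iter_1_final + quality_iter_2:
        return "consistency"  # switch back to consistency
    return "quality"          # final switch back to quality

# With the example values above (5000, 13000, 5000), the quality
# stage resumes at step 18000.
stage_at_14000 = active_stage(14000, 5000, 13000, 5000)
```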

💬 Contact:

If you have any problem (not only about DLoRAL, but also questions regarding burst/video super-resolution), please feel free to contact me at [email protected]

Citations

If our code helps your research or work, please consider citing our paper. The BibTeX reference is:

@misc{sun2025onestepdiffusiondetailrichtemporally,
      title={One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution}, 
      author={Yujing Sun and Lingchen Sun and Shuaizheng Liu and Rongyuan Wu and Zhengqiang Zhang and Lei Zhang},
      year={2025},
      eprint={2506.15591},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.15591}, 
}
