Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021) (PyTorch) - We released the training code!

Last update: Jan 08, 2023

Overview

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution

Kai Zhang, Jingyun Liang, Luc Van Gool, Radu Timofte
Computer Vision Lab, ETH Zurich, Switzerland

[Paper] [Code] [Training Code]

Our work is the beginning rather than the end of real image super-resolution.

News (2021-08-31): We uploaded the training code.
News (2021-08-24): We uploaded the BSRGAN degradation model.

from utils import utils_blindsr as blindsr
img_lq, img_hq = blindsr.degradation_bsrgan(img, sf=4, lq_patchsize=72)

News (2021-07-23): After being rejected by CVPR 2021, our paper was accepted by ICCV 2021. For the sake of fairness, we will not update the trained models in our camera-ready version. However, we may update the trained models on GitHub.
News (2021-05-18): Add trained BSRGAN model for scale factor 2.
News (2021-04): Our degradation model for face image enhancement: https://github.com/vvictoryuki/BSRGAN_implementation
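
The `degradation_bsrgan` call in the news above returns a low-quality patch together with its aligned high-quality patch. As a rough guide to the sizes involved, the sketch below assumes (this README does not state it) that the input is an H x W x C array and that the high-quality patch is `lq_patchsize * sf` pixels per side, so the input must be at least that large. `expected_patch_shapes` is a hypothetical helper, not part of `utils_blindsr`.

```python
def expected_patch_shapes(img_shape, sf=4, lq_patchsize=72):
    """Return (lq_shape, hq_shape) expected for an H x W x C input.

    Hypothetical helper, not part of utils_blindsr. Assumes the HQ patch
    is cropped at lq_patchsize * sf pixels per side.
    """
    h, w, c = img_shape
    hq_size = lq_patchsize * sf
    if h < hq_size or w < hq_size:
        raise ValueError(
            f"input {h}x{w} is smaller than the required {hq_size}x{hq_size}"
        )
    return (lq_patchsize, lq_patchsize, c), (hq_size, hq_size, c)


lq_shape, hq_shape = expected_patch_shapes((480, 640, 3), sf=4, lq_patchsize=72)
print(lq_shape, hq_shape)
```

With `sf=4` and `lq_patchsize=72`, any training image smaller than 288 x 288 would be rejected under this assumption, so it may be worth filtering the training set up front.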

Training

Download KAIR: git clone https://github.com/cszn/KAIR.git
Put your training high-quality images into trainsets/trainH or set "dataroot_H": "trainsets/trainH"

Train BSRNet

Modify train_bsrgan_x4_psnr.json e.g., "gpu_ids": [0], "dataloader_batch_size": 4
Training with DataParallel

python main_train_psnr.py --opt options/train_bsrgan_x4_psnr.json

Training with DistributedDataParallel - 4 GPUs

python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_psnr.py --opt options/train_bsrgan_x4_psnr.json  --dist True
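
The option edits mentioned above (`gpu_ids`, `dataloader_batch_size`, `dataroot_H`) can also be scripted rather than made by hand. The following is a minimal sketch that updates a key wherever it occurs in a nested options dictionary. The trimmed layout used here is a hypothetical stand-in, not the real file, and `set_option` is an illustrative helper. If the real options file contains `//` comments, it would need to be cleaned before `json.loads` can parse it.

```python
import json


def set_option(opt, key, value):
    """Set `key` to `value` wherever it already occurs in a nested dict.

    Returns the number of places updated, so a typo in `key` is easy to spot.
    """
    count = 0
    for k, v in opt.items():
        if k == key:
            opt[k] = value
            count += 1
        elif isinstance(v, dict):
            count += set_option(v, key, value)
    return count


# Hypothetical, heavily trimmed stand-in for an options file; the real
# train_bsrgan_x4_psnr.json has many more fields and may be laid out differently.
opt = json.loads("""
{
  "gpu_ids": [0, 1, 2, 3],
  "datasets": {
    "train": {"dataroot_H": "trainsets/trainH", "dataloader_batch_size": 48}
  }
}
""")

set_option(opt, "gpu_ids", [0])
set_option(opt, "dataloader_batch_size", 4)
print(json.dumps(opt, indent=2))
```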

Train BSRGAN
1. Put BSRNet model (e.g., '400000_G.pth') into superresolution/bsrgan_x4_gan/models
2. Modify train_bsrgan_x4_gan.json e.g., "gpu_ids": [0], "dataloader_batch_size": 4
3. Training with DataParallel
```
python main_train_gan.py --opt options/train_bsrgan_x4_gan.json
```
4. Training with DistributedDataParallel - 4 GPUs
```
python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_gan.py --opt options/train_bsrgan_x4_gan.json --dist True
```
5. Test the BSRGAN model 'xxxxxx_E.pth' with a modified main_test_bsrgan.py
   - 'xxxxxx_E.pth' is more stable than 'xxxxxx_G.pth'
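
This README does not say how the `_E` checkpoint is produced. One plausible reading, stated here as an assumption, is that it holds an exponential moving average (EMA) of the generator weights saved in `_G`, which would explain why it is more stable. The toy sketch below uses plain Python floats and a decay of 0.999 (also an assumption) to show how averaging damps step-to-step jitter; it is not the training code.

```python
def ema_update(ema_weights, weights, decay=0.999):
    """Blend the current weights into the running average, in place."""
    for name, w in weights.items():
        ema_weights[name] = decay * ema_weights[name] + (1.0 - decay) * w


# Toy "generator" with one scalar weight that jitters around 1.0 during training.
g = {"w": 1.0}
e = dict(g)  # the averaged copy starts from the same values
for step in range(1000):
    g["w"] = 1.0 + (0.5 if step % 2 == 0 else -0.5)  # noisy iterate
    ema_update(e, g, decay=0.999)

print(g["w"], e["w"])
```

The raw weight ends half a unit away from its centre, while the averaged copy stays very close to 1.0.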

✨ Some visual examples: oldphoto2; butterfly; comic; oldphoto3; oldphoto6; comic_01; comic_03; comic_04

Testing code
Main idea
Comparison
More visual results on RealSRSet dataset
Visual results on DPED dataset
Citation
Acknowledgments

Testing code

main_test_bsrgan.py
model_zoo (Download the following models from Google drive or Tencent Weiyun).
- Proposed:
  - BSRGAN.pth [Google drive] [Tencent Weiyun] 🌱
  - BSRNet.pth [Google drive] [Tencent Weiyun] 🌱
- Compared methods:
  - RRDB.pth ---> original link
  - ESRGAN.pth ---> original link
  - FSSR_DPED.pth ---> original link
  - FSSR_JPEG.pth ---> original link
  - RealSR_DPED.pth ---> original link
  - RealSR_JPEG.pth ---> original link

Main idea

Design a new degradation model to synthesize LR images for training:

1) Make the blur, downsampling and noise more practical
- Blur: two convolutions with isotropic and anisotropic Gaussian kernels from both the HR space and LR space
- Downsampling: nearest, bilinear, bicubic, down-up-sampling
- Noise: Gaussian noise, JPEG compression noise, processed camera sensor noise
2) Degradation shuffle: instead of using the commonly-used blur/downsampling/noise-addition pipeline, we perform randomly shuffled degradations to synthesize LR images
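
The two ideas above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the released degradation model: it uses a single blur instead of two, nearest-neighbour decimation only, and Gaussian noise only (no JPEG or camera sensor noise). All function names and parameter ranges are made up for the example.

```python
import numpy as np


def gaussian_kernel(size=7, sigma_x=1.5, sigma_y=1.5, theta=0.0):
    """2-D Gaussian kernel; isotropic when sigma_x == sigma_y."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    # Rotate the coordinates so the kernel's long axis follows `theta`.
    xr = np.cos(theta) * xx + np.sin(theta) * yy
    yr = -np.sin(theta) * xx + np.cos(theta) * yy
    k = np.exp(-0.5 * ((xr / sigma_x) ** 2 + (yr / sigma_y) ** 2))
    return k / k.sum()


def blur(img, rng):
    """Filter a grayscale image with a random (an)isotropic Gaussian kernel."""
    sx, sy = rng.uniform(0.5, 3.0, size=2)
    if rng.random() < 0.5:
        sy = sx  # isotropic case
    k = gaussian_kernel(7, sx, sy, rng.uniform(0, np.pi))
    pad = k.shape[0] // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out


def downsample(img, rng, sf=2):
    """Nearest-neighbour decimation (other interpolators omitted for brevity)."""
    return img[::sf, ::sf]


def add_noise(img, rng):
    """Additive Gaussian noise with a random level, clipped back to [0, 1]."""
    sigma = rng.uniform(1, 25) / 255.0
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)


def degrade(img, rng, sf=2):
    """Apply blur, downsampling and noise in a random order."""
    ops = [blur, lambda x, r: downsample(x, r, sf), add_noise]
    for idx in rng.permutation(len(ops)):
        img = ops[idx](img, rng)
    return img


rng = np.random.default_rng(0)
hr = rng.random((64, 64))  # stand-in for a grayscale HR image in [0, 1]
lr = degrade(hr, rng, sf=2)
print(hr.shape, "->", lr.shape)
```

Because the order is drawn per call, the same HR image can yield, for example, blur-then-noise-then-downsample on one draw and downsample-then-blur-then-noise on the next, which is the point of the shuffle.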

Some notes on the proposed degradation model:

- The degradation model is mainly designed to synthesize degraded LR images. Its most direct application is to train a deep blind super-resolver with paired LR/HR images. In particular, it can be applied to a large dataset of HR images to produce an unlimited number of perfectly aligned training pairs, which avoids both the limited-data issue of laboriously collected paired data and the misalignment issue of unpaired training data.
- The degradation model tends to be unsuited to modeling a given degraded LR image, since it involves too many degradation parameters and also adopts a random shuffle strategy.
- The degradation model can produce some degradation cases that rarely happen in real-world scenarios, but these can still be expected to improve the generalization ability of the trained deep blind super-resolver.
- A DNN with large capacity can handle different degradations with a single model. This has been validated multiple times. For example, DnCNN is able to handle SISR with different scale factors, JPEG compression deblocking with different quality factors and denoising for a wide range of noise levels, while still having a performance comparable to VDSR for SISR. It is worth noting that even if the super-resolver performs worse on unrealistic bicubic downsampling, it is still a preferred choice for real SISR.
- One can conveniently modify the degradation model by changing the degradation parameter settings and adding more reasonable degradation types to improve its practicability for a specific application.

Comparison

The no-reference IQA metrics NIQE, NRQM and PI do not always match perceived visual quality [1], and IQA metrics should be updated as new SISR methods appear [2]. We further argue that IQA metrics for SISR should also be updated as new image degradation types are introduced, which we leave for future work.

[1] "NTIRE 2020 challenge on real-world image super-resolution: Methods and results." CVPRW, 2020.
[2] "PIPAL: a large-scale image quality assessment dataset for perceptual image restoration." ECCV, 2020.
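
For orientation, PI (perceptual index) is not an independent measurement: as commonly defined, it is derived from the other two metrics, with lower values indicating better perceptual quality. The sketch below only shows that arithmetic; the input scores are made up, and computing NIQE or NRQM themselves is out of scope here.

```python
def perceptual_index(niqe, nrqm):
    """PI = 0.5 * ((10 - NRQM) + NIQE); lower is better."""
    return 0.5 * ((10.0 - nrqm) + niqe)


# Made-up scores, purely to show the arithmetic.
print(perceptual_index(niqe=4.0, nrqm=6.0))  # 4.0
```

Since PI is built from NIQE and NRQM, any mismatch between those two metrics and perceived quality carries over to PI as well.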

More visual results on RealSRSet dataset

Left: real images | Right: super-resolved images with scale factor 4

Visual results on DPED dataset

Although no prior information about the DPED dataset was used for training, our BSRGAN still performs well.

Citation

@inproceedings{zhang2021designing,
  title={Designing a Practical Degradation Model for Deep Blind Image Super-Resolution},
  author={Zhang, Kai and Liang, Jingyun and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Acknowledgments

This work was partly supported by the ETH Zurich Fund (OK), a Huawei Technologies Oy (Finland) project, and an Amazon AWS grant.
