Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

Related tags

Deep LearningRecycleD
Overview

RecycleD

Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM Multimedia 2021 Brave New Ideas (BNI) Track.

Brief Introduction

The core idea of RecycleD is to reuse the pre-trained discriminator in SR WGAN to directly assess the image perceptual quality.

overall_pipeline

In addition, we use the Salient Object Detection (SOD) networks and Image Residuals to produce weight matrices to improve the PatchGAN discriminator.

Requirements

  • Python 3.6
  • NumPy 1.17
  • PyTorch 1.2
  • torchvision 0.4
  • tensorboardX 1.4
  • scikit-image 0.16
  • Pillow 5.2
  • OpenCV-Python 3.4
  • SciPy 1.4

Datasets

For Training

We adopt the commonly used DIV2K as the training set to train SR WGAN.
For training, we use the HR images in "DIV2K/DIV2K_train_HR/", and LR images in "DIV2K/DIV2K_train_LR_bicubic/X4/". (The upscale factor is x4.)
For validation, we use the Set5 & Set14 datasets. You can download these benchmark datasets from LapSRN project page or My Baidu disk with password srbm.

For Test

We use PIPAL, Ma's dataset, BAPPS-Superres as super-resolved image quality datasets.
We use LIVE-itW and KonIQ-10k as artificially distorted image quality datasets.

Getting Started

See the directory shell.

Pre-trained Models

If you want to test the discriminators, you need to download the pre-trained models, and put them into the directory pretrained_models.
Meanwhile, you may need to modify the model location options in the shell scripts so that these model files can be loaded correctly.

License

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Citation

If you find this repository is useful for your research, please cite the following paper.

(1) BibTeX:

(2) ACM Reference Format:

Yunan Zhu, Haichuan Ma, Jialun Peng, Dong Liu, and Zhiwei Xiong. 2021.
Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN.
In Proceedings of the 29th ACM International Conference on Multimedia (MM ’21), October 20–24, 2021, Virtual Event, China.
ACM, NewYork, NY, USA, 10 pages. https://doi.org/10.1145/3474085.3479234

About Brave New Ideas (BNI) Track

Following paragraphs were directly excerpted from the Call for Brave New Ideas of ACM Multimedia 2021.

The Brave New Ideas (BNI) Track of ACM Multimedia 2021 is calling for innovative papers that open up new vistas for multimedia research and stimulate activity towards addressing new, long term challenges of interest to the multimedia research community. Submissions should be scientifically rigorous and also introduce fresh perspectives.

We understand "brave" to mean that a paper (or an area of research introduced by the paper) has great potential for high impact. For the proposed algorithm, technology or application to be understood as high impact, the authors should be able to argue that their proposal is important to solving problems, to supporting new perspectives, or to providing services that directly affect people's lives.

We understand "new" to mean that an idea has not yet been proposed before. The component techniques and technologies may exist, but their integration must be novel.

BNI FAQ
1.What type of papers are suitable for the BNI track?
The BNI track invites papers with brave and new ideas, where "brave" means “out-of-the-box thinking” ideas that may generate high impact and "new" means ideas not yet been proposed before. The highlight of BNI 2021 is "Multimedia for Social Good", where innovative research showcasing the benefit to the general public are encouraged.
2.What is the format requirement for BNI papers?
The paper format requirement is consistent with that of the regular paper.
4.How selective is the BNI track?
The BNI track is at least as competitive as the regular track. A BNI paper is regarded as respectful if not more compared to a regular paper. It is even more selective than the regular one with the acceptance rate at ~10% in previous years.
6.How are the BNI papers published?
The BNI papers are officially published in the conference proceeding.

Acknowledgements

This code borrows partially from the repo BasicSR.
We use the SOD networks from BASNet and U-2-Net.

Owner
Yunan Zhu
MEng student at EEIS, USTC. [email protected]
Boostcamp CV Serving For Python

Boostcamp-CV-Serving Prerequisites MySQL GCP Cloud Storage GCP key file Sentry Streamlit Cloud Secrets: .streamlit/secrets.toml #DO NOT SHARE THIS I

Jungwon Seo 19 Feb 22, 2022
Code for paper "Multi-level Disentanglement Graph Neural Network"

Multi-level Disentanglement Graph Neural Network (MD-GNN) This is a PyTorch implementation of the MD-GNN, and the code includes the following modules:

Lirong Wu 6 Dec 29, 2022
AQP is a modular pipeline built to enable the comparison and testing of different quality metric configurations.

Audio Quality Platform - AQP An Open Modular Python Platform for Objective Speech and Audio Quality Metrics AQP is a highly modular pipeline designed

Jack Geraghty 24 Oct 01, 2022
CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

CS5242_2021 Neural Networks and Deep Learning, NUS CS5242, 2021 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : https:/

Xavier Bresson 165 Oct 25, 2022
An image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testingAn image base contains 490 images for learning (400 cars and 90 boats), and another 21 images for testing

SVM Données Une base d’images contient 490 images pour l’apprentissage (400 voitures et 90 bateaux), et encore 21 images pour fait des tests. Prétrait

Achraf Rahouti 3 Nov 30, 2021
We will release the code of "ConTNet: Why not use convolution and transformer at the same time?" in this repo

ConTNet Introduction ConTNet (Convlution-Tranformer Network) is proposed mainly in response to the following two issues: (1) ConvNets lack a large rec

93 Nov 08, 2022
Posterior temperature optimized Bayesian models for inverse problems in medical imaging

Posterior temperature optimized Bayesian models for inverse problems in medical imaging Max-Heinrich Laves*, Malte Tölle*, Alexander Schlaefer, Sandy

Artificial Intelligence in Cardiovascular Medicine (AICM) 6 Sep 19, 2022
Development Kit for the SoccerNet Challenge

SoccerNetv2-DevKit Welcome to the SoccerNet-V2 Development Kit for the SoccerNet Benchmark and Challenge. This kit is meant as a help to get started w

Silvio Giancola 117 Dec 30, 2022
Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning

Manifold-SCA Research Artifact of USENIX Security 2022 Paper: Automated Side Channel Analysis of Media Software with Manifold Learning The repo is org

Yuanyuan Yuan 172 Dec 29, 2022
Python Classes: Medical Insurance Project using Object Oriented Programming Concepts

Medical-Insurance-Project-OOP Python Classes: Medical Insurance Project using Object Oriented Programming Concepts Classes are an incredibly useful pr

Hugo B. 0 Feb 04, 2022
Using VideoBERT to tackle video prediction

VideoBERT This repo reproduces the results of VideoBERT (https://arxiv.org/pdf/1904.01766.pdf). Inspiration was taken from https://github.com/MDSKUL/M

75 Dec 14, 2022
Bringing Computer Vision and Flutter together , to build an awesome app !!

Bringing Computer Vision and Flutter together , to build an awesome app !! Explore the Directories Flutter · Machine Learning Table of Contents About

Padmanabha Banerjee 14 Apr 07, 2022
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
Single-Stage 6D Object Pose Estimation, CVPR 2020

Overview This repository contains the code for the paper Single-Stage 6D Object Pose Estimation. Yinlin Hu, Pascal Fua, Wei Wang and Mathieu Salzmann.

CVLAB @ EPFL 89 Dec 26, 2022
Paper Title: Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution

HKDnet Paper Title: "Heterogeneous Knowledge Distillation for Simultaneous Infrared-Visible Image Fusion and Super-Resolution" Email:

wasteland 11 Nov 12, 2022
Jupyter notebooks for the code samples of the book "Deep Learning with Python"

Jupyter notebooks for the code samples of the book "Deep Learning with Python"

François Chollet 16.2k Dec 30, 2022
Combine Tacotron2 and Hifi GAN to generate speech from text

EndToEndTextToSpeech Combine Tacotron2 and Hifi GAN to generate speech from text Download weights Hifi GAN - hifi_gan/checkpoint/ : pretrain 2.5M ste

Phạm Quốc Huy 1 Dec 18, 2021
Adversarial Reweighting for Partial Domain Adaptation

Adversarial Reweighting for Partial Domain Adaptation Code for paper "Xiang Gu, Xi Yu, Yan Yang, Jian Sun, Zongben Xu, Adversarial Reweighting for Par

12 Dec 01, 2022
This is official implementaion of paper "Token Shift Transformer for Video Classification".

This is official implementaion of paper "Token Shift Transformer for Video Classification". We achieve SOTA performance 80.40% on Kinetics-400 val. Paper link

VideoNet 60 Dec 30, 2022
🥈78th place in Riiid Answer Correctness Prediction competition

Riiid Answer Correctness Prediction Introduction This repository is the code that placed 78th in Riiid Answer Correctness Prediction competition. Requ

Jungwoo Park 10 Jul 14, 2022