No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

Last update: Dec 30, 2022

Overview

This repository contains the implementation for the paper:

No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency (WACV 2022) Video

Create Environment

The code was trained and tested on Ubuntu 16.04 using Anaconda, Python 3.6.6, and PyTorch 1.8.0. To set up the environment, run:

    conda env create -f environment.yml

After creating the virtual environment, you should be able to run the following in the terminal and see 1.8.0:

    python -c "import torch; print(torch.__version__)"

Datasets

In this work we use 7 datasets for evaluation (LIVE, CSIQ, TID2013, KADID10K, CLIVE, KonIQ, LIVEFB).

Before training, make sure each of the aforementioned datasets follows the folder structure shown below:

LIVE

live
    |--fastfading
    |    |  ...     
    |--blur
    |    |  ... 
    |--jp2k
    |    |  ...     
    |--jpeg
    |    |  ...     
    |--wn
    |    |  ...     
    |--refimgs
    |    |  ...     
    |--dmos.mat
    |--dmos_realigned.mat
    |--refnames_all.mat
    |--readme.txt

CSIQ

csiq
    |--dst_imgs_all
    |    |--1600.AWGN.1.png
    |    |  ... (you need to put all the distorted images here)
    |--src_imgs
    |    |--1600.png
    |    |  ...
    |--csiq.DMOS.xlsx
    |--csiq_label.txt

TID2013

tid2013
    |--distorted_images
    |--reference_images
    |--mos.txt
    |--mos_std.txt
    |--mos_with_names.txt
    |--readme

KADID10K

kadid10k
    |--distorted_images
    |    |--I01_01_01.png
    |    |  ...    
    |--reference_images
    |    |--I01.png
    |    |  ...    
    |--dmos.csv
    |--mv.sh.save
    |--mvv.sh

CLIVE

clive
    |--Data
    |    |--I01_01_01.png
    |    |  ...    
    |--Images
    |    |--I01.png
    |    |  ...    
    |--ChallengeDB_release
    |    |--README.txt
    |--dmos.csv
    |--mv.sh.save
    |--mvv.sh

KonIQ

koniq
   |--1024x768
   |    |  992920521.jpg 
   |    |  ... (all the images should be here)     
   |--koniq10k_scores_and_distributions.csv

LIVEFB

fblive
   |--FLIVE
   |    |  AVA__149.jpg    
   |    |  ... (all the images should be here)     
   |--labels_image.csv
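
A quick layout check before training can catch a misplaced folder early. The following is a minimal sketch, not part of the repository: the `EXPECTED` table and the `missing_entries` helper are hypothetical names, and the entries are a subset copied from the trees above (whether the training code reads every one of them is not verified here).

```python
import os

# Hypothetical table: top-level entries copied from the directory trees above.
EXPECTED = {
    "live": ["fastfading", "blur", "jp2k", "jpeg", "wn", "refimgs",
             "dmos.mat", "dmos_realigned.mat", "refnames_all.mat"],
    "csiq": ["dst_imgs_all", "src_imgs", "csiq_label.txt"],
    "tid2013": ["distorted_images", "reference_images", "mos_with_names.txt"],
    "kadid10k": ["distorted_images", "reference_images", "dmos.csv"],
    "clive": ["Data", "Images", "ChallengeDB_release"],
    "koniq": ["1024x768", "koniq10k_scores_and_distributions.csv"],
    "fblive": ["FLIVE", "labels_image.csv"],
}


def missing_entries(dataset, root):
    """Return the expected entries that do not exist under `root`."""
    return [name for name in EXPECTED[dataset]
            if not os.path.exists(os.path.join(root, name))]
```

Calling `missing_entries("fblive", "/path/to/fblive")` returns an empty list when both `FLIVE` and `labels_image.csv` are present, and otherwise lists what is absent. The dataset root is passed in explicitly, so the check does not depend on what the top-level folder is called.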

Training

The training scripts are provided in run.sh. Please change the paths to match your setup. Note that to achieve the same performance, the parameters should match the ones in run.sh.

Pretrained models

The pretrained models are provided here.

Acknowledgement

Parts of this code are borrowed from HyperIQA and DETR.

FAQs

What is the difference between self-consistency and ensembling? Will self-consistency increase the inference time?

In ensembling methods, we need several models (with different initializations) and combine their results during training and testing. In our self-consistency model, we enforce a single network to produce consistent outputs during training when its input undergoes different transformations. At test time, our self-consistency model has the same inference time and number of parameters as the model without self-consistency. In other words, we do not add any new parameters to the network, and inference is unaffected.

What is the difference between self-consistency and augmentation?

In augmentation, we augment an input and send it to one network. Although the network becomes robust to different augmentations, it never has the chance to enforce the same output for different versions of an input at the same time. In our self-consistency approach, we force the network to produce a similar output for an image under a different transformation (in our case, horizontal flipping), which leads to more robust performance. Note that we still use augmentation during training, so our model benefits from both augmentation and self-consistency. Also, see Fig. 1 in the main paper, where we show that models trained with augmentation alone are sensitive to simple transformations.
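
To make the distinction concrete, here is a toy sketch in plain Python rather than the repository's PyTorch code. It compares one model's prediction for an image with its prediction for the horizontally flipped copy in the same step. `toy_model` is a hypothetical stand-in for the network, and the scalar absolute difference is an assumption for illustration; the loss used in the paper may be defined differently.

```python
def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [row[::-1] for row in image]


def toy_model(image):
    """Hypothetical stand-in for the network: a position-weighted mean,
    chosen so that its output changes when the image is flipped."""
    total, weight_sum = 0.0, 0.0
    for row in image:
        for x, pixel in enumerate(row):
            total += (x + 1) * pixel
            weight_sum += x + 1
    return total / weight_sum


def self_consistency_loss(model, image):
    """Absolute gap between the predictions for an image and its flip."""
    return abs(model(image) - model(hflip(image)))


image = [[0.0, 0.5, 1.0],
         [0.0, 0.5, 1.0]]
print(self_consistency_loss(toy_model, image))
```

The loss is nonzero for `toy_model` because it is sensitive to flipping, and it is zero for any model whose output is flip-invariant. Only one model is evaluated, which is why no extra parameters are needed at test time.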

Why is the relative ranking loss applied only to the samples with the highest and lowest quality scores, rather than to all samples?

1) We did not see a significant improvement from applying our ranking loss to all samples within each batch compared to using only the extreme cases. 2) Considering more samples leads to more gradient back-propagation and therefore more computation, which slows training.
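
As a rough illustration of ranking only the extremes, the sketch below applies a simple hinge penalty to the two samples in a batch with the highest and lowest ground-truth scores. The function name `extreme_ranking_loss` and the fixed `margin` are hypothetical; this is a simplified stand-in, and the paper's exact formulation may differ.

```python
def extreme_ranking_loss(preds, targets, margin=0.1):
    """Hinge ranking loss on the batch's best and worst samples only.

    `margin` is a hypothetical constant chosen for illustration.
    """
    indices = range(len(targets))
    i_max = max(indices, key=targets.__getitem__)  # highest ground-truth quality
    i_min = min(indices, key=targets.__getitem__)  # lowest ground-truth quality
    # Penalize when the best sample is not predicted above the worst by `margin`.
    return max(0.0, margin - (preds[i_max] - preds[i_min]))
```

Whatever the batch size, only two predictions contribute to this term, which is the source of the computational saving mentioned in the answer above.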

Citation

If you find this work useful for your research, please cite our paper:

@InProceedings{golestaneh2021no,
  title={No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency},
  author={Golestaneh, S Alireza and Dadsetan, Saba and Kitani, Kris M},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3209--3218},
  year={2022}
}

If you have any questions about our work, please do not hesitate to contact [email protected]

Owner

Alireza Golestaneh
