ICCV2021 - A New Journey from SDRTV to HDRTV.

Related tags

Deep LearningHDRTVNet
Overview

HDRTVNet [Paper Link]

A New Journey from SDRTV to HDRTV

By Xiangyu Chen*, Zhengwen Zhang*, Jimmy S. Ren, Lynhoo Tian, Yu Qiao and Chao Dong

(* indicates equal contribution)

This paper is accepted to ICCV 2021.

Overview

Simplified SDRTV/HDRTV formation pipeline:

Overview of the method:

Getting Started

  1. Dataset
  2. Configuration
  3. How to test
  4. How to train
  5. Metrics
  6. Visualization

Dataset

We conduct a dataset using videos with 4K resolutions under HDR10 standard (10-bit, Rec.2020, PQ) and their counterpart SDR versions from Youtube. The dataset consists of a training set with 1235 image pairs and a test set with 117 image pairs. Please refer to the paper for the details on the processing of the dataset. The dataset can be downloaded from Baidu Netdisk (access code: 6qvu) or OneDrive (access code: HDRTVNet).

We also provide the original Youtube links of these videos, which can be found in this file. Note that we cannot provide the download links since we do not have the copyright to distribute. Please download this dataset only for academic use.

Configuration

Please refer to the requirements. Matlab is also used to process the data, but it is not necessary and can be replaced by OpenCV.

How to test

We provide the pretrained models to test, which can be downloaded from Baidu Netdisk (access code: 2me9) or OneDrive (access code: HDRTVNet). Since our method is casaded of three steps, the results also need to be inferenced step by step.

  • Before testing, it is optional to generate the downsampled inputs of the condition network in advance. Make sure the input_folder and save_LR_folder in ./scripts/generate_mod_LR_bic.m are correct, then run the file using Matlab. After that, matlab-bicubic-downsampled versions of the input SDR images are generated that will be input to the condition network. Note that this step is not necessary, but can reproduce more precise performance.
  • For the first part of AGCM, make sure the paths of dataroot_LQ, dataroot_cond, dataroot_GT and pretrain_model_G in ./codes/options/test/test_AGCM.yml are correct, then run
cd codes
python test.py -opt options/test/test_AGCM.yml
  • Note that if the first step is not preformed, the line of dataroot_cond should be commented. The test results will be saved to ./results/Adaptive_Global_Color_Mapping.
  • For the second part of LE, make sure dataroot_LQ is modified into the path of results obtained by AGCM, then run
python test.py -opt options/test/test_LE.yml
  • Note that results generated by LE can achieve the best quantitative performance. The part of HG is for the completeness of the solution and improving the visual quality forthermore. For testing the last part of HG, make sure dataroot_LQ is modified into the path of results obtained by LE, then run
python test.py -opt options/test/test_HG.yml
  • Note that the results of the each step are 16-bit images that can be converted into HDR10 video.

How to train

  • Prepare the data. Generate the sub-images with specific patch size using ./scripts/extract_subimgs_single.py and generate the down-sampled inputs for the condition network (using the ./scripts/generate_mod_LR_bic.m or any other methods).
  • For AGCM, make sure that the paths and settings in ./options/train/train_AGCM.yml are correct, then run
cd codes
python train.py -opt options/train/train_AGCM.yml
  • For LE, the inputs are generated by the trained AGCM model. The original data should be inferenced through the first step (refer to the last part on how to test AGCM) and then be processed by extracting sub-images. After that, modify the corresponding settings in ./options/train/train_LE.yml and run
python train.py -opt options/train/train_LE.yml
  • For HG, the inputs are also obtained by the last part LE, thus the training data need to be processed by similar operations as the previous two parts. When the data is prepared, it is recommended to pretrain the generator at first by running
python train.py -opt options/train/train_HG_Generator.yml
  • After that, choose a pretrained model and modify the path of pretrained model in ./options/train/train_HG_GAN.yml, then run
python train.py -opt options/train/train_HG_GAN.yml
  • All models and training states are stored in ./experiments.

Metrics

Five metrics are used to evaluate the quantitative performance of different methods, including PSNR, SSIM, SR_SIM, Delta EITP (ITU Rec.2124) and HDR-VDP3. Since the latter three metrics are not very common in recent papers, we provide some reference codes in ./metrics for convenient usage.

Visualization

Since HDR10 is an HDR standard using PQ transfer function for the video, the correct way to visualize the results is to synthesize the image results into a video format and display it on the HDR monitor or TVs that support HDR. The HDR images in our dataset are generated by directly extracting frames from the original HDR10 videos, thus these images consisting of PQ values look relatively dark compared to their true appearances. We provide the reference commands of our extracting frames and synthesizing videos in ./scripts. Please use MediaInfo to check the format and the encoding information of synthesized videos before visualization. If circumstances permit, we strongly recommend to observe the HDR results and the original HDR resources by this way on the HDR dispalyer.

If the HDR displayer is not available, some media players with HDR render can play the HDR video and show a relatively realistic look, such as Potplayer. Note that this is only an approximate alternative, and it still cannot fully restore the appearance of HDR content on HDR monitors.

Citation

If our work is helpful to you, please cite our paper:

@inproceedings{chen2021new,
  title={A New Journey from SDRTV to HDRTV}, 
  author={Chen, Xiangyu and Zhang, Zhengwen and Ren, Jimmy S. and Tian, Lynhoo and Qiao, Yu and Dong, Chao},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}
Owner
XyChen
PhD. Student,Computer Vision
XyChen
A TensorFlow implementation of Neural Program Synthesis from Diverse Demonstration Videos

ViZDoom http://vizdoom.cs.put.edu.pl ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is pri

Hyeonwoo Noh 1 Aug 19, 2020
Code and data of the ACL 2021 paper: Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision

MetaAdaptRank This repository provides the implementation of meta-learning to reweight synthetic weak supervision data described in the paper Few-Shot

THUNLP 5 Jun 16, 2022
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)

Yihui He 1k Jan 03, 2023
An open-source project for applying deep learning to medical scenarios

Auto Vaidya An open source solution for creating end-end web app for employing the power of deep learning in various clinical scenarios like implant d

Smaranjit Ghose 18 May 29, 2022
The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer"

Shuffle Transformer The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer" Introduction Very recently, window-

87 Nov 29, 2022
3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks Introduction This repository contains the code and models for the follo

124 Jan 06, 2023
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

VisualGPT Our Paper VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Main Architecture of Our VisualGPT Downloa

Vision CAIR Research Group, KAUST 140 Dec 28, 2022
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research

Facebook Research 338 Dec 29, 2022
NeurIPS-2021: Neural Auto-Curricula in Two-Player Zero-Sum Games.

NAC Official PyTorch implementation of NAC from the paper: Neural Auto-Curricula in Two-Player Zero-Sum Games. We release code for: Gradient based ora

Xidong Feng 19 Nov 11, 2022
A repository for interferometer controller code.

dses-interferometer-controller A repository for interferometer controller code, hardware, and simulations. See dses.science for more information on th

Eli Reed 1 Jan 17, 2022
Hierarchical Attentive Recurrent Tracking

Hierarchical Attentive Recurrent Tracking This is an official Tensorflow implementation of single object tracking in videos by using hierarchical atte

Adam Kosiorek 147 Aug 07, 2021
Lorien: A Unified Infrastructure for Efficient Deep Learning Workloads Delivery

Lorien: A Unified Infrastructure for Efficient Deep Learning Workloads Delivery Lorien is an infrastructure to massively explore/benchmark the best sc

Amazon Web Services - Labs 45 Dec 12, 2022
Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training".

Mixup-Data-Dependency Code associated with the paper "Towards Understanding the Data Dependency of Mixup-style Training". Running Alternating Line Exp

Muthu Chidambaram 0 Nov 11, 2021
Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

TargetCLIP- official pytorch implementation of the paper Image-Based CLIP-Guided Essence Transfer This repository finds a global direction in StyleGAN

Hila Chefer 221 Dec 13, 2022
Compare outputs between layers written in Tensorflow and layers written in Pytorch

Compare outputs of Wasserstein GANs between TensorFlow vs Pytorch This is our testing module for the implementation of improved WGAN in Pytorch Prereq

Hung Nguyen 72 Dec 20, 2022
TensorFlow-based neural network library

Sonnet Documentation | Examples Sonnet is a library built on top of TensorFlow 2 designed to provide simple, composable abstractions for machine learn

DeepMind 9.5k Jan 07, 2023
An implementation of a sequence to sequence neural network using an encoder-decoder

Keras implementation of a sequence to sequence model for time series prediction using an encoder-decoder architecture. I created this post to share a

Luke Tonin 195 Dec 17, 2022
PoseViz – Multi-person, multi-camera 3D human pose visualization tool built using Mayavi.

PoseViz – 3D Human Pose Visualizer Multi-person, multi-camera 3D human pose visualization tool built using Mayavi. As used in MeTRAbs visualizations.

István Sárándi 79 Dec 30, 2022
For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training.

LongScientificFormer For encoding a text longer than 512 tokens, for example 800. Set max_pos to 800 during both preprocessing and training. Some code

Athar Sefid 6 Nov 02, 2022
Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Multi-modal Interaction Graph Convolutioal Network for Temporal Language Localization in Videos Official implementation for Multi-Modal Interaction Gr

Zongmeng Zhang 15 Oct 18, 2022