This is the official repository of XVFI (eXtreme Video Frame Interpolation)

Overview

XVFI PWC PWC

This is the official repository of XVFI (eXtreme Video Frame Interpolation), https://arxiv.org/abs/2103.16206

Last Update: 20210607

We provide the training and test code along with the trained weights and the dataset (train+test) used for XVFI. If you find this repository useful, please consider citing our paper.

Examples of the VFI (x8 Multi-Frame Interpolation) results on X-TEST

results_045_resized results_079_resized results_158_resized
The [email protected] input frames are interpolated to be [email protected] frames. All results are encoded at 30fps to be played as x8 slow motion and spatially down-scaled due to the limit of file sizes. All methods are trained on X-TRAIN.

Table of Contents

  1. X4K1000FPS
  2. Requirements
  3. Test
  4. Test_Custom
  5. Training
  6. Reference
  7. Contact

X4K1000FPS

Dataset of high-resolution (4096×2160), high-fps (1000fps) video frames with extreme motion.

003 004 045 078 081 146
Some examples of X4K1000FPS dataset, which are frames of 1000-fps and 4K-resolution. Our dataset contains the various scenes with extreme motions. (Displayed in spatiotemporally subsampled .gif files)

We provide our X4K1000FPS dataset which consists of X-TEST and X-TRAIN. Please refer to our main/suppl. paper for the details of the dataset. You can download the dataset from this dropbox link.

X-TEST consists of 15 video clips with 33-length of 4K-1000fps frames. It follows the below directory format:

├──── YOUR_DIR/
    ├──── test/
       ├──── Type1/
          ├──── TEST01/
             ├──── 0000.png
             ├──── ...
             └──── 0032.png
          ├──── TEST02/
             ├──── 0000.png
             ├──── ...
             └──── 0032.png
          ├──── ...
       ├──── ...

X-TRAIN consists of 4,408 clips from various types of 110 scenes. The clips are 65-length of 1000fps frames. Each frame is the size of 768x768 cropped from 4K frame. It follows the below directory format:

├──── YOUR_DIR/
    ├──── train/
       ├──── 002/
          ├──── occ008.320/
             ├──── 0000.png
             ├──── ...
             └──── 0064.png
          ├──── occ008.322/
             ├──── 0000.png
             ├──── ...
             └──── 0064.png
          ├──── ...
       ├──── ...

After downloading the files from the link, decompress the encoded_test.tar.gz and encoded_train.tar.gz. The resulting .mp4 files can be decoded into .png files via running mp4_decoding.py. Please follow the instruction written in mp4_decoding.py.

Requirements

Our code is implemented using PyTorch1.7, and was tested under the following setting:

  • Python 3.7
  • PyTorch 1.7.1
  • CUDA 10.2
  • cuDNN 7.6.5
  • NVIDIA TITAN RTX GPU
  • Ubuntu 16.04 LTS

Caution: since there is "align_corners" option in "nn.functional.interpolate" and "nn.functional.grid_sample" in PyTorch1.7, we recommend you to follow our settings. Especially, if you use the other PyTorch versions, it may lead to yield a different performance.

Test

Quick Start for X-TEST (x8 Multi-Frame Interpolation as in Table 2)

  1. Download the source codes in a directory of your choice .
  2. First download our X-TEST test dataset by following the above section 'X4K1000FPS'.
  3. Download the pre-trained weights, which was trained by X-TRAIN, from this link to place in /checkpoint_dir/XVFInet_X4K1000FPS_exp1.
XVFI
└── checkpoint_dir
   └── XVFInet_X4K1000FPS_exp1
       ├── XVFInet_X4K1000FPS_exp1_latest.pt           
  1. Run main.py with the following options in parse_args:
python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 8 

==> It would yield (PSNR/SSIM/tOF) = (30.12/0.870/2.15).

python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 3 --multiple 8 

==> It would yield (PSNR/SSIM/tOF) = (28.86/0.858/2.67).

Description

  • After running with the above test option, you can get the result images in /test_img_dir/XVFInet_X4K1000FPS_exp1, then obtain the PSNR/SSIM/tOF results per each test clip as "total_metrics.csv" in the same folder.
  • Our proposed XVFI-Net can start from any downscaled input upward by regulating '--S_tst', which is adjustable in terms of the number of scales for inference according to the input resolutions or the motion magnitudes.
  • You can get any Multi-Frame Interpolation (x M) result by regulating '--multiple'.

Quick Start for Vimeo90K (as in Fig. 8)

  1. Download the source codes in a directory of your choice .
  2. First download Vimeo90K dataset from this link (including 'tri_trainlist.txt') to place in /vimeo_triplet.
XVFI
└── vimeo_triplet
       ├──  sequences
       readme.txt
       tri_testlist.txt
       tri_trainlist.txt
  1. Download the pre-trained weights (XVFI-Net_v), which was trained by Vimeo90K, from this link to place in /checkpoint_dir/XVFInet_Vimeo_exp1.
XVFI
└── checkpoint_dir
   └── XVFInet_Vimeo_exp1
       ├── XVFInet_Vimeo_exp1_latest.pt           
  1. Run main.py with the following options in parse_args:
python main.py --gpu 0 --phase 'test' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_tst 1 --multiple 2

==> It would yield PSNR = 35.07 on Vimeo90K.

Description

  • After running with the above test option, you can get the result images in /test_img_dir/XVFInet_Vimeo_exp1.
  • There are certain code lines in front of the 'def main()' for a convenience when running with the Vimeo option.
  • The SSIM result of 0.9760 as in Fig. 8 was measured by matlab ssim function for a fair comparison after running the above guide because other SOTA methods did so. We also upload "compare_psnr_ssim.m" matlab file to obtain it.
  • It should be noted that there is a typo "S_trn and S_tst are set to 2" in the current version of XVFI paper, which should be modified to 1 (not 2), sorry for inconvenience.

Test_Custom

Quick Start for your own video data ('--custom_path') for any Multi-Frame Interpolation (x M)

  1. Download the source codes in a directory of your choice .
  2. First prepare your own video datasets in /custom_path by following a hierarchy as belows:
XVFI
└── custom_path
   ├── scene1
       ├── 'xxx.png'
       ├── ...
       └── 'xxx.png'
   ...
   
   ├── sceneN
       ├── 'xxxxx.png'
       ├── ...
       └── 'xxxxx.png'

  1. Download the pre-trained weights trained on X-TRAIN or Vimeo90K as decribed above.

  2. Run main.py with the following options in parse_args (ex) x8 Multi-Frame Interpolation):

# For the model trained on X-TRAIN
python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 8 --custom_path './custom_path'
# For the model trained on Vimeo90K
python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_tst 1 --multiple 8 --custom_path './custom_path'

Description

  • Our proposed XVFI-Net can start from any downscaled input upward by regulating '--S_tst', which is adjustable in terms of the number of scales for inference according to the input resolutions or the motion magnitudes.
  • You can get any Multi-Frame Interpolation (x M) result by regulating '--multiple'.
  • It only supports for '.png' format.
  • Since we can not cover diverse possibilites of naming rule for custom frames, please sort your own frames properly.

Training

Quick Start for X-TRAIN

  1. Download the source codes in a directory of your choice .
  2. First download our X-TRAIN train/val/test datasets by following the above section 'X4K1000FPS' and place them as belows:
XVFI
└── X4K1000FPS
      ├──  train
          ├── 002
          ├── ...
          └── 172
      ├──  val
          ├── Type1
          ├── Type2
          ├── Type3
      ├──  test
          ├── Type1
          ├── Type2
          ├── Type3

  1. Run main.py with the following options in parse_args:
python main.py --phase 'train' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_trn 3 --S_tst 5

Quick Start for Vimeo90K

  1. Download the source codes in a directory of your choice .
  2. First download Vimeo90K dataset from this link (including 'tri_trainlist.txt') to place in /vimeo_triplet.
XVFI
└── vimeo_triplet
       ├──  sequences
       readme.txt
       tri_testlist.txt
       tri_trainlist.txt
  1. Run main.py with the following options in parse_args:
python main.py --phase 'train' --exp_num 1 --dataset 'Vimeo' --module_scale_factor 2 --S_trn 1 --S_tst 1

Description

  • You can freely regulate other arguments in the parser of main.py, here

Reference

Hyeonjun Sim*, Jihyong Oh*, and Munchurl Kim "XVFI: eXtreme Video Frame Interpolation", https://arxiv.org/abs/2103.16206, 2021. (* equal contribution)

BibTeX

@article{sim2021xvfi,
  title={XVFI: eXtreme Video Frame Interpolation},
  author={Sim, Hyeonjun and Oh, Jihyong and Kim, Munchurl},
  journal={arXiv preprint arXiv:2103.16206},
  year={2021}
}

Contact

If you have any question, please send an email to either [email protected] or [email protected].

License

The source codes and datasets can be freely used for research and education only. Any commercial use should get formal permission first.

Owner
Jihyong Oh
KAIST Ph.D. Candidate 3rd yr. Please refer to my personal homepage as below (URL).
Jihyong Oh
N-RPG - Novel role playing game da turfu

N-RPG Ce README sera la page de garde du projet. Contenu Il contiendra la présen

4 Mar 15, 2022
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

LightHuBERT LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Github | Huggingface | SUPER

WangRui 46 Dec 29, 2022
Hysterese plugin with two temperature offset areas

craftbeerpi4 plugin OffsetHysterese Temperatur-Steuerungs-Plugin mit zwei tempereaturbereich abhängigen Offsets. Installation sudo pip3 install https:

HappyHibo 1 Dec 21, 2021
Structural Constraints on Information Content in Human Brain States

Structural Constraints on Information Content in Human Brain States Code accompanying the paper "The information content of brain states is explained

Leon Weninger 3 Sep 07, 2022
Official implementation of Densely connected normalizing flows

Densely connected normalizing flows This repository is the official implementation of NeurIPS 2021 paper Densely connected normalizing flows. Poster a

Matej Grcić 31 Dec 12, 2022
AlphaNet Improved Training of Supernet with Alpha-Divergence

AlphaNet: Improved Training of Supernet with Alpha-Divergence This repository contains our PyTorch training code, evaluation code and pretrained model

Facebook Research 87 Oct 10, 2022
RID-Noise: Towards Robust Inverse Design under Noisy Environments

This is code of RID-Noise. Reproduce RID-Noise Results Toy tasks Please refer to the notebook ridnoise.ipynb to view experiments on three toy tasks. B

Thyrix 2 Nov 23, 2022
[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver

Shape As Points (SAP) Paper | Project Page | Short Video (6 min) | Long Video (12 min) This repository contains the implementation of the paper: Shape

394 Dec 30, 2022
The code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning"

The Code for MM2021 paper "Multi-Level Counterfactual Contrast for Visual Commonsense Reasoning" Setting up and using the repo Get the dataset. Follow

4 Apr 20, 2022
Uses OpenCV and Python Code to detect a face on the screen

Simple-Face-Detection This code uses OpenCV and Python Code to detect a face on the screen. This serves as an example program. Important prerequisites

Denis Woolley (CreepyD) 1 Feb 12, 2022
Exploring Visual Engagement Signals for Representation Learning

Exploring Visual Engagement Signals for Representation Learning Menglin Jia, Zuxuan Wu, Austin Reiter, Claire Cardie, Serge Belongie and Ser-Nam Lim C

Menglin Jia 9 Jul 23, 2022
Reproducing Results from A Hybrid Approach to Targeting Social Assistance

title author date output Reproducing Results from A Hybrid Approach to Targeting Social Assistance Lendie Follett and Heath Henderson 12/28/2021 html_

Lendie Follett 0 Jan 06, 2022
OpenMMLab Detection Toolbox and Benchmark

MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.

OpenMMLab 22.5k Jan 05, 2023
Differentiable Simulation of Soft Multi-body Systems

Differentiable Simulation of Soft Multi-body Systems Yi-Ling Qiao, Junbang Liang, Vladlen Koltun, Ming C. Lin [Paper] [Code] Updates The C++ backend s

YilingQiao 26 Dec 23, 2022
Deep learning model, heat map, data prepo

deep learning model, heat map, data prepo

Pamela Dekas 1 Jan 14, 2022
This code uses generative adversarial networks to generate diverse task allocation plans for Multi-agent teams.

Mutli-agent task allocation This code uses generative adversarial networks to generate diverse task allocation plans for Multi-agent teams. To change

Biorobotics Lab 5 Oct 12, 2022
Source Code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chinese Question Matching

Description The source code and data for my paper titled Linguistic Knowledge in Data Augmentation for Natural Language Processing: An Example on Chin

Zhengxiang Wang 3 Jun 28, 2022
This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural tree born form a large search space

SeBoW: Self-Born Wiring for neural trees(PaddlePaddle version) This is the paddle code for SeBoW(Self-Born wiring for neural trees), a kind of neural

HollyLee 13 Dec 08, 2022
Your interactive network visualizing dashboard

Your interactive network visualizing dashboard Documentation: Here What is Jaal Jaal is a python based interactive network visualizing tool built usin

Mohit 177 Jan 04, 2023
arxiv-sanity, but very lite, simply providing the core value proposition of the ability to tag arxiv papers of interest and have the program recommend similar papers.

arxiv-sanity, but very lite, simply providing the core value proposition of the ability to tag arxiv papers of interest and have the program recommend similar papers.

Andrej 671 Dec 31, 2022