Unsupervised Video Interpolation using Cycle Consistency

Overview

Unsupervised Video Interpolation using Cycle Consistency

Project | Paper | YouTube

Unsupervised Video Interpolation using Cycle Consistency
Fitsum A. Reda, Deqing Sun*, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
NVIDIA Corporation
In International Conferene on Computer Vision (ICCV) 2019.
( * Currently affiliated with Google. )

Installation
# Get unsupervised video interpolation source codes
git clone https://github.com/NVIDIA/unsupervised-video-interpolation.git
cd unsupervised-video-interpolation
mkdir pretrained_models

# Build Docker Image
docker build -t unsupervised-video-interpolation -f Dockerfile .

If you prefer not to use docker, you can manually install the following requirements:

  • An NVIDIA GPU and CUDA 9.0 or higher. Some operations only have gpu implementation.
  • PyTorch (>= 1.0)
  • Python 3
  • numpy
  • scikit-image
  • imageio
  • pillow
  • tqdm
  • tensorboardX
  • natsort
  • ffmpeg
  • torchvision

To propose a model or change for inclusion, please submit a pull request.

Multiple GPU training and mixed precision training are supported, and the code provides examples for training and inference. For more help, type

python3 train.py --help

Network Architectures

Our repo now supports Super SloMo. Other video interpolation architectures can be integrated with our repo with minimal changes, for instance DVF or SepConv.

Pre-trained Models

We've included pre-trained models trained with cycle consistency (CC) alone, or with cycle consistency with Psuedo-supervised (CC + PS) losses.
Download checkpoints to a folder pretrained_models.

Supervised Baseline Weights

Unsupervised Finetuned Weights

Fully Unsupervised Weights for UCF101 evaluation

Data Loaders

We use VideoInterp and CycleVideoInterp (in datasets) dataloaders for all frame sequences, i.e. Adobe, YouTube, SlowFlow, Sintel, and UCF101.

We split Slowflow dataset into disjoint sets: A low FPS training (3.4K frames) and a high FPS test (414 frames) subset. We form the test set by selecting the first nine frames in each of the 46 clips, and train set by temporally sub-sampling the remaining frames from 240-fps to 30-fps. During evaluation, our models take as input the first and ninth frame in each test clip and interpolate seven intermediate frames. We follow a similar procedure for Sintel-1008fps, but interpolate 41 intermediate frames, i.e., conversion of frame rate from 24- to 1008-fps. Note, since SlowFlow and Sintel are of high resolution, we downsample all frames by a factor of 2 isotropically.
All training and evaluations presented in the paper are done on the spatially downsampled sequences.

For UCF101, we simply use the the test provided here.

Generating Interpolated Frames or Videos

  • --write_video and --write_images, if enabled will create an interpolated video and interpolated frame sequences, respectively.
#Example creation of interpolated videos, where we interleave low FPS input frames with one or more interpolated intermediate frames.
python3 eval.py --model CycleHJSuperSloMo --num_interp 7 --flow_scale 2 --val_file ${/path/to/input/sequences} \
    --name ${video_name} --save ${/path/to/output/folder} --post_fix ${output_image_tag} \
    --resume ${/path/to/pre-trained/model} --write_video
  • If input sequences for interpolation do not contain ground-truth intermediate frames, add --val_sample_rate 0 and --val_step_size 1 to the example script above.
  • For a simple test on two input frames, set --val_file to the folder containing both frames, and set --val_sample_rate 0, --val_step_size 1.

Images : Results and Comparisons

.
.
.

Inference for Unsupervised Models

  • UCF101: A total of 379 folders, each with three frames, with the middle frame being the ground-truth for a single frame interpolation.
# Evaluation of model trained with CC alone on Adobe-30fps dataset
# PSNR: 34.47, SSIM: 0.946, IE: 5.50
python3 eval.py --model CycleHJSuperSloMo --num_interp 1 --flow_scale 1 --val_file /path/to/ucf/root \
    --resume ./pretrained_models/fully_unsupervised_adobe30fps.pth
# Evaluation of model trained with CC alone on Battlefield-30fps dataset
# PSNR: 34.55, SSIM: 0.947, IE: 5.38
python3 eval.py --model CycleHJSuperSloMo --num_interp 1 --flow_scale 1 --val_file /path/to/ucf/root \
    --resume ./pretrained_models/fully_unsupervised_battlefield30fps.pth
  • SlowFlow: A total of 46 folders, each with nine frames, with the intermediate nine frames being ground-truths for a 30->240FPS multi-frame interpolation.
# Evaluation of model trained with CC alone on SlowFlow-30fps train split
# PSNR: 32.35, SSIM: 0.886, IE: 6.78
python3 eval.py --model CycleHJSuperSloMo --num_interp 7 --flow_scale 2 --val_file /path/to/SlowFlow/val \
    --resume ./pretrained_models/unsupervised_random2slowflow.pth
# Evaluation of model finetuned with CC+PS losses on SlowFlow-30fps train split.
# Model pre-trained with supervision on Adobe-240fps.
# PSNR: 33.05, SSIM: 0.890, IE: 6.62
python3 eval.py --model CycleHJSuperSloMo --num_interp 7 --flow_scale 2 --val_file /path/to/SlowFlow/val \
    --resume ./pretrained_models/unsupervised_adobe2slowflow.pth
# Evaluation of model finetuned with CC+PS losses on SlowFlow-30fps train split.
# Model pre-trained with supervision on Adobe+YouTube-240fps.
# PSNR: 33.20, SSIM: 0.891, IE: 6.56
python3 eval.py --model CycleHJSuperSloMo --num_interp 7 --flow_scale 2 --val_file /path/to/SlowFlow/val \
    --resume ./pretrained_models/unsupervised_adobe+youtube2slowflow.pth
  • Sintel: A total of 13 folders, each with 43 frames, with the intermediate 41 frames being ground-truths for a 30->1008FPS multi-frame interpolation.
We simply use the same commands used for SlowFlow, but setting `--num_interp 41`
and the corresponding `--resume *2sintel.pth` pre-trained models should lead to the number we presented in our papers.

Inference for Supervised Baseline Models

  • UCF101: A total of 379 folders, each with three frames, with the middle frame being the ground-truth for a single frame interpolation.
# Evaluation of model trained with Paird-GT on Adobe-240fps dataset
# PSNR: 34.63, SSIM: 0.946, IE: 5.48
python3 eval.py --model HJSuperSloMo --num_interp 1 --flow_scale 1 --val_file /path/to/ucf/root \
    --resume ./pretrained_models/baseline_superslomo_adobe.pth
  • SlowFlow: A total of 46 folders, each with nine frames, with the intermediate nine frames being ground-truths for a 30->240FPS multi-frame interpolation.
# Evaluation of model trained with paird-GT on Adobe-240fps dataset
# PSNR: 32.84, SSIM: 0.887, IE: 6.67
python3 eval.py --model HJSuperSloMo --num_interp 7 --flow_scale 2 --val_file /path/to/SlowFlow/val \
    --resume ./pretrained_models/baseline_superslomo_adobe.pth
# Evaluation of model trained with paird-GT on Adobe+YouTube-240fps dataset
# PSNR: 33.13, SSIM: 0.889, IE: 6.63
python3 eval.py --model HJSuperSloMo --num_interp 7 --flow_scale 2 --val_file /path/to/SlowFlow/val \
    --resume ./pretrained_models/baseline_superslomo_adobe+youtube.pth
  • Sintel: We use commands similar to SlowFlow, but setting --num_interp 41.

Training and Reproducing Our Results

# CC alone: Fully unsupervised training on SlowFlow and evaluation on SlowFlow
# SlowFlow/val target PSNR: 32.35, SSIM: 0.886, IE: 6.78
python3 -m torch.distributed.launch --nproc_per_node=16 train.py --model CycleHJSuperSloMo \
    --flow_scale 2.0 --batch_size 2 --crop_size 384 384 --print_freq 1 --dataset CycleVideoInterp \
    --step_size 1 --sample_rate 0 --num_interp 7 --val_num_interp 7 --skip_aug --save_freq 20 --start_epoch 0 \
    --train_file /path/to/SlowFlow/train --val_file SlowFlow/val --name unsupervised_slowflow --save /path/to/output 

# --nproc_per_node=16, we use a total of 16 V100 GPUs over two nodes.
# CC + PS: Unsupervised fine-tuning on SlowFlow with a baseline model pre-trained on Adobe+YouTube-240fps.
# SlowFlow/val target PSNR: 33.20, SSIM: 0.891, IE: 6.56
python3 -m torch.distributed.launch --nproc_per_node=16 train.py --model CycleHJSuperSloMo \
    --flow_scale 2.0 --batch_size 2 --crop_size 384 384 --print_freq 1 --dataset CycleVideoInterp \
    --step_size 1 --sample_rate 0 --num_interp 7 --val_num_interp 7 --skip_aug --save_freq 20 --start_epoch 0 \
    --train_file /path/to/SlowFlow/train --val_file /path/to/SlowFlow/val --name finetune_slowflow \
    --save /path/to/output --resume ./pretrained_models/baseline_superslomo_adobe+youtube.pth
# Supervised baseline training on Adobe240-fps and evaluation on SlowFlow
# SlowFlow/val target PSNR: 32.84, SSIM: 0.887, IE: 6.67
python3 -m torch.distributed.launch --nproc_per_node=16 train.py --model HJSuperSloMo \
    --flow_scale 2.0 --batch_size 2 --crop_size 352 352 --print_freq 1 --dataset VideoInterp \
    --num_interp 7 --val_num_interp 7 --skip_aug --save_freq 20 --start_epoch 0 --stride 32 \
    --train_file /path/to/Adobe-240fps/train --val_file /path/to/SlowFlow/val --name supervised_adobe \
    --save /path/to/output

Reference

If you find this implementation useful in your work, please acknowledge it appropriately and cite the paper or code accordingly:

@InProceedings{Reda_2019_ICCV,
author = {Fitsum A Reda and Deqing Sun and Aysegul Dundar and Mohammad Shoeybi and Guilin Liu and Kevin J Shih and Andrew Tao and Jan Kautz and Bryan Catanzaro},
title = {Unsupervised Video Interpolation Using Cycle Consistency},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019},
url={https://nv-adlr.github.io/publication/2019-UnsupervisedVideoInterpolation}
}

We encourage people to contribute to our code base and provide suggestions, point any issues, or solution using merge request, and we hope this repo is useful.

Acknowledgments

Parts of the code were inspired by NVIDIA/flownet2-pytorch, ClementPinard/FlowNetPytorch, and avinashpaliwal/Super-SloMo.

We would also like to thank Huaizu Jiang.

Coding style

  • 4 spaces for indentation rather than tabs
  • 80 character line length
  • PEP8 formatting
Owner
NVIDIA Corporation
NVIDIA Corporation
Plugin adapted from Ultralytics to bring YOLOv5 into Napari

napari-yolov5 Plugin adapted from Ultralytics to bring YOLOv5 into Napari. Training and detection can be done using the GUI. Training dataset must be

2 May 05, 2022
A distributed deep learning framework that supports flexible parallelization strategies.

FlexFlow FlexFlow is a deep learning framework that accelerates distributed DNN training by automatically searching for efficient parallelization stra

528 Dec 25, 2022
Self-supervised learning optimally robust representations for domain generalization.

OptDom: Learning Optimal Representations for Domain Generalization This repository contains the official implementation for Optimal Representations fo

Yangjun Ruan 18 Aug 25, 2022
一套完整的微博舆情分析流程代码,包括微博爬虫、LDA主题分析和情感分析。

已经将项目的关键文件上传,包含微博爬虫、LDA主题分析和情感分析三个部分。 1.微博爬虫 实现微博评论爬取和微博用户信息爬取,一天大概十万条。 2.LDA主题分析 实现文档主题抽取,包括数据清洗及分词、主题数的确定(主题一致性和困惑度)和最优主题模型的选择(暴力搜索)。 3.情感分析 实现评论文本的

182 Jan 02, 2023
PyTorch Implementation of ECCV 2020 Spotlight TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images

TuiGAN-PyTorch Official PyTorch Implementation of "TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images" (ECCV 2020 Spotligh

181 Dec 09, 2022
This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?”

This repository accompanies our paper “Do Prompt-Based Models Really Understand the Meaning of Their Prompts?” Usage To replicate our results in Secti

Albert Webson 64 Dec 11, 2022
Classical OCR DCNN reproduction based on PaddlePaddle framework.

Paddle-SVHN Classical OCR DCNN reproduction based on PaddlePaddle framework. This project reproduces Multi-digit Number Recognition from Street View I

1 Nov 12, 2021
Explainability of the Implications of Supervised and Unsupervised Face Image Quality Estimations Through Activation Map Variation Analyses in Face Recognition Models

Explainable_FIQA_WITH_AMVA Note This is the official repository of the paper: Explainability of the Implications of Supervised and Unsupervised Face I

3 May 08, 2022
The PyTorch implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision The PyTorch implementation of DiscoBox: Weakly Supe

Shiyi Lan 1 Oct 23, 2021
Code for 2021 NeurIPS --- Towards Multi-Grained Explainability for Graph Neural Networks

ReFine: Multi-Grained Explainability for GNNs This is the official code for Towards Multi-Grained Explainability for Graph Neural Networks (NeurIPS 20

Shirley (Ying-Xin) Wu 47 Dec 16, 2022
An open source library for face detection in images. The face detection speed can reach 1000FPS.

libfacedetection This is an open source library for CNN-based face detection in images. The CNN model has been converted to static variables in C sour

Shiqi Yu 11.4k Dec 27, 2022
Self-Supervised CNN-GCN Autoencoder

GCNDepth Self-Supervised CNN-GCN Autoencoder GCNDepth: Self-supervised monocular depth estimation based on graph convolutional network To be published

53 Dec 14, 2022
A Python implementation of active inference for Markov Decision Processes

A Python package for simulating Active Inference agents in Markov Decision Process environments. Please see our companion preprint on arxiv for an ove

235 Dec 21, 2022
With this package, you can generate mixed-integer linear programming (MIP) models of trained artificial neural networks (ANNs) using the rectified linear unit (ReLU) activation function

With this package, you can generate mixed-integer linear programming (MIP) models of trained artificial neural networks (ANNs) using the rectified linear unit (ReLU) activation function. At the momen

ChemEngAI 40 Dec 27, 2022
Implement of "Training deep neural networks via direct loss minimization" in PyTorch for 0-1 loss

This is the implementation of "Training deep neural networks via direct loss minimization" published at ICML 2016 in PyTorch. The implementation targe

Cuong Nguyen 1 Jan 18, 2022
Repository for the semantic WMI loss

Installation: pip install -e . Installing DL2: First clone DL2 in a separate directory and install it using the following commands: git clone https:/

Nick Hoernle 4 Sep 15, 2022
The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter

FAPIS The official implementation of the CVPR 2021 paper FAPIS: a Few-shot Anchor-free Part-based Instance Segmenter Introduction This repo is primari

Khoi Nguyen 8 Dec 11, 2022
A Dataset for Direct Quotation Extraction and Attribution in News Articles.

DirectQuote - A Dataset for Direct Quotation Extraction and Attribution in News Articles DirectQuote is a corpus containing 19,760 paragraphs and 10,3

THUNLP-MT 9 Sep 23, 2022
Fast sparse deep learning on CPUs

SPARSEDNN **If you want to use this repo, please send me an email: [email pro

Ziheng Wang 44 Nov 30, 2022