Depth-Aware Video Frame Interpolation (CVPR 2019)

Related tags

Deep LearningDAIN
Overview

DAIN (Depth-Aware Video Frame Interpolation)

Project | Paper

Wenbo Bao, Wei-Sheng Lai, Chao Ma, Xiaoyun Zhang, Zhiyong Gao, and Ming-Hsuan Yang

IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CVPR 2019

This work is developed based on our TPAMI work MEMC-Net, where we propose the adaptive warping layer. Please also consider referring to it.

Table of Contents

  1. Introduction
  2. Citation
  3. Requirements and Dependencies
  4. Installation
  5. Testing Pre-trained Models
  6. Downloading Results
  7. Slow-motion Generation
  8. Training New Models
  9. Google Colab Demo

Introduction

We propose the Depth-Aware video frame INterpolation (DAIN) model to explicitly detect the occlusion by exploring the depth cue. We develop a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects than farther ones. Our method achieves state-of-the-art performance on the Middlebury dataset. We provide videos here.

Citation

If you find the code and datasets useful in your research, please cite:

@inproceedings{DAIN,
    author    = {Bao, Wenbo and Lai, Wei-Sheng and Ma, Chao and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan}, 
    title     = {Depth-Aware Video Frame Interpolation}, 
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
    year      = {2019}
}
@article{MEMC-Net,
     title={MEMC-Net: Motion Estimation and Motion Compensation Driven Neural Network for Video Interpolation and Enhancement},
     author={Bao, Wenbo and Lai, Wei-Sheng, and Zhang, Xiaoyun and Gao, Zhiyong and Yang, Ming-Hsuan},
     journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
     doi={10.1109/TPAMI.2019.2941941},
     year={2018}
}

Requirements and Dependencies

  • Ubuntu (We test with Ubuntu = 16.04.5 LTS)
  • Python (We test with Python = 3.6.8 in Anaconda3 = 4.1.1)
  • Cuda & Cudnn (We test with Cuda = 9.0 and Cudnn = 7.0)
  • PyTorch (The customized depth-aware flow projection and other layers require ATen API in PyTorch = 1.0.0)
  • GCC (Compiling PyTorch 1.0.0 extension files (.c/.cu) requires gcc = 4.9.1 and nvcc = 9.0 compilers)
  • NVIDIA GPU (We use Titan X (Pascal) with compute = 6.1, but we support compute_50/52/60/61 devices, should you have devices with higher compute capability, please revise this)

Installation

Download repository:

$ git clone https://github.com/baowenbo/DAIN.git

Before building Pytorch extensions, be sure you have pytorch >= 1.0.0:

$ python -c "import torch; print(torch.__version__)"

Generate our PyTorch extensions:

$ cd DAIN
$ cd my_package 
$ ./build.sh

Generate the Correlation package required by PWCNet:

$ cd ../PWCNet/correlation_package_pytorch1_0
$ ./build.sh

Testing Pre-trained Models

Make model weights dir and Middlebury dataset dir:

$ cd DAIN
$ mkdir model_weights
$ mkdir MiddleBurySet

Download pretrained models,

$ cd model_weights
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best.pth

and Middlebury dataset:

$ cd ../MiddleBurySet
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-color-allframes.zip
$ unzip other-color-allframes.zip
$ wget http://vision.middlebury.edu/flow/data/comp/zip/other-gt-interp.zip
$ unzip other-gt-interp.zip
$ cd ..

preinstallations:

$ cd PWCNet/correlation_package_pytorch1_0
$ sh build.sh
$ cd ../my_package
$ sh build.sh
$ cd ..

We are good to go by:

$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py

The interpolated results are under MiddleBurySet/other-result-author/[random number]/, where the random number is used to distinguish different runnings.

Downloading Results

Our DAIN model achieves the state-of-the-art performance on the UCF101, Vimeo90K, and Middlebury (eval and other). Download our interpolated results with:

$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/UCF101_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Vimeo90K_interp_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_eval_DAIN.zip
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/Middlebury_other_DAIN.zip

Slow-motion Generation

Our model is fully capable of generating slow-motion effect with minor modification on the network architecture. Run the following code by specifying time_step = 0.25 to generate x4 slow-motion effect:

$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.25

or set time_step to 0.125 or 0.1 as follows

$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.125
$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.1

to generate x8 and x10 slow-motion respectively. Or if you would like to have x100 slow-motion for a little fun.

$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury_slowmotion.py --netName DAIN_slowmotion --time_step 0.01

You may also want to create gif animations by:

$ cd MiddleBurySet/other-result-author/[random number]/Beanbags
$ convert -delay 1 *.png -loop 0 Beanbags.gif //1*10ms delay 

Have fun and enjoy yourself!

Training New Models

Download the Vimeo90K triplet dataset for video frame interpolation task, also see here by Xue et al., IJCV19.

$ cd DAIN
$ mkdir /path/to/your/dataset & cd /path/to/your/dataset 
$ wget http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip
$ unzip vimeo_triplet.zip
$ rm vimeo_triplet.zip

Download the pretrained MegaDepth and PWCNet models

$ cd MegaDepth/checkpoints/test_local
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/best_generalization_net_G.pth
$ cd ../../../PWCNet
$ wget http://vllab1.ucmerced.edu/~wenbobao/DAIN/pwc_net.pth.tar
$ cd  ..

Run the training script:

$ CUDA_VISIBLE_DEVICES=0 python train.py --datasetPath /path/to/your/dataset --batch_size 1 --save_which 1 --lr 0.0005 --rectify_lr 0.0005 --flow_lr_coe 0.01 --occ_lr_coe 0.0 --filter_lr_coe 1.0 --ctx_lr_coe 1.0 --alpha 0.0 1.0 --patience 4 --factor 0.2

The optimized models will be saved to the model_weights/[random number] directory, where [random number] is generated for different runs.

Replace the pre-trained model_weights/best.pth model with the newly trained model_weights/[random number]/best.pth model. Then test the new model by executing:

$ CUDA_VISIBLE_DEVICES=0 python demo_MiddleBury.py

Google Colab Demo

This is a modification of DAIN that allows the usage of Google Colab and is able to do a full demo interpolation from a source video to a target video.

Original Notebook File by btahir can be found here.

To use the Colab, follow these steps:

  • Download the Colab_DAIN.ipynb file (link).
  • Visit Google Colaboratory (link)
  • Select the "Upload" option, and upload the .ipynb file
  • Start running the cells one by one, following the instructions.

Colab file authors: Styler00Dollar and Alpha.

Contact

Wenbo Bao; Wei-Sheng (Jason) Lai

License

See MIT License

This repo is the code release of EMNLP 2021 conference paper "Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories".

Connect-the-Dots: Bridging Semantics between Words and Definitions via Aligning Word Sense Inventories This repo is the code release of EMNLP 2021 con

12 Nov 22, 2022
Tutel MoE: An Optimized Mixture-of-Experts Implementation

Project Tutel Tutel MoE: An Optimized Mixture-of-Experts Implementation. Supported Framework: Pytorch Supported GPUs: CUDA(fp32 + fp16), ROCm(fp32) Ho

Microsoft 344 Dec 29, 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation

ZEROGEN This repository contains the code for our paper “ZeroGen: Efficient Zero

Jiacheng Ye 31 Dec 30, 2022
Using knowledge-informed machine learning on the PRONOSTIA (FEMTO) and IMS bearing data sets. Predict remaining-useful-life (RUL).

Knowledge Informed Machine Learning using a Weibull-based Loss Function Exploring the concept of knowledge-informed machine learning with the use of a

Tim 43 Dec 14, 2022
Space Ship Simulator using python

FlyOver Basic space-ship simulator using python How to run? Just double click run.py What modules do i need? All modules that i currently using is bui

0 Oct 09, 2022
Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)

Quasi-Dense Tracking This is the offical implementation of paper Quasi-Dense Similarity Learning for Multiple Object Tracking. We present a trailer th

ETH VIS Research Group 327 Dec 27, 2022
A PyTorch Toolbox for Face Recognition

FaceX-Zoo FaceX-Zoo is a PyTorch toolbox for face recognition. It provides a training module with various supervisory heads and backbones towards stat

JDAI-CV 1.6k Jan 06, 2023
Diagnostic tests for linguistic capacities in language models

LM diagnostics This repository contains the diagnostic datasets and experimental code for What BERT is not: Lessons from a new suite of psycholinguist

61 Jan 02, 2023
A note taker for NVDA. Allows the user to create, edit, view, manage and export notes to different formats.

Quick Notetaker add-on for NVDA The Quick Notetaker add-on is a wonderful tool which allows writing notes quickly and easily anytime and from any app

5 Dec 06, 2022
Angular & Electron desktop UI framework. Angular components for native looking and behaving macOS desktop UI (Electron/Web)

Angular Desktop UI This is a collection for native desktop like user interface components in Angular, especially useful for Electron apps. It starts w

Marc J. Schmidt 49 Dec 22, 2022
A Fast and Accurate One-Stage Approach to Visual Grounding, ICCV 2019 (Oral)

One-Stage Visual Grounding ***** New: Our recent work on One-stage VG is available at ReSC.***** A Fast and Accurate One-Stage Approach to Visual Grou

Zhengyuan Yang 118 Dec 05, 2022
Make a Turtlebot3 follow a figure 8 trajectory and create a robot arm and make it follow a trajectory

HW2 - ME 495 Overview Part 1: Makes the robot move in a figure 8 shape. The robot starts moving when launched on a real turtlebot3 and can be paused a

Devesh Bhura 0 Oct 21, 2022
A ssl analyzer which could analyzer target domain's certificate.

ssl_analyzer A ssl analyzer which could analyzer target domain's certificate. Analyze the domain name ssl certificate information according to the inp

vincent 17 Dec 12, 2022
A rule learning algorithm for the deduction of syndrome definitions from time series data.

README This project provides a rule learning algorithm for the deduction of syndrome definitions from time series data. Large parts of the algorithm a

0 Sep 24, 2021
DCGAN-tensorflow - A tensorflow implementation of Deep Convolutional Generative Adversarial Networks

DCGAN in Tensorflow Tensorflow implementation of Deep Convolutional Generative Adversarial Networks which is a stabilize Generative Adversarial Networ

Taehoon Kim 7.1k Dec 29, 2022
A Deep Reinforcement Learning Framework for Stock Market Trading

DQN-Trading This is a framework based on deep reinforcement learning for stock market trading. This project is the implementation code for the two pap

61 Jan 01, 2023
IGCN : Image-to-graph convolutional network

IGCN : Image-to-graph convolutional network IGCN is a learning framework for 2D/3D deformable model registration and alignment, and shape reconstructi

Megumi Nakao 7 Oct 27, 2022
PyContinual (An Easy and Extendible Framework for Continual Learning)

PyContinual (An Easy and Extendible Framework for Continual Learning) Easy to Use You can sumply change the baseline, backbone and task, and then read

Zixuan Ke 176 Jan 05, 2023
Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021)

Canonical Capsules: Unsupervised Capsules in Canonical Pose (NeurIPS 2021) Introduction This is the official repository for the PyTorch implementation

165 Dec 07, 2022
Vehicle detection using machine learning and computer vision techniques for Udacity's Self-Driving Car Engineer Nanodegree.

Vehicle Detection Video demo Overview Vehicle detection using these machine learning and computer vision techniques. Linear SVM HOG(Histogram of Orien

hata 1.1k Dec 18, 2022