Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Last update: Dec 02, 2022

Overview

Temporal copying and local hallucination for video inpainting

This repository contains the implementation of my master's thesis "Temporal copying and local hallucination for video inpainting". The code has been built using PyTorch Lightning, read its documentation to get a complete overview of how this repository is structured.

Disclaimer: The version published here might contain small differences with the thesis because of the refactoring.

About the data

The thesis uses three different datasets: GOT-10k for the background sequences, YouTube-VOS for realistic mask shapes and DAVIS to test the models with real masked sequences. Some pre-processing steps, which are not published in this repository, have been applied to the data. You can download the exact datasets used in the paper from this link.

The first step is to clone this repository, install its dependencies and other required system packages:

git clone https://github.com/davidalvarezdlt/master_thesis.git
cd master_thesis
pip install -r requirements.txt

apt-get update
apt-get install libturbojpeg ffmpeg libsm6 libxext6

Unzip the file downloaded from the previous link inside ./data. The resulting folder structure should look like this:

master_thesis/
    data/
        DAVIS-2017/
        GOT10k/
        YouTubeVOS/
    lightning_logs/
    master_thesis/
    .gitignore
    .pre-commit-config.yaml
    LICENSE
    README.md
    requirements.txt

Training the Dense Flow Prediction Network (DFPN) model

In short, you can train the model by calling:

python -m master_thesis

You can modify the default parameters of the code by using CLI parameters. Get a complete list of the available parameters by calling:

python -m master_thesis --help

For instance, if we want to train the model using 2 frames, with a batch size of 8 and using one GPUs, we would call:

python -m master_thesis --frames_n 2 --batch_size 8 --gpus 1

Every time you train the model, a new folder inside ./lightning_logs will be created. Each folder represents a different version of the model, containing its checkpoints and auxiliary files.

Training the Copy-and-Hallucinate Network (CHN) model

In this case, you will need to specify that you want to train the CHN model. To do so:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint>

Where --chn_aligner is the model used to align the frames (either cpn or dfpn) and --chn_aligner_checkpoint is the path to its checkpoint.

You can download the checkpoint of the CPN from its original repository (file named weight.pth).

Testing the Dense Flow Prediction Network (DFPN) model

You can align samples from the test split and store them in TensorBoard by calling:

python -m samplernn_pase --test --test_checkpoint <test_checkpoint>

Where --test_checkpoint is a valid path to the model checkpoint that should be used.

Testing the Copy-and-Hallucinate Network (CHN) model

You can inpaint test sequences (they will be stored in a folder) using the three algorithms by calling:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint> --test --test_checkpoint <test_checkpoint>

Notice that now the value of --test_checkpoint must be a valid path to a CHN checkpoint, while --chn_aligner_checkpoint might be the path to a checkpoint of either CPN or DFPN.

Citation

If you find this thesis useful, please use the following citation:

@thesis{Alvarez2020,
    type = {Master's Thesis},
    author = {David Álvarez de la Torre},
    title = {Temporal copying and local hallucination for video onpainting},
    school = {ETH Zürich},
    year = 2020,
}

Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Related tags

Overview

Temporal copying and local hallucination for video inpainting

About the data

Training the Dense Flow Prediction Network (DFPN) model

Training the Copy-and-Hallucinate Network (CHN) model

Testing the Dense Flow Prediction Network (DFPN) model

Testing the Copy-and-Hallucinate Network (CHN) model

Citation

Owner

David Álvarez de la Torre

A pre-trained language model for social media text in Spanish

CVPR2022 paper "Dense Learning based Semi-Supervised Object Detection"

Text mining project; Using distilBERT to predict authors in the classification task authorship attribution.

Code for "ShineOn: Illuminating Design Choices for Practical Video-based Virtual Clothing Try-on", accepted at WACV 2021 Generation of Human Behavior Workshop.

RATE: Overcoming Noise and Sparsity of Textual Features in Real-Time Location Estimation (CIKM'17)

ICLR21 Tent: Fully Test-Time Adaptation by Entropy Minimization

Pytorch implementation of Cut-Thumbnail in the paper Cut-Thumbnail:A Novel Data Augmentation for Convolutional Neural Network.

This repository contains code to run experiments in the paper "Signal Strength and Noise Drive Feature Preference in CNN Image Classifiers."

Controlling the MicriSpotAI robot from scratch

Devkit for 3D -- Some utils for 3D object detection based on Numpy and Pytorch

Neuron Merging: Compensating for Pruned Neurons (NeurIPS 2020)

This is an example of object detection on Micro bacterium tuberculosis using Mask-RCNN

Google-drive-to-sqlite - Create a SQLite database containing metadata from Google Drive

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)

基于PaddleClas实现垃圾分类，并转换为inference格式用PaddleHub服务端部署

Official implementation of deep Gaussian process (DGP)-based multi-speaker speech synthesis with PyTorch.

DA2Lite is an automated model compression toolkit for PyTorch.

Code for "NeRS: Neural Reflectance Surfaces for Sparse-View 3D Reconstruction in the Wild," in NeurIPS 2021

This is the code of "Multi-view Contrastive Graph Clustering" in NeurlPS 2021.

Deep Compression for Dense Point Cloud Maps.