PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting

Last update: Dec 13, 2022

Overview

Decoupled Spatial-Temporal Transformer for Video Inpainting

By Rui Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li.

This repo is the official PyTorch implementation of Decoupled Spatial-Temporal Transformer for Video Inpainting.

Introduction

Usage

Prerequisites

Python >= 3.6
PyTorch >= 1.0 and the corresponding torchvision (https://pytorch.org/)
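
A quick way to confirm that an environment meets these minimums is sketched below. The helper `meets_minimum` is a hypothetical name and not part of this repo; it compares only the leading numeric components of a version string, so local suffixes such as `+cu117` are ignored. The PyTorch check is wrapped in a try/except so the script still runs before PyTorch is installed.

```python
import re
import sys


def meets_minimum(version, minimum):
    """Return True if the leading numeric part of `version` is >= `minimum`.

    `version` is a string such as "1.13.0+cu117".
    `minimum` is a tuple of ints such as (1, 0).
    """
    # Drop any local suffix ("+cu117") and keep the numeric components.
    numbers = tuple(int(n) for n in re.findall(r"\d+", version.split("+")[0]))
    # Pad with zeros so that "1" compares equal to (1, 0).
    padded = numbers + (0,) * len(minimum)
    return padded[: len(minimum)] >= tuple(minimum)


if __name__ == "__main__":
    python_version = "{}.{}".format(*sys.version_info[:2])
    status = "ok" if meets_minimum(python_version, (3, 6)) else "too old"
    print("Python", python_version, status)
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "ok" if meets_minimum(torch.__version__, (1, 0)) else "too old"
        print("PyTorch", torch.__version__, status)
```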

Install

Clone this repo:

git clone https://github.com/ruiliu-ai/DSTT.git

Install other packages:

cd DSTT
pip install -r requirements.txt

Training

Dataset preparation

Create the data folder, then download the datasets (YouTube-VOS and DAVIS) into it.

mkdir data

Training script

python train.py -c configs/youtube-vos.json
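
Since the training run is driven by a JSON config, it can help to look at what the file contains before launching. The sketch below only lists the top-level keys and assumes nothing about their names; `top_level_keys` is a hypothetical helper, not something this repo provides.

```python
import json


def top_level_keys(config_path):
    """Load a JSON config file and return its top-level keys, sorted."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(config)


# Example call (run from the repo root):
#   print(top_level_keys("configs/youtube-vos.json"))
```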

Test

Create the checkpoints folder, then download the pre-trained model into it.

mkdir checkpoints

Test script

python test.py -c checkpoints/dstt.pth -v data/DAVIS/JPEGImages/blackswan -m data/DAVIS/Annotations/blackswan
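
The test command takes one folder of video frames (`-v`) and one folder of masks (`-m`). A small pre-flight check that the two folders line up can catch path mistakes early. The sketch below assumes that each frame and its mask share a filename stem (for example `00000.jpg` and `00000.png`, as in DAVIS); whether `test.py` itself requires this is not stated here. `pair_frames_and_masks` is a hypothetical helper, not part of this repo.

```python
from pathlib import Path


def pair_frames_and_masks(frame_dir, mask_dir):
    """Pair frame and mask files that share a filename stem.

    Returns a list of (frame_path, mask_path) tuples sorted by stem.
    Raises ValueError if any frame lacks a mask or any mask lacks a frame.
    """
    frames = {p.stem: p for p in Path(frame_dir).iterdir() if p.is_file()}
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    # Stems present in exactly one of the two folders.
    unmatched = sorted(set(frames) ^ set(masks))
    if unmatched:
        raise ValueError("frames and masks do not line up: {}".format(unmatched))
    return [(frames[stem], masks[stem]) for stem in sorted(frames)]


# Example call, using the paths from the test command:
#   pairs = pair_frames_and_masks("data/DAVIS/JPEGImages/blackswan",
#                                 "data/DAVIS/Annotations/blackswan")
#   print(len(pairs), "frame/mask pairs found")
```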

Citing DSTT

If you find DSTT useful in your research, please consider citing:

@article{Liu_2021_DSTT,
  title={Decoupled Spatial-Temporal Transformer for Video Inpainting},
  author={Liu, Rui and Deng, Hanming and Huang, Yangyi and Shi, Xiaoyu and Lu, Lewei and Sun, Wenxiu and Wang, Xiaogang and Dai, Jifeng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2104.06637},
  year={2021}
}

Acknowledgement

This code relies heavily on the video inpainting framework from the Spatial-Temporal Transformer Network (STTN).

