This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

Last update: Dec 28, 2022

Omnimatte in PyTorch

Prerequisites

- Linux
- Python 3.6+
- NVIDIA GPU with CUDA and cuDNN

Installation

This code has been tested with PyTorch 1.8 and Python 3.8.

Install PyTorch 1.8 and other dependencies.
- For pip users, please type the command pip install -r requirements.txt.
- For Conda users, you can create a new Conda environment using conda env create -f environment.yml.
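
After installing, it can help to confirm that the interpreter and PyTorch match the prerequisites above. The following is a minimal sketch, not part of this repository: `check_environment` is a hypothetical helper name, and it only reports what it finds rather than enforcing the tested versions (PyTorch 1.8, Python 3.8).

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6)):
    """Return a dict summarising whether the basic prerequisites look satisfied."""
    report = {
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": None,
    }
    if report["torch_installed"]:
        try:
            # Imported lazily so the check also runs when PyTorch is absent.
            import torch

            report["torch_version"] = torch.__version__
            report["cuda_available"] = torch.cuda.is_available()
        except ImportError:
            # A broken install: the package is found but cannot be imported.
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False`, the `--gpu_ids` options used in the demo below are unlikely to work as written.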

Demo

To train a model on a video (e.g. "tennis"), run:

python train.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0,1

To view training results and loss plots, open http://localhost:8097 in a browser. Intermediate results are also saved at ./checkpoints/tennis/web/index.html.

To save the omnimatte layer outputs of the trained model, run:

python test.py --name tennis --dataroot ./datasets/tennis --gpu_ids 0

The results (RGBA layers, videos) will be saved to ./results/tennis/test_latest/.
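
One way to sanity-check the saved RGBA layers is to recombine them and compare against the input frame. The sketch below shows the standard back-to-front "over" operator on a single pixel. It assumes straight (non-premultiplied) alpha in [0, 1] and layers ordered back to front; whether the saved files follow exactly that convention is an assumption to verify, and `composite_over` is a hypothetical helper, not a function from this repository.

```python
def composite_over(layers, background=(0.0, 0.0, 0.0)):
    """Composite RGBA pixels back to front with the standard "over" operator.

    layers: iterable of (r, g, b, a) tuples with values in [0, 1], ordered
            back to front, using straight (non-premultiplied) alpha.
    background: the RGB colour underneath all layers.
    """
    r, g, b = background
    for lr, lg, lb, la in layers:
        # Each layer covers what is behind it in proportion to its alpha.
        r = la * lr + (1.0 - la) * r
        g = la * lg + (1.0 - la) * g
        b = la * lb + (1.0 - la) * b
    return (r, g, b)


if __name__ == "__main__":
    opaque_red = (1.0, 0.0, 0.0, 1.0)
    half_blue = (0.0, 0.0, 1.0, 0.5)
    print(composite_over([opaque_red, half_blue]))  # (0.5, 0.0, 0.5)
```

Applied to every pixel of every frame, the same rule turns a stack of layers back into a full image.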

Custom video

To train on your own video, you will have to preprocess the data:

- Extract the frames, e.g.

mkdir ./datasets/my_video && cd ./datasets/my_video
mkdir rgb && ffmpeg -i video.mp4 rgb/%04d.png

- Resize the video to 256x448 and save the frames in my_video/rgb.
- Get input object masks (e.g. using Mask-RCNN and STM), and save each object's masks in its own subdirectory, e.g. my_video/mask/01/, my_video/mask/02/, etc.
- Compute flow (e.g. using RAFT), and save the forward .flo files to my_video/flow and the backward flow to my_video/flow_backward.
- Compute the confidence maps from the forward/backward flows (the command below uses the tennis demo; point --dataroot at your own video directory):
```
python datasets/confidence.py --dataroot ./datasets/tennis
```
- Register the video and save the computed homographies in my_video/homographies.txt. See here for details.
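
For readers unfamiliar with the registration step, a homography is a 3x3 matrix that maps a pixel in one frame to a pixel in another, followed by a perspective divide. The sketch below illustrates only that mapping. The layout of homographies.txt is not described here, so the sketch does not attempt to parse it, and `apply_homography` is a hypothetical helper name.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H (nested lists, row-major)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Perspective divide converts homogeneous coordinates back to pixels.
    return xh / w, yh / w


if __name__ == "__main__":
    # A pure translation by (+5, -2) pixels, as a simple example of camera motion.
    shift = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
    print(apply_homography(shift, 10, 10))  # (15.0, 8.0)
```

Camera motion that cannot be described by a single matrix of this kind per frame (for example, strong parallax) falls outside what the registration step can represent.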

Note: Videos that are suitable for our method have the following attributes:

- Static camera or limited camera motion that can be represented with a homography.
- Limited number of omnimatte layers, due to GPU memory limitations. We tested up to 6 layers.
- Objects that move relative to the background (static objects will be absorbed into the background layer).
- Limited length. We tested a video length of up to 200 frames (~7 seconds).
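
Before training on a custom video, the directory layout and the tested limits above can be checked with a short script. This is a sketch under stated assumptions, not part of the repository: `check_video_dir` is a hypothetical helper, frames are assumed to be the .png files in rgb/, and each subdirectory of mask/ is assumed to correspond to one layer.

```python
from pathlib import Path

MAX_TESTED_LAYERS = 6    # "We tested up to 6 layers."
MAX_TESTED_FRAMES = 200  # "up to 200 frames (~7 seconds)"


def check_video_dir(root):
    """Return a list of human-readable warnings for a preprocessed video directory."""
    root = Path(root)
    warnings = []

    # Directories and files named in the preprocessing steps.
    for sub in ("rgb", "mask", "flow", "flow_backward"):
        if not (root / sub).is_dir():
            warnings.append(f"missing directory: {sub}/")
    if not (root / "homographies.txt").is_file():
        warnings.append("missing file: homographies.txt")

    # Assumption: one .png per frame in rgb/.
    rgb_dir = root / "rgb"
    n_frames = len(list(rgb_dir.glob("*.png"))) if rgb_dir.is_dir() else 0
    if n_frames > MAX_TESTED_FRAMES:
        warnings.append(
            f"{n_frames} frames exceeds the tested maximum of {MAX_TESTED_FRAMES}"
        )

    # Assumption: one layer per mask subdirectory (mask/01/, mask/02/, ...).
    mask_dir = root / "mask"
    n_objects = (
        sum(1 for p in mask_dir.iterdir() if p.is_dir()) if mask_dir.is_dir() else 0
    )
    if n_objects > MAX_TESTED_LAYERS:
        warnings.append(
            f"{n_objects} mask directories exceeds the tested maximum of {MAX_TESTED_LAYERS}"
        )
    return warnings


if __name__ == "__main__":
    for message in check_video_dir("./datasets/my_video"):
        print("warning:", message)
```

The limits are only what was tested, so a warning here means "untested territory" rather than a guaranteed failure.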

Citation

If you use this code for your research, please cite the following paper:

@inproceedings{lu2021,
  title={Omnimatte: Associating Objects and Their Effects in Video},
  author={Lu, Erika and Cole, Forrester and Dekel, Tali and Zisserman, Andrew and Freeman, William T and Rubinstein, Michael},
  booktitle={CVPR},
  year={2021}
}

Acknowledgments

This code is based on retiming and pytorch-CycleGAN-and-pix2pix.

Owner

Erika Lu
