Dense Unsupervised Learning for Video Segmentation (NeurIPS*2021)

Overview

Dense Unsupervised Learning for Video Segmentation

License Framework

This repository contains the official implementation of our paper:

Dense Unsupervised Learning for Video Segmentation
Nikita Araslanov, Simone Schaub-Mayer and Stefan Roth
To appear at NeurIPS*2021. [paper] [supp] [talk] [example results] [arXiv]

drawing

We efficiently learn spatio-temporal correspondences
without any supervision, and achieve state-of-the-art
accuracy of video object segmentation.

Contact: Nikita Araslanov fname.lname (at) visinf.tu-darmstadt.de


Installation

Requirements. To reproduce our results, we recommend Python >=3.6, PyTorch >=1.4, CUDA >=10.0. At least one Titan X GPUs (12GB) or equivalent is required. The code was primarily developed under PyTorch 1.8 on a single A100 GPU.

The following steps will set up a local copy of the repository.

  1. Create conda environment:
conda create --name dense-ulearn-vos
source activate dense-ulearn-vos
  1. Install PyTorch >=1.4 (see PyTorch instructions). For example on Linux, run:
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  1. Install the dependencies:
pip install -r requirements.txt
  1. Download the data:
Dataset Website Target directory with video sequences
YouTube-VOS Link data/ytvos/train/JPEGImages/
OxUvA Link data/OxUvA/images/dev/
TrackingNet Link data/tracking/train/jpegs/
Kinetics-400 Link data/kinetics400/video_jpeg/train/

The last column in this table specifies a path to subdirectories (relative to the project root) containing images of video frames. You can obviously use a different path structure. In this case, you will need to adjust the paths in data/filelists/ for every dataset accordingly.

  1. Download filelists:
cd data/filelists
bash download.sh

This will download lists of training and validation paths for all datasets.

Training

We following bash script will train a ResNet-18 model from scratch on one of the four supported datasets (see above):

bash ./launch/train.sh [ytvos|oxuva|track|kinetics]

We also provide our final models for download.

Dataset Mean J&F (DAVIS-2017) Link MD5
OxUvA 65.3 oxuva_e430_res4.pth (132M) af541[...]d09b3
YouTube-VOS 69.3 ytvos_e060_res4.pth (132M) c3ae3[...]55faf
TrackingNet 69.4 trackingnet_e088_res4.pth (88M) 3e7e9[...]95fa9
Kinetics-400 68.7 kinetics_e026_res4.pth (88M) 086db[...]a7d98

Inference and evaluation

Inference

To run the inference use launch/infer_vos.sh:

bash ./launch/infer_vos.sh [davis|ytvos]

The first argument selects the validation dataset to use (davis for DAVIS-2017; ytvos for YouTube-VOS). The bash variables declared in the script further help to set up the paths for reading the data and the pre-trained models as well as the output directory:

  • EXP, RUN_ID and SNAPSHOT determine the pre-trained model to load.
  • VER specifies a suffix for the output directory (in case you would like to experiment with different configurations for label propagation). Please, refer to launch/infer_vos.sh for their usage.

The inference script will create two directories with the result: [res3|res4|key]_vos and [res3|res4|key]_vis, where the prefix corresponds to the codename of the output CNN layer used in the evaluation (selected in infer_vos.sh using KEY variable). The vos-directory contains the segmentation result ready for evaluation; the vis-directory produces the results for visualisation purposes. You can optionally disable generating the visualisation by setting VERBOSE=False in infer_vos.py.

Evaluation: DAVIS-2017

Please use the official evaluation package. Install the repository, then simply run:

python evaluation_method.py --task semi-supervised --davis_path data/davis2017 --results_path <path-to-vos-directory>

Evaluation: YouTube-VOS 2018

Please use the official CodaLab evaluation server. To create the submission, rename the vos-directory to Annotations and compress it to Annotations.zip for uploading.

Acknowledgements

We thank PyTorch contributors and Allan Jabri for releasing their implementation of the label propagation.

Citation

We hope you find our work useful. If you would like to acknowledge it in your project, please use the following citation:

@inproceedings{Araslanov:2021:DUL,
  author    = {Araslanov, Nikita and Simone Schaub-Mayer and Roth, Stefan},
  title     = {Dense Unsupervised Learning for Video Segmentation},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  volume    = {34},
  year = {2021}
}
Owner
Visual Inference Lab @TU Darmstadt
Visual Inference Lab @TU Darmstadt
Build a small, 3 domain internet using Github pages and Wikipedia and construct a crawler to crawl, render, and index.

TechSEO Crawler Build a small, 3 domain internet using Github pages and Wikipedia and construct a crawler to crawl, render, and index. Play with the r

JR Oakes 57 Nov 24, 2022
Ağ tarayıcı.Gönderdiği paketler ile ağa bağlı olan cihazların IP adreslerini gösterir.

NetScanner.py Ağ tarayıcı.Gönderdiği paketler ile ağa bağlı olan cihazların IP adreslerini gösterir. Linux'da Kullanımı: git clone https://github.com/

4 Aug 23, 2021
Code for weakly supervised segmentation of a single class

SingleClassRL Implementation of weak single object segmentation from paper "Regularized Loss for Weakly Supervised Single Class Semantic Segmentation"

16 Nov 14, 2022
Differentiable Factor Graph Optimization for Learning Smoothers @ IROS 2021

Differentiable Factor Graph Optimization for Learning Smoothers Overview Status Setup Datasets Training Evaluation Acknowledgements Overview Code rele

Brent Yi 60 Nov 14, 2022
Invariant Causal Prediction for Block MDPs

MISA Abstract Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challeng

Meta Research 41 Sep 17, 2022
Creative Applications of Deep Learning w/ Tensorflow

Creative Applications of Deep Learning w/ Tensorflow This repository contains lecture transcripts and homework assignments as Jupyter Notebooks for th

Parag K Mital 1.5k Dec 30, 2022
Code for the paper: Hierarchical Reinforcement Learning With Timed Subgoals, published at NeurIPS 2021

Hierarchical reinforcement learning with Timed Subgoals (HiTS) This repository contains code for reproducing experiments from our paper "Hierarchical

Autonomous Learning Group 21 Dec 03, 2022
Focal Loss for Dense Rotation Object Detection

Convert ResNets weights from GluonCV to Tensorflow Abstract GluonCV released some new resnet pre-training weights and designed some new resnets (such

17 Nov 24, 2021
Official code for "End-to-End Optimization of Scene Layout" -- including VAE, Diff Render, SPADE for colorization (CVPR 2020 Oral)

End-to-End Optimization of Scene Layout Code release for: End-to-End Optimization of Scene Layout CVPR 2020 (Oral) Project site, Bibtex For help conta

Andrew Luo 41 Dec 09, 2022
Convert dog pictures into various painting styles. Try LimnPet

LimnPet Cartoon stylization service project Try our service » Home page · Team notion · Members 목차 프로젝트 소개 프로젝트 목표 사용한 기술스택과 수행도구 팀원 구현 기능 주요 기능 추가 기능

LiJell 7 Jul 14, 2022
Look Who’s Talking: Active Speaker Detection in the Wild

Look Who's Talking: Active Speaker Detection in the Wild Dependencies pip install -r requirements.txt In addition to the Python dependencies, ffmpeg

Clova AI Research 60 Dec 08, 2022
NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM

NLG evaluation via Statistical Measures of Similarity: BaryScore, DepthScore, InfoLM Automatic Evaluation Metric described in the papers BaryScore (EM

Pierre Colombo 28 Dec 28, 2022
MG-GCN: Scalable Multi-GPU GCN Training Framework

MG-GCN MG-GCN: multi-GPU GCN training framework. For more information, please read our paper. After cloning our repository, run git submodule update -

Translational Data Analytics (TDA) Lab @GaTech 6 Oct 24, 2022
Prompt-BERT: Prompt makes BERT Better at Sentence Embeddings

Prompt-BERT: Prompt makes BERT Better at Sentence Embeddings Results on STS Tasks Model STS12 STS13 STS14 STS15 STS16 STSb SICK-R Avg. unsup-prompt-be

196 Jan 08, 2023
PyTorch Lightning implementation of Automatic Speech Recognition

lasr Lightening Automatic Speech Recognition An MIT License ASR research library, built on PyTorch-Lightning, for developing end-to-end ASR models. In

Soohwan Kim 40 Sep 19, 2022
A framework for Quantification written in Python

QuaPy QuaPy is an open source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. QuaPy

41 Dec 14, 2022
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)

Yonglong Tian 2.2k Jan 08, 2023
A project which aims to protect your privacy using inexpensive hardware and easily modifiable software

Protecting your privacy using an ESP32, an IR sensor and a python script This project, which I personally call the "never-gonna-catch-me-in-the-act-ev

8 Oct 10, 2022
Source code for our CVPR 2019 paper - PPGNet: Learning Point-Pair Graph for Line Segment Detection

PPGNet: Learning Point-Pair Graph for Line Segment Detection PyTorch implementation of our CVPR 2019 paper: PPGNet: Learning Point-Pair Graph for Line

SVIP Lab 170 Oct 25, 2022
Use CLIP to represent video for Retrieval Task

A Straightforward Framework For Video Retrieval Using CLIP This repository contains the basic code for feature extraction and replication of results.

Jesus Andres Portillo Quintero 54 Dec 22, 2022