Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Overview

Self-Supervised Multi-Frame Monocular Scene Flow

3D visualization of estimated depth and scene flow (overlayed with input image) from temporally consecutive images.
Trained on KITTI in a self-supervised manner, and tested on DAVIS.

This repository is the official PyTorch implementation of the paper:

   Self-Supervised Multi-Frame Monocular Scene Flow
   Junhwa Hur and Stefan Roth
   CVPR, 2021
   Arxiv

  • Contact: junhwa.hur[at]gmail.com

Installation

The code has been tested with Anaconda (Python 3.8), PyTorch 1.8.1 and CUDA 10.1 (Different Pytorch + CUDA version is also compatible).
Please run the provided conda environment setup file:

conda env create -f environment.yml
conda activate multi-mono-sf

(Optional) Using the CUDA implementation of the correlation layer accelerates training (~50% faster):

./install_correlation.sh

After installing it, turn on this flag --correlation_cuda_enabled=True in training/evaluation script files.

Dataset

Please download the following to datasets for the experiment:

To save space, we convert the KITTI Raw png images to jpeg, following the convention from MonoDepth:

find (data_folder)/ -name '*.png' | parallel 'convert {.}.png {.}.jpg && rm {}'

We also converted images in KITTI Scene Flow 2015 as well. Please convert the png images in image_2 and image_3 into jpg and save them into the seperate folder image_2_jpg and image_3_jpg.
To save space further, you can delete the velodyne point data in KITTI raw data as we don't need it.

Training and Inference

The scripts folder contains training/inference scripts.

For self-supervised training, you can simply run the following script files:

Script Training Dataset
./train_selfsup.sh Self-supervised KITTI Split

Fine-tuning is done with two stages: (i) first finding the stopping point using train/valid split, and then (ii) fune-tuning using all data with the found iteration steps.

Script Training Dataset
./ft_1st_stage.sh Semi-supervised finetuning KITTI raw + KITTI 2015
./ft_2nd_stage.sh Semi-supervised finetuning KITTI raw + KITTI 2015

In the script files, please configure these following PATHs for experiments:

  • DATA_HOME : the directory where the training or test is located in your local system.
  • EXPERIMENTS_HOME : your own experiment directory where checkpoints and log files will be saved.

To test pretrained models, you can simply run the following script files:

Script Training Dataset
./eval_selfsup_train.sh self-supervised KITTI 2015 Train
./eval_ft_test.sh fine-tuned KITTI 2015 Test
./eval_davis.sh self-supervised DAVIS (one scene)
./eval_davis_all.sh self-supervised DAVIS (all scenes)
  • To save visuailization of outputs, please turn on --save_vis=True in the script.
  • To save output images for KITTI Scene Flow 2015 Benchmark submission, please turn on --save_out=True in the script.

Pretrained Models

The checkpoints folder contains the checkpoints of the pretrained models.

Acknowledgement

Please cite our paper if you use our source code.

@inproceedings{Hur:2021:SSM,  
  Author = {Junhwa Hur and Stefan Roth},  
  Booktitle = {CVPR},  
  Title = {Self-Supervised Multi-Frame Monocular Scene Flow},  
  Year = {2021}  
}
  • Portions of the source code (e.g., training pipeline, runtime, argument parser, and logger) are from Jochen Gast
Owner
Visual Inference Lab @TU Darmstadt
Visual Inference Lab @TU Darmstadt
Latte: Cross-framework Python Package for Evaluation of Latent-based Generative Models

Cross-framework Python Package for Evaluation of Latent-based Generative Models Latte Latte (for LATent Tensor Evaluation) is a cross-framework Python

Karn Watcharasupat 30 Sep 08, 2022
[CVPR 2020] Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

Contents Local and Global GAN Cross-View Image Translation Semantic Image Synthesis Acknowledgments Related Projects Citation Contributions Collaborat

Hao Tang 131 Dec 07, 2022
A python library for highly configurable transformers - easing model architecture search and experimentation.

A python library for highly configurable transformers - easing model architecture search and experimentation.

Anthony Fuller 51 Nov 20, 2022
[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

[CVPR 2021] Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Fudan Zhang Vision Group 897 Jan 05, 2023
A PyTorch implementation of "SimGNN: A Neural Network Approach to Fast Graph Similarity Computation" (WSDM 2019).

SimGNN ⠀⠀⠀ A PyTorch implementation of SimGNN: A Neural Network Approach to Fast Graph Similarity Computation (WSDM 2019). Abstract Graph similarity s

Benedek Rozemberczki 534 Dec 25, 2022
基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

基于Flask开发后端、VUE开发前端框架,在WEB端部署YOLOv5目标检测模型

37 Jan 01, 2023
State of the art Semantic Sentence Embeddings

Contrastive Tension State of the art Semantic Sentence Embeddings Published Paper · Huggingface Models · Report Bug Overview This is the official code

Fredrik Carlsson 88 Dec 30, 2022
Pretrained Pytorch face detection (MTCNN) and recognition (InceptionResnet) models

Face Recognition Using Pytorch Python 3.7 3.6 3.5 Status This is a repository for Inception Resnet (V1) models in pytorch, pretrained on VGGFace2 and

Tim Esler 3.3k Jan 04, 2023
UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss

UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss This repository contains the TensorFlow implementation of the paper UnF

Simon Meister 270 Nov 06, 2022
Official implementation of the paper 'Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution'

DASR Paper Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution Jie Liang, Hui Zeng, and Lei Zhang. In arxiv preprint. Abs

81 Dec 28, 2022
Official repository for: Continuous Control With Ensemble DeepDeterministic Policy Gradients

Continuous Control With Ensemble Deep Deterministic Policy Gradients This repository is the official implementation of Continuous Control With Ensembl

4 Dec 06, 2021
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

15 Dec 27, 2022
PyTorch implementation of PSPNet segmentation network

pspnet-pytorch PyTorch implementation of PSPNet segmentation network Original paper Pyramid Scene Parsing Network Details This is a slightly different

Roman Trusov 532 Dec 29, 2022
A basic implementation of Layer-wise Relevance Propagation (LRP) in PyTorch.

Layer-wise Relevance Propagation (LRP) in PyTorch Basic unsupervised implementation of Layer-wise Relevance Propagation (Bach et al., Montavon et al.)

Kai Fabi 28 Dec 26, 2022
Pixel-level Crack Detection From Images Of Levee Systems : A Comparative Study

PIXEL-LEVEL CRACK DETECTION FROM IMAGES OF LEVEE SYSTEMS : A COMPARATIVE STUDY G

Manisha Panta 2 Jul 23, 2022
DeepFaceLab fork which provides IPython Notebook to use DFL with Google Colab

DFL-Colab — DeepFaceLab fork for Google Colab This project provides you IPython Notebook to use DeepFaceLab with Google Colaboratory. You can create y

779 Jan 05, 2023
Seq2seq - Sequence to Sequence Learning with Keras

Seq2seq Sequence to Sequence Learning with Keras Hi! You have just found Seq2Seq. Seq2Seq is a sequence to sequence learning add-on for the python dee

Fariz Rahman 3.1k Dec 18, 2022
Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks.

Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks. Generally, we intergrete different kind of functional

28 Jan 08, 2023
RETRO-pytorch - Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch

RETRO - Pytorch (wip) Implementation of RETRO, Deepmind's Retrieval based Attent

Phil Wang 556 Jan 04, 2023
Official PyTorch implementation of BlobGAN: Spatially Disentangled Scene Representations

BlobGAN: Spatially Disentangled Scene Representations Official PyTorch Implementation Paper | Project Page | Video | Interactive Demo BlobGAN.mp4 This

148 Dec 29, 2022