(CVPR 2022 - oral) Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

Official implementation of the paper "Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry" (CVPR 2022, oral).

Gwangbin Bae, Ignas Budvytis, and Roberto Cipolla

[arXiv]

We present MaGNet (Monocular and Geometric Network), a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation. For each frame, MaGNet estimates a single-view depth probability distribution, parameterized as a pixel-wise Gaussian. The distribution estimated for the reference frame is then used to sample per-pixel depth candidates. Such probabilistic sampling enables the network to achieve higher accuracy while evaluating fewer depth candidates. We also propose depth consistency weighting for the multi-view matching score, to ensure that the multi-view depth is consistent with the single-view predictions. The proposed method achieves state-of-the-art performance on ScanNet, 7-Scenes and KITTI. Qualitative evaluation demonstrates that our method is more robust against challenging artifacts such as texture-less/reflective surfaces and moving objects.
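The probabilistic sampling idea can be illustrated for a single pixel. The sketch below is not the repository's code: it assumes that candidates are placed at the midpoints of equal-probability bins of the predicted Gaussian, which is one plausible way to draw a small number of candidates concentrated around the single-view estimate. The function name `sample_depth_candidates` and its parameters are hypothetical.

```python
from statistics import NormalDist


def sample_depth_candidates(mu, sigma, num_candidates=5):
    """Return depth candidates for one pixel with predicted depth N(mu, sigma^2).

    Assumption (not taken from the paper's code): candidates sit at the
    midpoints of `num_candidates` equal-probability bins of the Gaussian.
    """
    dist = NormalDist(mu, sigma)
    # Quantile levels 0.5/N, 1.5/N, ..., (N - 0.5)/N
    quantiles = [(k + 0.5) / num_candidates for k in range(num_candidates)]
    return [dist.inv_cdf(q) for q in quantiles]


# A confident prediction (small sigma) yields tightly clustered candidates,
# an uncertain one (large sigma) spreads them over a wider depth range.
confident = sample_depth_candidates(mu=2.0, sigma=0.05)
uncertain = sample_depth_candidates(mu=2.0, sigma=0.50)
print(confident)
print(uncertain)
```

Because the candidate range adapts to the per-pixel uncertainty, fewer candidates are needed than with a fixed, uniformly spaced depth sweep.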

Datasets

We evaluated MaGNet on ScanNet, 7-Scenes and KITTI.

ScanNet

  • To download ScanNet, you need to submit an agreement to its Terms of Use. Please follow the instructions in this link.
  • The folder should be organized as:

/path/to/ScanNet
/path/to/ScanNet/scans
/path/to/ScanNet/scans/scene0000_00 ...
/path/to/ScanNet/scans_test
/path/to/ScanNet/scans_test/scene0707_00 ...

7-Scenes

  • Download all seven scenes (Chess, Fire, Heads, Office, Pumpkin, RedKitchen, Stairs) from this link.
  • The folder should be organized as:

/path/to/SevenScenes
/path/to/SevenScenes/chess ...

KITTI

  • Download raw data from this link.
  • Download depth maps from this link.
  • The folder should be organized as:

/path/to/KITTI
/path/to/KITTI/rawdata
/path/to/KITTI/rawdata/2011_09_26 ...
/path/to/KITTI/train
/path/to/KITTI/train/2011_09_26_drive_0001_sync ...
/path/to/KITTI/val
/path/to/KITTI/val/2011_09_26_drive_0002_sync ...
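Before running the test scripts, it can help to confirm that the dataset folders match the layouts listed above. The following is a minimal sketch, not part of the repository: the helper `missing_subdirs` and the `EXPECTED_SUBDIRS` table are hypothetical, and only the top-level folders shown in this README are checked.

```python
from pathlib import Path

# Top-level folders listed in this README (anything elided with "..." is not checked).
EXPECTED_SUBDIRS = {
    "ScanNet": ["scans", "scans_test"],
    "SevenScenes": ["chess"],
    "KITTI": ["rawdata", "train", "val"],
}


def missing_subdirs(root, dataset):
    """Return the expected sub-folders of `dataset` that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_SUBDIRS[dataset] if not (root / name).is_dir()]


if __name__ == "__main__":
    # Replace the placeholder paths with your own dataset locations.
    for dataset, root in [
        ("ScanNet", "/path/to/ScanNet"),
        ("SevenScenes", "/path/to/SevenScenes"),
        ("KITTI", "/path/to/KITTI"),
    ]:
        print(dataset, "missing:", missing_subdirs(root, dataset))
```

An empty list means every expected folder was found.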

Download model weights

Download the model weights by running:

python ckpts/download.py

If some files are not downloaded properly, download them manually from this link and place the files under ./ckpts.

Install dependencies

We recommend using a virtual environment.

python3.6 -m venv --system-site-packages ./venv
source ./venv/bin/activate

Install the necessary dependencies by running:

python3.6 -m pip install -r requirements.txt

Test scripts

To evaluate the accuracy of our D-Net (single-view), run:

python test_DNet.py ./test_scripts/dnet/scannet.txt
python test_DNet.py ./test_scripts/dnet/7scenes.txt
python test_DNet.py ./test_scripts/dnet/kitti_eigen.txt
python test_DNet.py ./test_scripts/dnet/kitti_official.txt

You should get the following results:

Dataset           abs_rel   abs_diff  sq_rel    rmse      rmse_log  irmse     log_10    silog     a1        a2        a3        NLL
ScanNet           0.1186    0.2070    0.0493    0.2708    0.1461    0.1086    0.0515    10.0098   0.8546    0.9703    0.9928    2.2352
7-Scenes          0.1339    0.2209    0.0549    0.2932    0.1677    0.1165    0.0566    12.8807   0.8308    0.9716    0.9948    2.7941
KITTI (eigen)     0.0605    1.1331    0.2086    2.4215    0.0921    0.0075    0.0261    8.4312    0.9602    0.9946    0.9989    2.6443
KITTI (official)  0.0629    1.1682    0.2541    2.4708    0.1021    0.0080    0.0270    9.5752    0.9581    0.9905    0.9971    1.7810
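For readers unfamiliar with the column names, the sketch below computes several of them using the definitions commonly found in the depth-estimation literature. It is an illustration only: the repository's evaluation code may differ in details such as valid-pixel masking or depth caps, and the function name `depth_metrics` is hypothetical. The remaining columns (irmse, log_10, silog, NLL) are not covered here.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth metrics from two equal-length lists of positive depths."""
    n = len(gt)
    pairs = list(zip(pred, gt))
    ratios = [max(p / g, g / p) for p, g in pairs]
    return {
        # Mean absolute error relative to the ground-truth depth
        "abs_rel": sum(abs(p - g) / g for p, g in pairs) / n,
        # Mean absolute error in depth units
        "abs_diff": sum(abs(p - g) for p, g in pairs) / n,
        # Mean squared error relative to the ground-truth depth
        "sq_rel": sum((p - g) ** 2 / g for p, g in pairs) / n,
        # Root mean squared error, in depth and in log-depth
        "rmse": math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n),
        "rmse_log": math.sqrt(sum((math.log(p) - math.log(g)) ** 2 for p, g in pairs) / n),
        # Fraction of pixels whose ratio error is below 1.25, 1.25^2, 1.25^3
        "a1": sum(r < 1.25 for r in ratios) / n,
        "a2": sum(r < 1.25 ** 2 for r in ratios) / n,
        "a3": sum(r < 1.25 ** 3 for r in ratios) / n,
    }


print(depth_metrics(pred=[1.1, 2.0, 2.7], gt=[1.0, 2.0, 3.0]))
```

Lower is better for the error metrics (abs_rel through rmse_log), while higher is better for the accuracy metrics a1, a2 and a3.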

To evaluate the accuracy of the full pipeline (multi-view), run:

python test_MaGNet.py ./test_scripts/magnet/scannet.txt
python test_MaGNet.py ./test_scripts/magnet/7scenes.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_eigen.txt
python test_MaGNet.py ./test_scripts/magnet/kitti_official.txt

You should get the following results:

Dataset           abs_rel   abs_diff  sq_rel    rmse      rmse_log  irmse     log_10    silog     a1        a2        a3        NLL
ScanNet           0.0810    0.1466    0.0302    0.2098    0.1101    0.1055    0.0351    8.7686    0.9298    0.9835    0.9946    0.1454
7-Scenes          0.1257    0.2133    0.0552    0.2957    0.1639    0.1782    0.0527    13.6210   0.8552    0.9715    0.9935    1.5605
KITTI (eigen)     0.0535    0.9995    0.1623    2.1584    0.0826    0.0566    0.0235    7.4645    0.9714    0.9958    0.9990    1.8053
KITTI (official)  0.0503    0.9135    0.1667    1.9707    0.0848    0.2423    0.0219    7.9451    0.9769    0.9941    0.9979    1.4750
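Since the network output is described as a pixel-wise Gaussian, the NLL column presumably reports the negative log-likelihood of the ground-truth depth under the predicted distribution. That reading is an assumption, as is the averaging over pixels; the sketch below only shows the standard Gaussian NLL for one pixel, with a hypothetical function name.

```python
import math


def gaussian_nll(mu, sigma, gt):
    """Negative log-likelihood of depth `gt` under N(mu, sigma^2), for one pixel."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (gt - mu) ** 2 / (2 * sigma ** 2)


# An accurate and confident prediction gives a low NLL;
# a confident but wrong prediction is penalized heavily.
print(gaussian_nll(mu=2.0, sigma=0.1, gt=2.0))
print(gaussian_nll(mu=2.0, sigma=0.1, gt=2.5))
```

Unlike the other columns, this quantity depends on the predicted uncertainty as well as the predicted depth, so it can be negative when the prediction is both accurate and confident.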

Training scripts

Coming soon

Citation

If you find our work useful in your research, please consider citing our paper:

@InProceedings{Bae2022,
  title = {Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry},
  author = {Gwangbin Bae and Ignas Budvytis and Roberto Cipolla},
  booktitle = {Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}
}