MVS2D: Efficient Multi-view Stereo via Attention-Driven 2D Convolutions

Related tags

Deep LearningMVS2D
Overview

MVS2D: Efficient Multi-view Stereo via Attention-Driven 2D Convolutions

Project Page | Paper


drawing

If you find our work useful for your research, please consider citing our paper:

@article{DBLP:journals/corr/abs-2104-13325,
  author    = {Zhenpei Yang and
               Zhile Ren and
               Qi Shan and
               Qixing Huang},
  title     = {{MVS2D:} Efficient Multi-view Stereo via Attention-Driven 2D Convolutions},
  journal   = {CoRR},
  volume    = {abs/2104.13325},
  year      = {2021},
  url       = {https://arxiv.org/abs/2104.13325},
  eprinttype = {arXiv},
  eprint    = {2104.13325},
  timestamp = {Tue, 04 May 2021 15:12:43 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-2104-13325.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

✏️ Changelog

Nov 27 2021

  • Initial release. Note that our released code achieve improved results than those reported in the initial arxiv pre-print. In addition, we include the evaluation on DTU dataset. We will update our paper soon.

⚙️ Installation

Click to expand

The code is tested with CUDA10.1. Please use following commands to install dependencies:

conda create --name mvs2d python=3.7
conda activate mvs2d

pip install -r requirements.txt

The folder structure should looks like the following if you have downloaded all data and pretrained models. Download links are inside each dataset tab at the end of this README.

.
├── configs
├── datasets
├── demo
├── networks
├── scripts
├── pretrained_model
│   ├── demon
│   ├── dtu
│   └── scannet
├── data
│   ├── DeMoN
│   ├── DTU_hr
│   ├── SampleSet
│   ├── ScanNet
│   └── ScanNet_3_frame_jitter_pose.npy
├── splits
│   ├── DeMoN_samples_test_2_frame.npy
│   ├── DeMoN_samples_train_2_frame.npy
│   ├── ScanNet_3_frame_test.npy
│   ├── ScanNet_3_frame_train.npy
│   └── ScanNet_3_frame_val.npy

🎬 Demo

Click to expand

After downloading the pretrained models for ScanNet, try to run following command to make a prediction on a sample data.

python demo.py --cfg configs/scannet/release.conf

The results are saved as demo.png

Training & Testing

We use 4 Nvidia V100 GPU for training. You may need to modify 'CUDA_VISIBLE_DEVICES' and batch size to accomodate your GPU resources.

ScanNet

Click to expand

Download

data 🔗 split 🔗 pretrained models 🔗 noisy pose 🔗

Training

First download and extract ScanNet training data and split. Then run following command to train our model.

bash scripts/scannet/train.sh

To train the multi-scale attention model, add --robust 1 to the training command in scripts/scannet/train.sh.

To train our model with noisy input pose, add --perturb_pose 1 to the training command in scripts/scannet/train.sh.

Testing

First download and extract data, split and pretrained models.

Then run:

bash scripts/scannet/test.sh

You should get something like these:

abs_rel sq_rel log10 rmse rmse_log a1 a2 a3 abs_diff abs_diff_median thre1 thre3 thre5
0.059 0.016 0.026 0.157 0.084 0.964 0.995 0.999 0.108 0.079 0.856 0.974 0.996

SUN3D/RGBD/Scenes11

Click to expand

Download

data 🔗 split 🔗 pretrained models 🔗

Training

First download and extract DeMoN training data and split. Then run following command to train our model.

bash scripts/demon/train.sh

Testing

First download and extract data, split and pretrained models.

Then run:

bash scripts/demon/test.sh

You should get something like these:

dataset rgbd: 160

abs_rel sq_rel log10 rmse rmse_log a1 a2 a3 abs_diff abs_diff_median thre1 thre3 thre5
0.082 0.165 0.047 0.440 0.147 0.921 0.939 0.948 0.325 0.284 0.753 0.894 0.933

dataset scenes11: 256

abs_rel sq_rel log10 rmse rmse_log a1 a2 a3 abs_diff abs_diff_median thre1 thre3 thre5
0.046 0.080 0.018 0.439 0.107 0.976 0.989 0.993 0.155 0.058 0.822 0.945 0.979

dataset sun3d: 160

abs_rel sq_rel log10 rmse rmse_log a1 a2 a3 abs_diff abs_diff_median thre1 thre3 thre5
0.099 0.055 0.044 0.304 0.137 0.893 0.970 0.993 0.224 0.171 0.649 0.890 0.969

-> Done!

depth

abs_rel sq_rel log10 rmse rmse_log a1 a2 a3 abs_diff abs_diff_median thre1 thre3 thre5
0.071 0.096 0.033 0.402 0.127 0.938 0.970 0.981 0.222 0.152 0.755 0.915 0.963

DTU

Click to expand

Download

data 🔗 eval data 🔗 pretrained models 🔗

Training

First download and extract DTU training data. Then run following command to train our model.

bash scripts/dtu/test.sh

Testing

First download and extract DTU eval data and pretrained models.

The following command performs three steps together: 1. Generate depth prediction on DTU test set. 2. Fuse depth predictions into final point cloud. 3. Evaluate predicted point cloud. Note that we re-implement the original Matlab Evaluation of DTU dataset using python.

bash scripts/dtu/test.sh

You should get something like these:

Acc 0.4051747996189477
Comp 0.2776021161518006
F-score 0.34138845788537414

Acknowledgement

The fusion code for DTU dataset is heavily built upon from PatchMatchNet

Owner
CS PhD student
piSTAR Lab is a modular platform built to make AI experimentation accessible and fun. (pistar.ai)

piSTAR Lab WARNING: This is an early release. Overview piSTAR Lab is a modular deep reinforcement learning platform built to make AI experimentation a

piSTAR Lab 0 Aug 01, 2022
[ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark Accepted as a spotlight paper at ICLR 2021. Table of content File structure Prerequi

72 Jan 03, 2023
This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

PFD:Pose-guided Feature Disentangling for Occluded Person Re-identification based on Transformer This repo is the official implementation of "Pose-gui

Tao Wang 93 Dec 18, 2022
Cryptocurrency Prediction with Artificial Intelligence (Deep Learning via LSTM Neural Networks)

Cryptocurrency Prediction with Artificial Intelligence (Deep Learning via LSTM Neural Networks)- Emirhan BULUT

Emirhan BULUT 102 Nov 18, 2022
To Design and Implement Logistic Regression to Classify Between Benign and Malignant Cancer Types

To Design and Implement Logistic Regression to Classify Between Benign and Malignant Cancer Types, from a Database Taken From Dr. Wolberg reports his Clinic Cases.

Astitva Veer Garg 1 Jul 31, 2022
The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop.

AICITY2021_Track2_DMT The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop. Introduction

Hao Luo 91 Dec 21, 2022
PyTorch code accompanying the paper "Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning" (NeurIPS 2021).

HIGL This is a PyTorch implementation for our paper: Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning (NeurIPS 2021). Our cod

Junsu Kim 20 Dec 14, 2022
Energy consumption estimation utilities for Jetson-based platforms

This repository contains a utility for measuring energy consumption when running various programs in NVIDIA Jetson-based platforms. Currently TX-2, NX, and AGX are supported.

OpenDR 10 Jun 17, 2022
pytorch implementation of Attention is all you need

A Pytorch Implementation of the Transformer: Attention Is All You Need Our implementation is largely based on Tensorflow implementation Requirements N

230 Dec 07, 2022
On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021))

PTvsBT On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation (Findings of EMNLP 2021) Citation Please cite a

Sunbow Liu 10 Nov 25, 2022
A web application that provides real time temperature and humidity readings of a house.

About A web application which provides real time temperature and humidity readings of a house. If you're interested in the data collected so far click

Ben Thompson 3 Jan 28, 2022
This is the implementation of the paper LiST: Lite Self-training Makes Efficient Few-shot Learners.

LiST (Lite Self-Training) This is the implementation of the paper LiST: Lite Self-training Makes Efficient Few-shot Learners. LiST is short for Lite S

Microsoft 28 Dec 07, 2022
Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting (ICCV, 2021)

DKPNet ICCV 2021 Variational Attention: Propagating Domain-Specific Knowledge for Multi-Domain Learning in Crowd Counting Baseline of DKPNet is availa

19 Oct 14, 2022
Build and run Docker containers leveraging NVIDIA GPUs

NVIDIA Container Toolkit Introduction The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includ

NVIDIA Corporation 15.6k Jan 01, 2023
Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022
4th place solution for the SIGIR 2021 challenge.

SIGIR-2021 (Tinkoff.AI) How to start Download train and test data: https://sigir-ecom.github.io/data-task.html Place it under sigir-2021/data/. Run py

Tinkoff.AI 4 Jul 01, 2022
MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

MEAL-V2 This is the official pytorch implementation of our paper: "MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tric

Zhiqiang Shen 653 Dec 19, 2022
This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation)

This repo provides code for QB-Norm (Cross Modal Retrieval with Querybank Normalisation) Usage example python dynamic_inverted_softmax.py --sims_train

36 Dec 29, 2022
Official code for the paper: Deep Graph Matching under Quadratic Constraint (CVPR 2021)

QC-DGM This is the official PyTorch implementation and models for our CVPR 2021 paper: Deep Graph Matching under Quadratic Constraint. It also contain

Quankai Gao 55 Nov 14, 2022
The source code for the Cutoff data augmentation approach proposed in this paper: "A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation".

Cutoff: A Simple Data Augmentation Approach for Natural Language This repository contains source code necessary to reproduce the results presented in

Dinghan Shen 49 Dec 22, 2022