Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Last update: Jan 02, 2023

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

This repository contains an implementation of our CVPR2021 publication:

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging. S. Mahdi H. Miangoleh, Sebastian Dille, Long Mai, Sylvain Paris, Yağız Aksoy. Main pdf, Supplementary pdf, Project Page.

Change log:

Google Colaboratory notebook is now available [June 2021]
Merge net training dataset generation instructions is now available. [June 2021]
bug fix [June 2021]

Setup

We Provided the implementation of our method using MiDas-v2 and SGRnet as the base.

Environments

Our mergenet model is trained using torch 0.4.1 and python 3.6 and is tested with torch<=1.8.

Download our mergenet model weights from here and put it in

.\pix2pix\checkpoints\mergemodel\latest_net_G.pth

To use MiDas-v2 as base: Install dependancies as following:

conda install pytorch torchvision opencv cudatoolkit=10.2 -c pytorch
conda install matplotlib
conda install scipy
conda install scikit-image

Download the model weights from MiDas-v2 and put it in

./midas/model.pt

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 0

To use SGRnet as base: Install dependancies as following:

conda install pytorch=0.4.1 cuda92 -c pytorch
conda install torchvision
conda install matplotlib
conda install scikit-image
pip install opencv-python

Follow the official SGRnet repository to compile the syncbn module in ./structuredrl/models/syncbn. Download the model weights from SGRnet and put it in

./structuredrl/model.pth.tar

activate the environment
python run.py --Final --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet 1

Different input arguments can be used to generate R0 and R20 results as discussed in the paper.

python run.py --R0 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]
python run.py --R20 --data_dir PATH_TO_INPUT --output_dir PATH_TO_RESULT --depthNet #[0or1]

Evaluation

Fill in the needed variables in the following matlab file and run:

./evaluation/evaluatedataset.m

estimation_path : path to estimated disparity maps
gt_depth_path : path to gt depth/disparity maps
dataset_disp_gttype : (true) if ground truth data is disparity and (false) if gt depth data is depth.
evaluation_matfile_save_dir : directory to save the evalution results as .mat file.
superpixel_scale : scale parameter to run the superpixels on scaled version of the ground truth images to accelarate the evaluation. use 1 for small gt images.

Training

Navigate to dataset preparation instructions to download and prepare the training dataset.

python ./pix2pix/train.py --dataroot DATASETDIR --name mergemodeltrain --model pix2pix4depth --no_flip --no_dropout

python ./pix2pix/test.py --dataroot DATASETDIR --name mergemodeleval --model pix2pix4depth --no_flip --no_dropout

Citation

This implementation is provided for academic use only. Please cite our paper if you use this code or any of the models.

@INPROCEEDINGS{Miangoleh2021Boosting,
author={S. Mahdi H. Miangoleh and Sebastian Dille and Long Mai and Sylvain Paris and Ya\u{g}{\i}z Aksoy},
title={Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging},
journal={Proc. CVPR},
year={2021},
}

Credits

The "Merge model" code skeleton (./pix2pix folder) was adapted from the pytorch-CycleGAN-and-pix2pix repository.

For MiDaS and SGR inferences we used the scripts and models from MiDas-v2 and SGRnet respectively (./midas and ./structuredrl folders).

Thanks to k-washi for providing us with a Google Colaboratory notebook implementation.

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Related tags

Overview

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

Change log:

Setup

Environments

Evaluation

Training

Citation

Credits

Owner

Computational Photography Lab @ SFU

Official code for: A Probabilistic Hard Attention Model For Sequentially Observed Scenes

Deep Learning Training Scripts With Python

S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)

Subpopulation detection in high-dimensional single-cell data

基于DouZero定制AI实战欢乐斗地主

CarND-LaneLines-P1 - Lane Finding Project for Self-Driving Car ND

MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Global Pooling, More than Meets the Eye: Position Information is Encoded Channel-Wise in CNNs, ICCV 2021

Towards Understanding Quality Challenges of the Federated Learning: A First Look from the Lens of Robustness

Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Simple cross-platform application for DaVinci surgical video frame annotation

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

A Rao-Blackwellized Particle Filter for 6D Object Pose Tracking

Artstation-Artistic-face-HQ Dataset (AAHQ)

Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

ParaGen is a PyTorch deep learning framework for parallel sequence generation

Emblaze - Interactive Embedding Comparison

TimeSHAP explains Recurrent Neural Network predictions.

The repo of the preprinting paper "Labels Are Not Perfect: Inferring Spatial Uncertainty in Object Detection"

Making self-supervised learning work on molecules by using their 3D geometry to pre-train GNNs. Implemented in DGL and Pytorch Geometric.