Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals

Overview

LapDepth-release

PWC PWC

This repository is a Pytorch implementation of the paper "Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals"

Minsoo Song, Seokjae Lim, and Wonjun Kim*
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

Video presentation

Screenshot

Requirements

  • Python >= 3.7
  • Pytorch >= 1.6.0
  • Ubuntu 16.04
  • CUDA 9.2
  • cuDNN (if CUDA available)

some other packages: geffnet, path, IPython, blessings, progressbar

Pretrained models

You can download pre-trained model

  • Trained with KITTI

    • batch 16, SyncBatchNorm, data loss
    cap a1 a2 a3 Abs Rel Sq Rel RMSE RMSE log
    0-80m 0.965 0.995 0.999 0.059 0.201 2.397 0.090
    cap a1 a2 a3 Abs Rel Sq Rel RMSE RMSE log
    0-50m 0.970 0.996 0.999 0.057 0.155 1.788 0.085
  • Trained with KITTI

    • batch 16, GroupNorm, data loss + gradient loss
    cap a1 a2 a3 Abs Rel Sq Rel RMSE RMSE log
    0-80m 0.961 0.994 0.999 0.059 0.209 2.489 0.091
    cap a1 a2 a3 Abs Rel Sq Rel RMSE RMSE log
    0-50m 0.968 0.996 0.999 0.057 0.155 1.807 0.085
  • Trained with NYU Depth V2

    • batch 16, SyncBatchNorm, data loss
    cap a1 a2 a3 Abs Rel log10 RMSE RMSE log
    0-10m 0.895 0.983 0.996 0.105 0.045 0.384 0.135

Demo images (Single Test Image Prediction)

Make sure you download the pre-trained model and placed it in the './pretrained/' directory before running the demo.
Demo Command Line:

############### Example of argument usage #####################
## Running demo using a specified image (jpg or png)
python demo.py --model_dir ./pretrained/LDRN_KITTI_ResNext101_pretrained_data.pkl --img_dir ./your/file/path/filename --pretrained KITTI --cuda --gpu_num 0
python demo.py --model_dir ./pretrained/LDRN_NYU_ResNext101_pretrained_data.pkl --img_dir ./your/file/path/filename --pretrained NYU --cuda --gpu_num 0
# output image name => 'out_' + filename

## Running demo using a whole folder of images
python demo.py --model_dir ./pretrained/LDRN_KITTI_ResNext101_pretrained_data.pkl --img_folder_dir ./your/folder/path/folder_name --pretrained KITTI --cuda --gpu_num 0
# output folder name => 'out_' + folder_name

If you are using a model pre-trained from KITTI, insert '--pretrained KITTI' command
(in the case of NYU, '--pretrained NYU').
If you run the demo on GPU, insert '--cuda'.
'--gpu_num' argument is an index list of your available GPUs you want to use (e.g., 0,1,2,3).
ex) If you want to activate only the 3rd gpu out of 4 gpus, insert '--gpu_num 2'

Dataset Preparation

We referred to BTS in the data preparation process.

KITTI

1. Official ground truth

  • Download official KITTI ground truth on the link and make KITTI dataset directory.
    $ cd ./datasets
    $ mkdir KITTI && cd KITTI
    $ mv ~/Downloads/data_depth_annotated.zip ./datasets/KITTI
    $ unzip data_depth_annotated.zip

2. Raw dataset

  • Construct raw KITTI dataset using following commands.
    $ mv ./datasets/kitti_archives_to_download.txt ./datasets/KITTI
    $ cd ./datasets/KITTI
    $ aria2c -x 16 -i ./kitti_archives_to_download.txt
    $ parallel unzip ::: *.zip

3. Dense g.t dataset
We take an inpainting method from DenseDepth to get dense g.t for gradient loss.
(You can train our model using only data loss without gradient loss, then you don't need dense g.t)
Corresponding inpainted results from './datasets/KITTI/data_depth_annotated/2011_xx_xx_drive_xxxx_sync/proj_depth/groundtruth/image_02' are should be saved in './datasets/KITTI/data_depth_annotated/2011_xx_xx_drive_xxxx_sync/dense_gt/image_02'.
KITTI data structures are should be organized as below:

|-- datasets
  |-- KITTI
     |-- data_depth_annotated  
        |-- 2011_xx_xx_drive_xxxx_sync
           |-- proj_depth  
              |-- groundtruth            # official G.T folder
        |-- ... (all drives of all days in the raw KITTI)  
     |-- 2011_09_26                      # raw RGB data folder  
        |-- 2011_09_26_drive_xxxx_sync
     |-- 2011_09_29
     |-- ... (all days in the raw KITTI)  

NYU Depth V2

1. Training set
Make NYU dataset directory

    $ cd ./datasets
    $ mkdir NYU_Depth_V2 && cd NYU_Depth_V2
  • Constructing training data using following steps :
    • Download Raw NYU Depth V2 dataset (450GB) from this Link.
    • Extract the raw dataset into './datasets/NYU_Depth_V2'
      (It should make './datasets/NYU_Depth_V2/raw/....').
    • Run './datasets/sync_project_frames_multi_threads.m' to get synchronized data. (need Matlab)
      (It shoud make './datasets/NYU_Depth_V2/sync/....').
  • Or, you can directly download whole 'sync' folder from our Google drive Link into './datasets/NYU_Depth_V2/'

2. Testing set
Download official nyu_depth_v2_labeled.mat and extract image files from the mat file.

    $ cd ./datasets
    ## Download official labled NYU_Depth_V2 mat file
    $ wget http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
    ## Extract image files from the mat file
    $ python extract_official_train_test_set_from_mat.py nyu_depth_v2_labeled.mat splits.mat ./NYU_Depth_V2/official_splits/

Evaluation

Make sure you download the pre-trained model and placed it in the './pretrained/' directory before running the evaluation code.

  • Evaluation Command Line:
# Running evaluation using a pre-trained models
## KITTI
python eval.py --model_dir ./pretrained/LDRN_KITTI_ResNext101_pretrained_data.pkl --evaluate --batch_size 1 --dataset KITTI --data_path ./datasets/KITTI --gpu_num 0
## NYU Depth V2
python eval.py --model_dir ./pretrained/LDRN_NYU_ResNext101_pretrained_data.pkl --evaluate --batch_size 1 --dataset NYU --data_path --data_path ./datasets/NYU_Depth_V2/official_splits/test --gpu_num 0

### if you want to save image files from results, insert `--img_save` command
### if you have dense g.t files, insert `--img_save` with `--use_dense_depth` command

Training

LDRN (Laplacian Depth Residual Network) training

  • Training Command Line:
# KITTI 
python train.py --distributed --batch_size 16 --dataset KITTI --data_path ./datasets/KITTI --gpu_num 0,1,2,3
# NYU
python train.py --distributed --batch_size 16 --dataset NYU --data_path ./datasets/NYU_Depth_V2/sync --epochs 30 --gpu_num 0,1,2,3 
## if you want to train using gradient loss, insert `--use_dense_depth` command
## if you don't want distributed training, remove `--distributed` command

'--gpu_num' argument is an index list of your available GPUs you want to use (e.g., 0,1,2,3).
ex) If you want to activate only the 3rd gpu out of 4 gpus, insert '--gpu_num 2'

Reference

When using this code in your research, please cite the following paper:

M. Song, S. Lim and W. Kim, "Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals," in IEEE Transactions on Circuits and Systems for Video Technology, doi: 10.1109/TCSVT.2021.3049869.

@ARTICLE{9316778,
  author={M. {Song} and S. {Lim} and W. {Kim}},
  journal={IEEE Transactions on Circuits and Systems for Video Technology}, 
  title={Monocular Depth Estimation Using Laplacian Pyramid-Based Depth Residuals}, 
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TCSVT.2021.3049869}}
Owner
Minsoo Song
B.S. degree with the Department of Electrical and Electronics Engineering, Konkuk University (2014.03 ~)
Minsoo Song
The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation"

RegSeg The official implementation of "Rethink Dilated Convolution for Real-time Semantic Segmentation" Paper: arxiv D block Decoder Setup Install the

Roland 61 Dec 27, 2022
Trading Strategies for Freqtrade

Freqtrade Strategies Strategies for Freqtrade, developed primarily in a partnership between @werkkrew and @JimmyNixx from the Freqtrade Discord. Use t

Bryan Chain 242 Jan 07, 2023
Supervised Sliding Window Smoothing Loss Function Based on MS-TCN for Video Segmentation

SSWS-loss_function_based_on_MS-TCN Supervised Sliding Window Smoothing Loss Function Based on MS-TCN for Video Segmentation Supervised Sliding Window

3 Aug 03, 2022
The versatile ocean simulator, in pure Python, powered by JAX.

Veros is the versatile ocean simulator -- it aims to be a powerful tool that makes high-performance ocean modeling approachable and fun. Because Veros

TeamOcean 245 Dec 20, 2022
Project page of the paper 'Analyzing Perception-Distortion Tradeoff using Enhanced Perceptual Super-resolution Network' (ECCVW 2018)

EPSR (Enhanced Perceptual Super-resolution Network) paper This repo provides the test code, pretrained models, and results on benchmark datasets of ou

Subeesh Vasu 78 Nov 19, 2022
Code for the paper "Improved Techniques for Training GANs"

Status: Archive (code is provided as-is, no updates expected) improved-gan code for the paper "Improved Techniques for Training GANs" MNIST, SVHN, CIF

OpenAI 2.2k Jan 01, 2023
Generalized Proximal Policy Optimization with Sample Reuse (GePPO)

Generalized Proximal Policy Optimization with Sample Reuse This repository is the official implementation of the reinforcement learning algorithm Gene

Jimmy Queeney 9 Nov 28, 2022
Bravia core script for python

Bravia-Core-Script You need to have a mandatory account If this L3 does not work, try another L3. enjoy

5 Dec 26, 2021
这是一个unet-pytorch的源码,可以训练自己的模型

Unet:U-Net: Convolutional Networks for Biomedical Image Segmentation目标检测模型在Pytorch当中的实现 目录 性能情况 Performance 所需环境 Environment 注意事项 Attention 文件下载 Downl

Bubbliiiing 567 Jan 05, 2023
Cooperative Driving Dataset: a dataset for multi-agent driving scenarios

Cooperative Driving Dataset (CODD) The Cooperative Driving dataset is a synthetic dataset generated using CARLA that contains lidar data from multiple

Eduardo Henrique Arnold 124 Dec 28, 2022
Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection, AAAI 2021.

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection This repository is an official implementation of the AAAI 2021 paper Co-mi

MEGVII Research 20 Dec 07, 2022
Unofficial & improved implementation of NeRF--: Neural Radiance Fields Without Known Camera Parameters

[Unofficial code-base] NeRF--: Neural Radiance Fields Without Known Camera Parameters [ Project | Paper | Official code base ] ⬅️ Thanks the original

Jianfei Guo 239 Dec 22, 2022
Official implementation of our CVPR2021 paper "OTA: Optimal Transport Assignment for Object Detection" in Pytorch.

OTA: Optimal Transport Assignment for Object Detection This project provides an implementation for our CVPR2021 paper "OTA: Optimal Transport Assignme

217 Jan 03, 2023
CTRL-C: Camera calibration TRansformer with Line-Classification

CTRL-C: Camera calibration TRansformer with Line-Classification This repository contains the official code and pretrained models for CTRL-C (Camera ca

57 Nov 14, 2022
Code release for General Greedy De-bias Learning

General Greedy De-bias for Dataset Biases This is an extention of "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). T

4 Mar 15, 2022
Unofficial implementation of Proxy Anchor Loss for Deep Metric Learning

Proxy Anchor Loss for Deep Metric Learning Unofficial pytorch, tensorflow and mxnet implementations of Proxy Anchor Loss for Deep Metric Learning. Not

Geonmo Gu 3 Jun 09, 2021
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
A fuzzing framework for SMT solvers

yinyang A fuzzing framework for SMT solvers. Given a set of seed SMT formulas, yinyang generates mutant formulas to stress-test SMT solvers. yinyang c

Project Yin-Yang for SMT Solver Testing 145 Jan 04, 2023
SpinalNet: Deep Neural Network with Gradual Input

SpinalNet: Deep Neural Network with Gradual Input This repository contains scripts for training different variations of the SpinalNet and its counterp

H M Dipu Kabir 142 Dec 30, 2022
ZEBRA: Zero Evidence Biometric Recognition Assessment

ZEBRA: Zero Evidence Biometric Recognition Assessment license: LGPLv3 - please reference our paper version: 2020-06-11 author: Andreas Nautsch (EURECO

Voice Privacy Challenge 2 Dec 12, 2021