Code and datasets for the paper "Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction" (RA-L, 2021)

Last update: Dec 26, 2022

Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction

This is the code for the paper "Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction" by Daniel Gehrig*, Michelle Rüegg*, Mathias Gehrig, Javier Hidalgo-Carrió, and Davide Scaramuzza.

You can find a PDF of the paper at http://rpg.ifi.uzh.ch/docs/RAL21_Gehrig.pdf and the project homepage here. If you use this work in an academic context, please cite the following publication:

@Article{RAL21Gehrig,
  author        = {Daniel Gehrig and Michelle Rüegg and Mathias Gehrig and Javier Hidalgo-Carrió and Davide Scaramuzza},
  title         = {Combining Events and Frames using Recurrent Asynchronous Multimodal Networks for Monocular Depth Prediction},
  journal       = {{IEEE} Robotics and Automation Letters (RA-L)},
  url           = {http://rpg.ifi.uzh.ch/docs/RAL21_Gehrig.pdf},
  year          = 2021
}

If you use the event-camera plugin for CARLA, please cite the following publication:

@Article{Hidalgo20threedv,
  author        = {Javier Hidalgo-Carrió and Daniel Gehrig and Davide Scaramuzza},
  title         = {Learning Monocular Dense Depth from Events},
  journal       = {{IEEE} International Conference on 3D Vision (3DV)},
  url           = {http://rpg.ifi.uzh.ch/docs/3DV20_Hidalgo.pdf},
  year          = 2020
}

Install with Anaconda

The installation requires Anaconda3. You can create a new Anaconda environment with the required dependencies as follows (make sure to adapt the CUDA toolkit version according to your setup):

conda create --name RAMNET python=3.7
conda activate RAMNET
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
pip install tb-nightly kornia scikit-learn scikit-image opencv-python
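
After installation, a quick check that the packages can be imported may save time before launching an experiment. The following is a minimal sketch and not part of the repository: the helper name missing_modules and the list REQUIRED_MODULES are assumptions made for illustration. The import names are those the installed packages normally provide (tb-nightly provides tensorboard, scikit-learn provides sklearn, scikit-image provides skimage, opencv-python provides cv2).

```python
"""Check that the packages installed above can be found (illustrative sketch)."""
from importlib.util import find_spec

# Import names differ from some of the package names used at install time:
# tb-nightly -> tensorboard, scikit-learn -> sklearn,
# scikit-image -> skimage, opencv-python -> cv2.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "tensorboard", "kornia", "sklearn", "skimage", "cv2",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run the script inside the activated RAMNET environment. It only checks that each module can be located. It does not verify versions or that CUDA is usable.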

Branches

To run experiments on EventScape, please switch to the main branch:

git checkout main

To run experiments on real data from MVSEC, switch to the asynchronous_irregular_real_data branch:

git checkout asynchronous_irregular_real_data

Checkpoints

The checkpoints for RAM-Net can be found here:

EventScape

This work uses the EventScape dataset which can be downloaded here:

Qualitative results on MVSEC

Qualitative results of RAM-Net compared with state-of-the-art methods are shown here. The video shows MegaDepth, E2Depth, and RAM-Net in the upper row, and the image and event inputs and the depth ground truth in the lower row.

Using RAM-Net

A detailed description of how to run the code can be found in the README in the folder /RAM_Net. Another README in /RAM_Net/configs describes the meaning of the different parameters in the configs.

Owner

Robotics and Perception Group
