Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation

This repository hosts the code related to the paper:

Marco Rosano, Antonino Furnari, Luigi Gulino, Corrado Santoro and Giovanni Maria Farinella, "Image-based Navigation in Real-World Environments via Multiple Mid-level Representations: Fusion Models Benchmark and Efficient Evaluation". Submitted to "Robotics and Autonomous Systems" (RAS), 2022.

For more details please see the project web page at https://iplab.dmi.unict.it/EmbodiedVN.

Overview

This code is built on top of the Habitat-api/Habitat-lab project. Please see the Habitat project page for more details.

This repository provides the following components:

  1. The implementation of the proposed tool, integrated with Habitat, to train visual navigation models on synthetic observations and test them on realistic episodes containing real-world images. This allows the estimation of real-world performance, avoiding the physical deployment of the robotic agent;

  2. The official PyTorch implementation of the proposed visual navigation models, which follow different strategies to combine a range of visual mid-level representations;

  3. The synthetic 3D model of the proposed environment, acquired using the Matterport 3D scanner and used to perform the navigation episodes at train and test time;

  4. The photorealistic 3D model that contains real-world images of the proposed environment, labeled with their pose (X, Z, Angle). The sparse 3D reconstruction was performed using the COLMAP Structure from Motion tool and then aligned with the Matterport virtual 3D map;

  5. An integration with CycleGAN to train and evaluate navigation models with Habitat on sim2real-adapted images;

  6. The checkpoints of the best-performing navigation models.

Installation

Requirements

  • Python >= 3.7; version 3.7 is recommended to avoid possible issues.
  • Other requirements will be installed via pip in the following steps.

Steps

  1. (Optional) Create an Anaconda environment and install everything in it (conda create -n fusion-habitat python=3.7; conda activate fusion-habitat).

  2. Install the Habitat simulator following the official repo instructions. Development and testing were done on commit bfbe9fc30a4e0751082824257d7200ad543e4c0e, installing the simulator from source and launching the ./build.sh --headless --with-cuda command (guide). Please consider following these suggestions if you encounter issues while installing the simulator.
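
     A minimal sketch of the from-source installation described above (assuming the official facebookresearch/habitat-sim repository; adjust to your CUDA setup):

    git clone https://github.com/facebookresearch/habitat-sim.git
    cd habitat-sim/
    git checkout bfbe9fc30a4e0751082824257d7200ad543e4c0e
    pip install -r requirements.txt
    ./build.sh --headless --with-cuda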

  3. Install the customized Habitat-lab (this repo):

    git clone https://github.com/rosanom/mid-level-fusion-nav.git
    cd mid-level-fusion-nav/
    pip install -r requirements.txt
    python setup.py develop --all # install habitat and habitat_baselines
    
  4. Download our dataset (journal version) from here, and extract it to the repository folder (mid-level-fusion-nav/). Inside the data folder you should see this structure:

    datasets/pointnav/orangedev/v1/...
    real_images/orangedev/...
    scene_datasets/orangedev/...
    orangedev_checkpoints/...
    
  5. (Optional, to check if the software works properly) Download the test scenes data and extract the zip file to the repository folder (mid-level-fusion-nav/). To verify that the tool was successfully installed, run python examples/benchmark.py or python examples/example.py.

Data Structure

All data can be found inside the mid-level-fusion-nav/data/ folder:

  • the datasets/pointnav/orangedev/v1/... folder contains the generated train and validation navigation episode files;
  • the real_images/orangedev/... folder contains the real-world images of the proposed environment and the CSV file with their pose information (obtained with COLMAP);
  • the scene_datasets/orangedev/... folder contains the 3D mesh of the proposed environment;
  • orangedev_checkpoints/ is the folder where checkpoints are saved during training. Place a checkpoint file here to restore the training process or to evaluate the model. The system will load the most recent checkpoint file.

Config Files

There are two configuration files:

habitat_domain_adaptation/configs/tasks/pointnav_orangedev.yaml

and

habitat_domain_adaptation/habitat_baselines/config/pointnav/ddppo_pointnav_orangedev.yaml.

In the first file you can change the robot's properties, the sensors used by the agent, and the dataset used in the experiment. You should not need to modify it.

In the second file you can decide:

  1. whether to evaluate the navigation models using RGB or mid-level representations;
  2. the set of mid-level representations to use;
  3. the fusion architecture to use;
  4. whether to train or evaluate the models using real images, or using CycleGAN sim2real-adapted observations.
    ...
    EVAL_W_REAL_IMAGES: True
    EVAL_CKPT_PATH_DIR: "data/orangedev_checkpoints/"

    SIM_2_REAL: False # use CycleGAN for sim2real image adaptation?

    USE_MIDLEVEL_REPRESENTATION: True
    MIDLEVEL_PARAMS:
      ENCODER: "simple" # "simple", "SE_attention", "mid_fusion", ...
      FEATURE_TYPE: ["normal"] # ["normal", "keypoints3d", "curvature", "depth_zbuffer"]
    ...
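
For example, a hypothetical configuration that fuses several mid-level representations through the SE_attention encoder might look as follows (a sketch; the option values are those listed in the comments above):

    USE_MIDLEVEL_REPRESENTATION: True
    MIDLEVEL_PARAMS:
      ENCODER: "SE_attention"
      FEATURE_TYPE: ["normal", "keypoints3d", "curvature", "depth_zbuffer"]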

CycleGAN Integration (baseline)

In order to use CycleGAN on Habitat for sim2real domain adaptation during training or evaluation, follow the steps suggested in the repository of our previous release.
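
Once the CycleGAN setup is complete, sim2real adaptation can be toggled from ddppo_pointnav_orangedev.yaml via the flag shown earlier (a minimal sketch):

    SIM_2_REAL: True # adapt simulated observations with CycleGAN during train/eval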

Train and Evaluation

To train the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev.sh

To evaluate the navigation model using the DD-PPO RL algorithm, run:

    sh habitat_baselines/rl/ddppo/single_node_orangedev_eval.sh

For more information about the DD-PPO RL algorithm, please check out the habitat-lab DD-PPO repo page.

License

The code in this repository, the 3D models and the images of the proposed environment are MIT licensed. See the LICENSE file for details.

The trained models and the task datasets are considered data derived from the corresponding scene datasets.

Acknowledgements

This research is supported by OrangeDev s.r.l., by Next Vision s.r.l., by the project MEGABIT - PIAno di inCEntivi per la RIcerca di Ateneo 2020/2022 (PIACERI) – linea di intervento 2, DMI - University of Catania, and by the grant MIUR AIM - Attrazione e Mobilità Internazionale Linea 1 - AIM1893589 - CUP E64118002540007.

Owner
First Person Vision @ Image Processing Laboratory - University of Catania