SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement

Overview

This repository implements the approach described in SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement (WACV 2022).

Iterative registration using SporeAgent:
The initial pose from PoseCNN (purple) and the final pose using SporeAgent (blue) on the LINEMOD (left, cropped) and YCB-Video (right) datasets.

Scene-level Plausibility:
The initial scene configuration from PoseCNN (left) results in an implausible pose of the target object (gray). Refinement using SporeAgent (right) results in a plausible scene configuration where the intersecting points (red) are resolved and the object rests on its supporting points (cyan).

                      LINEMOD                               YCB-Video
                      AD < 0.10d  AD < 0.05d  AD < 0.02d    ADD AUC  AD AUC  ADI AUC
PoseCNN                     62.7        26.9         3.3       51.5    61.3     75.2
Point-to-Plane ICP          92.6        79.8        29.9       68.2    79.2     87.8
w/ VeREFINE                 96.1        85.8        32.5       70.1    81.0     88.8
Multi-hypothesis ICP        99.3        89.9        35.6       77.4    86.6     92.6
SporeAgent                  99.3        93.7        50.3       79.0    88.8     93.6

Comparison on LINEMOD and YCB-Video:
The initial pose and segmentation estimates are computed using PoseCNN. We compare our approach to vanilla Point-to-Plane ICP (from Open3D), Point-to-Plane ICP augmented by the simulation-based VeREFINE approach and the ICP-based multi-hypothesis approach used for refinement in PoseCNN.

Dependencies

The code has been tested on Ubuntu 16.04 and 20.04 with Python 3.6 and CUDA 10.2. To set up the Python environment, use Anaconda and the provided YAML file:

conda env create -f environment.yml --name sporeagent

conda activate sporeagent

The BOP Toolkit is additionally required. The BOP_PATH in config.py needs to be changed to the respective clone directory and the packages required by the BOP Toolkit need to be installed.
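
For reference, a minimal sketch of the corresponding config.py entry (the path is a placeholder; the dataset-specific paths referenced in the Datasets section below are set in the same way):

# config.py (excerpt) -- placeholder path, adjust to your local clone of the BOP Toolkit
BOP_PATH = '/path/to/bop_toolkit'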

The YCB-Video Toolbox is required for experiments on the YCB-Video dataset.

Datasets

We use the dataset versions prepared for the BOP challenge. The required files can be downloaded to a directory of your choice using the following bash script:

export SRC=http://ptak.felk.cvut.cz/6DB/public/bop_datasets
export DATASET=ycbv                         # either "lm" or "ycbv"
wget $SRC/${DATASET}_base.zip               # Base archive with dataset info, camera parameters, etc.
wget $SRC/${DATASET}_models.zip             # 3D object models.
wget $SRC/${DATASET}_test_all.zip           # All test images.
unzip ${DATASET}_base.zip                   # Contains folder $DATASET.
unzip ${DATASET}_models.zip -d $DATASET     # Unpacks to $DATASET.
unzip ${DATASET}_test_all.zip -d $DATASET   # Unpacks to $DATASET.

For training on YCB-Video, ${DATASET}_train_real.zip is additionally required; download and unpack it in the same way as the archives above.

In addition, we have prepared point clouds sampled within the ground-truth masks (for training) and within the segmentation masks computed using PoseCNN (for evaluation) for the LINEMOD and YCB-Video datasets. The samples for evaluation also include the initial pose estimates from PoseCNN.

LINEMOD

Extract the prepared samples into PATH_TO_BOP_LM/sporeagent/ and set LM_PATH in config.py to the base directory, i.e., PATH_TO_BOP_LM. Download the PoseCNN results and the corresponding image set definitions provided with DeepIM and extract both into POSECNN_LM_RESULTS_PATH. Finally, since the BOP challenge uses a different train/test split than the compared methods, the appropriate target file found here needs to be placed in the PATH_TO_BOP_LM directory.

To compute the AD scores using the BOP Toolkit, BOP_PATH/scripts/eval_bop19.py needs to be adapted (see the sketch after this list):

  • to use ADI for symmetric objects and ADD otherwise with a 2/5/10% threshold, change p['errors'] to
{
  'n_top': -1,
  'type': 'ad',
  'correct_th': [[0.02], [0.05], [0.1]]
}
  • to use the correct test targets, change p['targets_filename'] to 'test_targets_add.json'
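
After both changes, the affected entries of p in eval_bop19.py might look as follows. This is only a sketch: the surrounding structure of the script can differ between BOP Toolkit versions, and all remaining parameters stay at their defaults.

# BOP_PATH/scripts/eval_bop19.py (excerpt, illustrative)
p = {
    # p['errors'] is a list of error definitions in the BOP Toolkit;
    # use AD (ADI for symmetric objects, ADD otherwise) with 2/5/10% thresholds
    'errors': [
        {
            'n_top': -1,
            'type': 'ad',
            'correct_th': [[0.02], [0.05], [0.1]],
        }
    ],
    # test targets matching the train/test split of the compared methods
    'targets_filename': 'test_targets_add.json',
    # ... all other parameters unchanged ...
}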

YCB-Video

Extract the prepared samples into PATH_TO_BOP_YCBV/reagent/ and set YCBV_PATH in config.py to the base directory, i.e., PATH_TO_BOP_YCBV. Clone the YCB-Video Toolbox to POSECNN_YCBV_RESULTS_PATH, extract results_PoseCNN_RSS2018.zip and copy test_data_list.txt to the same directory; POSECNN_YCBV_RESULTS_PATH in config.py needs to be changed to this directory accordingly. Additionally, place the meshes in the canonical frame (models_eval_canonical) in the PATH_TO_BOP_YCBV directory.
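
For reference, the corresponding config.py entries might then look like the following (placeholder paths and an assumed layout, shown only as an illustration):

# config.py (excerpt) -- placeholder paths, adjust to your local setup
YCBV_PATH = '/path/to/PATH_TO_BOP_YCBV'                   # BOP YCB-Video base directory (contains reagent/, models_eval_canonical, ...)
POSECNN_YCBV_RESULTS_PATH = '/path/to/ycb_video_toolbox'  # clone with results_PoseCNN_RSS2018 and test_data_list.txt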

To compute the ADD/AD/ADI AUC scores using the YCB-Video Toolbox, replace the respective files in the toolbox by the ones provided in sporeagent/ycbv_toolbox.

Pretrained models

Weights for both datasets can be found here. Download and copy them to sporeagent/weights/.

Training

For LINEMOD: python registration/train.py --dataset=lm

For YCB-Video: python registration/train.py --dataset=ycbv

Evaluation

Note that we precompute the normal images used for pose scoring on the first run and store them to disk.

LINEMOD

The results for LINEMOD are computed using the BOP Toolkit. The evaluation script exports the required file by running

python registration/eval.py --dataset=lm

which can then be processed via

python BOP_PATH/scripts/eval_bop19.py --result_filenames=PATH_TO_CSV_WITH_RESULTS

YCB-Video

The results for YCB-Video are computed using the YCB-Video Toolbox. The evaluation script exports the results in BOP format by running

python registration/eval.py --dataset=ycbv

which can then be parsed into the format used by the YCB-Video Toolbox by running

python utility/parse_matlab.py

In MATLAB, run evaluate_poses_keyframe.m to generate the per-sample results and plot_accuracy_keyframe.m to compute the statistics.

Citation

If you use this repository in your publications, please cite

@inproceedings{bauer2022sporeagent,
    title={SporeAgent: Reinforced Scene-level Plausibility for Object Pose Refinement},
    author={Bauer, Dominik and Patten, Timothy and Vincze, Markus},
    booktitle={IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2022},
    pages={654-662}
}