Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Related tags

Deep Learningrobopose
Overview

Single-view robot pose and joint angle estimation via render & compare

Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic

CVPR: Conference on Computer Vision and Pattern Recognition, 2021 (Oral)

[Paper] [Project page] [Supplementary Video]

overview RoboPose. (a) Given a single RGB image of a known articulated robot in an unknown configuration (left), RoboPose estimates the joint angles and the 6D camera-to-robot pose (rigid translation and rotation) providing the complete state of the robot within the 3D scene, here illustrated by overlaying the articulated CAD model of the robot over the input image (right). (b) When the joint angles are known at test-time (e.g. from internal measurements of the robot), RoboPose can use them as an additional input to estimate the 6D camera-to-robot pose to enable, for example, visually guided manipulation without fiducial markers.

Citation

If you use this code in your research, please cite the paper:

@inproceedings{labbe2021robopose,
title= {Single-view robot pose and joint angle estimation via render & compare}
author={Y. {Labb\'e} and J. {Carpentier} and M. {Aubry} and J. {Sivic}},
booktitle={Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2021}}

Table of content

Overview

This repository contains the code for the full RoboPose approach and for reproducing all the results from the paper (training, inference and evaluation).

overview

Installation

git clone --recurse-submodules https://github.com/ylabbe/robopose.git
cd robopose
conda env create -n robopose --file environment.yaml
conda activate robopose
python setup.py install
mkdir local_data

The installation may take some time as several packages must be downloaded and installed/compiled. If you plan to change the code, run python setup.py develop.

Downloading and preparing data

All data used (datasets, models, results, ...) are stored in a directory local_data at the root of the repository. Create it with mkdir local_data or use a symlink if you want the data to be stored at a different place. We provide the utility robopose/scripts/download.py for downloading required data and models. All of the files can also be downloaded manually.

Robot URDF & CAD models

python -m robopose.scripts.download --robot=owi
python -m robopose.scripts.download --robot=kuka
python -m robopose.scripts.download --robot=panda
python -m robopose.scripts.download --robot=baxter

DREAM & CRAVES Datasets

python -m robopose.scripts.download --datasets=craves.test
python -m robopose.scripts.download --datasets=dream.test

# Only for re-training the models
python -m robopose.scripts.download --datasets=craves.train
python -m robopose.scripts.download --datasets=dream.train

Pre-trained models

python -m robopose.scripts.download --model=panda-known_angles
python -m robopose.scripts.download --model=panda-predict_angles
python -m robopose.scripts.download --model=kuka-known_angles
python -m robopose.scripts.download --model=kuka-predict_angles
python -m robopose.scripts.download --model=baxter-known_angles
python -m robopose.scripts.download --model=baxter-predict_angles
python -m robopose.scripts.download --model=owi-predict_angles

DREAM & CRAVES original results

python -m robopose.scripts.download --dream_paper_results
python -m robopose.scripts.download --craves_paper_results

Notes:

  • Dream results were extracted using the official code from https://github.com/NVlabs/DREAM.
  • CRAVES results were extracted using the code provided with the paper. We slightly modified this code to compute the errors on the whole LAB dataset, the code used can be found on our fork.

Note on GPU parallelization

Training and evaluation code can be parallelized across multiple gpus and multiple machines using vanilla torch.distributed. This is done by simply starting multiple processes with the same arguments and assigning each process to a specific GPU via CUDA_VISIBLE_DEVICES. To run the processes on a local machine or on a SLUMR cluster, we use our own utility job-runner but other similar tools such as dask-jobqueue or submitit could be used. We provide instructions for single-node multi-gpu training, and for multi-gpu multi-node training on a SLURM cluster.

Single gpu on a single node

# CUDA ID of GPU you want to use
export CUDA_VISIBLE_DEVICES=0
python -m robopose.scripts.example_multigpu

where scripts.example_multigpu can be replaced by scripts.run_pose_training or scripts.run_robopose_eval (see below for usage of training/evaluation scripts).

Configuration of job-runner for multi-gpu usage

Change the path to the code directory, anaconda location and specify a temporary directory for storing job logs by modifying `job-runner-config.yaml'. If you have access to a SLURM cluster, specify the name of the queue, it's specifications (number of GPUs/CPUs per node) and the flags you typically use in a slurm script. Once you are done, run:

runjob-config job-runner-config.yaml

Multi-gpu on a single node

# CUDA IDS of GPUs you want to use
export CUDA_VISIBLE_DEVICES=0,1
runjob --ngpus=2 --queue=local python -m robopose.scripts.example_multigpu

The logs of the first process will be printed. You can check the logs of the other processes in the job directory.

On a SLURM cluster

runjob --ngpus=8 --queue=gpu_p1  python -m robopose.scripts.example_multigpu

Reproducing results using pre-trained models

We provide the inference results on all datasets to reproduce the results from the paper. You can download these results, generate the tables and qualitative visualization of our predictions on the test datasets. The results will be downloaded to local_data/results.

Downloading inference results

# Table 1, DREAM paper results (converted from the original format)
python -m robopose.scripts.download --results=dream-paper-all-models

# Table 1, DREAM Known joint angles
python -m robopose.scripts.download --results=dream-known-angles

# Table 1, DREAM Unknown joint angles
python -m robopose.scripts.download --results=dream-unknown-angles

# Table 2, Iterative results
python -m robopose.scripts.download --results=panda-orb-known-angles-iterative

# Table 3, Craves-Lab
python -m robopose.scripts.download --results=craves-lab

# Table 4, Craves Youtube
python -m robopose.scripts.download --results=craves-youtube

# Table 5, Analysis of the choice of reference point
python -m robopose.scripts.download --results=panda-reference-point-ablation

# Table 6, Analysis of the choice of the anchor part
python -m robopose.scripts.download --results=panda-anchor-ablation

# Sup. Mat analysis of the number of iterations
python -m robopose.scripts.download --results=panda-train_iterations-ablation

You can generate the numbers from the tables from these inference/evaluation results using the notebook notebooks/generate_results.ipynb.

You can generate visualization of the results using the notebook notebooks/visualize_predictions.ipynb. overview

Running inference

We provide the code for running inference and re-generate all results. This is done using the run_robot_eval script. The results were obtained using the following commands:

## Main results and comparisons
# DREAM datasets,  DREAM models
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda  --model=dream-all-models --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-baxter --model=dream-all-models --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-kuka  --model=dream-all-models --id 1804

# DREAM datasets, ours (known joints)
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda  --model=knownq --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-baxter --model=knownq --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-kuka   --model=knownq --id 1804

# DREAM datasets, ours (unknown joints)
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda  --model=unknownq --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-baxter --model=unknownq --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-kuka   --model=unknownq --id 1804

# CRAVES LAB dataset
runjob --ngpus=8 python scripts/run_robot_eval.py --datasets=craves-lab --model=unknownq --id 1804

# CRAVES Youtube dataset
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=500 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=750 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=1000 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=1250 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=1500 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=1750 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=2000 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=craves-youtube --model=unknownq-focal=5000 --id 1804


## Ablations
# Online evaluation, Table 2
runjob --ngpus=8 python scripts/run_robot_eval.py --datasets=dream-panda-orb --model=knownq --id 1804 --eval_all_iter
runjob --ngpus=1 python scripts/run_robot_eval.py --datasets=dream-panda-orb --model=knownq-online --id 1804

# Analysis of reference point, Table 5
python -m robopose.scripts.download --models=ablation_reference_point
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link0 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link1 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link5 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link2 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link4 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=knownq-link9 --id 1804

# Analysis of anchor part, Table 6
python -m robopose.scripts.download --models=ablation_anchor
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link1 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link2 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link5 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link0 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link4 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-link9 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-random_all --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-random_top5 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=unknownq-random_top3 --id 1804

# Analysis of number of iterations, Supplementary Material.
python -m robopose.scripts.download --models=ablation_train_iterations
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=train_K=1 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=train_K=2 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=train_K=3 --id 1804
runjob --ngpus=8  python scripts/run_robot_eval.py --datasets=dream-panda-orb  --model=train_K=5 --id 1804

Re-training the models

We provide all the training code.

Background images for data augmentation

We apply data augmentation to the training images. Data augmentation includes pasting random images of the pascal VOC dataset on the background of the scenes. You can download Pascal VOC using the following commands:

cd local_data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_11-May-2012.tar

(If the website is down, which happens periodically, you can alternatively download these files from a mirror at https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar)

Reproducing models from the paper

runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-panda-gt_joints
runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-panda-predict_joints

runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-baxter-gt_joints
runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-baxter-predict_joints

runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-kuka-gt_joints
runjob --ngpus=44  python scripts/run_articulated_training.py --config=dream-kuka-predict_joints

runjob --ngpus=44  python scripts/run_articulated_training.py --config=craves-owi535-predict_joints
Owner
Yann Labbé
PhD Student at INRIA Willow in computer vision and robotics.
Yann Labbé
Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation

Segmenter: Transformer for Semantic Segmentation Segmenter: Transformer for Semantic Segmentation by Robin Strudel*, Ricardo Garcia*, Ivan Laptev and

594 Jan 06, 2023
Face Alignment using python

Face Alignment Face Alignment using python Input Image Aligned Face Aligned Face Aligned Face Input Image Aligned Face Input Image Aligned Face Instal

Sajjad Aemmi 28 Nov 23, 2022
PyTorch implementation of SQN based on CloserLook3D's encoder

SQN_pytorch This repo is an implementation of Semantic Query Network (SQN) using CloserLook3D's encoder in Pytorch. For TensorFlow implementation, che

PointCloudYC 1 Oct 21, 2021
Denoising Diffusion Implicit Models

Denoising Diffusion Implicit Models (DDIM) Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford Implements sampling from an implicit model that is t

465 Jan 05, 2023
A general-purpose programming language, focused on simplicity, safety and stability.

The Rivet programming language A general-purpose programming language, focused on simplicity, safety and stability. Rivet's goal is to be a very power

The Rivet programming language 17 Dec 29, 2022
Background-Click Supervision for Temporal Action Localization

Background-Click Supervision for Temporal Action Localization This repository is the official implementation of BackTAL. In this work, we study the te

LeYang 221 Oct 09, 2022
PyTorch implementation of EigenGAN

PyTorch Implementation of EigenGAN Train python train.py [image_folder_path] --name [experiment name] Test python test.py [ckpt path] --traverse FFH

62 Nov 12, 2022
This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effects in Video."

Omnimatte in PyTorch This repository contains a re-implementation of the code for the CVPR 2021 paper "Omnimatte: Associating Objects and Their Effect

Erika Lu 728 Dec 28, 2022
The project is an official implementation of our paper "3D Human Pose Estimation with Spatial and Temporal Transformers".

3D Human Pose Estimation with Spatial and Temporal Transformers This repo is the official implementation for 3D Human Pose Estimation with Spatial and

Ce Zheng 363 Dec 28, 2022
Yolox-bytetrack-sample - Python sample of MOT (Multiple Object Tracking) using YOLOX and ByteTrack

yolox-bytetrack-sample YOLOXとByteTrackを用いたMOT(Multiple Object Tracking)のPythonサン

KazuhitoTakahashi 12 Nov 09, 2022
AgML is a comprehensive library for agricultural machine learning

AgML is a comprehensive library for agricultural machine learning. Currently, AgML provides access to a wealth of public agricultural datasets for common agricultural deep learning tasks.

Plant AI and Biophysics Lab 1 Jul 07, 2022
This is a collection of our NAS and Vision Transformer work.

AutoML - Neural Architecture Search This is a collection of our AutoML-NAS work iRPE (NEW): Rethinking and Improving Relative Position Encoding for Vi

Microsoft 832 Jan 08, 2023
Awesome Remote Sensing Toolkit based on PaddlePaddle.

基于飞桨框架开发的高性能遥感图像处理开发套件,端到端地完成从训练到部署的全流程遥感深度学习应用。 最新动态 PaddleRS 即将发布alpha版本!欢迎大家试用 简介 PaddleRS是遥感科研院所、相关高校共同基于飞桨开发的遥感处理平台,支持遥感图像分类,目标检测,图像分割,以及变化检测等常用遥

146 Dec 11, 2022
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis

OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis Overview OpenABC-D is a large-scale labeled dataset generate

NYU Machine-Learning guided Design Automation (MLDA) 31 Nov 22, 2022
QICK: Quantum Instrumentation Control Kit

QICK: Quantum Instrumentation Control Kit The QICK is a kit of firmware and software to use the Xilinx RFSoC to control quantum systems. It consists o

81 Dec 15, 2022
TensorFlow, PyTorch and Numpy layers for generating Orthogonal Polynomials

OrthNet TensorFlow, PyTorch and Numpy layers for generating multi-dimensional Orthogonal Polynomials 1. Installation 2. Usage 3. Polynomials 4. Base C

Chuan 29 May 25, 2022
MoveNetを用いたPythonでの姿勢推定のデモ

MoveNet-Python-Example MoveNetのPythonでの動作サンプルです。 ONNXに変換したモデルも同梱しています。変換自体を試したい方はMoveNet_tf2onnx.ipynbを使用ください。 2021/08/24時点でTensorFlow Hubで提供されている以下モデ

KazuhitoTakahashi 38 Dec 17, 2022
Bootstrapped Representation Learning on Graphs

Bootstrapped Representation Learning on Graphs This is the PyTorch implementation of BGRL Bootstrapped Representation Learning on Graphs The main scri

NerDS Lab :: Neural Data Science Lab 55 Jan 07, 2023
Current state of supervised and unsupervised depth completion methods

Awesome Depth Completion Table of Contents About Sparse-to-Dense Depth Completion Current State of Depth Completion Unsupervised VOID Benchmark Superv

224 Dec 28, 2022
Pretraining on Dynamic Graph Neural Networks

Pretraining on Dynamic Graph Neural Networks Our article is PT-DGNN and the code is modified based on GPT-GNN Requirements python 3.6 Ubuntu 18.04.5 L

7 Dec 17, 2022