Repository of 3D Object Detection with Pointformer (CVPR2021)

Last update: Jan 06, 2023

Overview

3D Object Detection with Pointformer

This repository contains the code for the paper 3D Object Detection with Pointformer (CVPR 2021) [arXiv]. It is built on top of the MMDetection3D toolbox and includes the models and results on the SUN RGB-D and ScanNet datasets reported in the paper.

More models and results on the KITTI and nuScenes datasets will be released soon.

Installation and Usage

The code is developed with MMDetection3D v0.6.1 and works well with v0.14.0.

Dependencies

NVIDIA GPU + CUDA 10.2
Python 3.8 (Anaconda recommended)
PyTorch == 1.8.0
mmcv-full == 1.3.7
mmdet == 2.11.0
mmsegmentation == 0.13.0
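
As a quick sanity check before training, the pinned versions above can be compared against what is installed in the current environment. The sketch below is only an illustration: the helper name find_mismatches is hypothetical, and it assumes the packages are installed under the pip distribution names torch, mmcv-full, mmdet and mmsegmentation. It does not check the Python or CUDA versions.

```python
from importlib import metadata

# Versions pinned in the Dependencies list (pip distribution names assumed).
PINNED = {
    "torch": "1.8.0",
    "mmcv-full": "1.3.7",
    "mmdet": "2.11.0",
    "mmsegmentation": "0.13.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: installed_version_or_None} for every package that
    is missing or does not match its pinned version."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore local build tags such as "+cu102" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(PINNED).items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

An empty result means all four packages match the pinned versions.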

Installation

Install the dependencies listed above, following their official guidelines.
Clone and install mmdet3d in develop mode.

git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
python setup.py develop

Copy the files from this repo into the corresponding directories in mmdet3d.
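
The copy step can be done by hand or scripted. The following is a minimal sketch, assuming this repository mirrors the mmdetection3d directory layout so that each file belongs at the same relative path in the clone. The function name overlay and the example paths are hypothetical.

```python
import shutil
from pathlib import Path


def overlay(src_root, dst_root, skip=(".git",)):
    """Copy every file under src_root to the same relative path under
    dst_root, creating directories as needed and leaving unrelated files
    in dst_root untouched. Returns the relative paths that were copied."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for src in sorted(src_root.rglob("*")):
        rel = src.relative_to(src_root)
        # Skip directories themselves and anything inside a skipped folder.
        if src.is_dir() or any(part in skip for part in rel.parts):
            continue
        dst = dst_root / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # overwrites a file of the same name
        copied.append(rel)
    return copied


if __name__ == "__main__":
    # Hypothetical locations; adjust to where the two repositories live.
    pointformer, mmdet3d = Path("Pointformer"), Path("mmdetection3d")
    if pointformer.is_dir() and mmdet3d.is_dir():
        for rel in overlay(pointformer, mmdet3d):
            print("copied", rel)
```

Files that already exist at the destination are overwritten, so it is worth running this on a fresh clone.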

Training and Testing

Download the pretrained weights from Google Drive or Tsinghua Cloud and put them in the checkpoints folder. Taking votenet_ptr_sunrgbd-3d-10class as an example:

# Training on 8 GPUs (the last argument is the number of GPUs)
bash -x tools/dist_train.sh configs/pointformer/votenet_ptr_sunrgbd-3d-10class.py 8

# Testing on 8 GPUs with the downloaded checkpoint, reporting mAP
bash tools/dist_test.sh configs/pointformer/votenet_ptr_sunrgbd-3d-10class.py checkpoints/votenet_ptr_sunrgbd-3d-10class.pth 8 --eval mAP
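
The two commands share a naming pattern: the config lives at configs/pointformer/<model>.py and the checkpoint at checkpoints/<model>.pth. The sketch below builds both command lines from a model name and a GPU count. The helper pointformer_commands is hypothetical, and it assumes that other models follow the same naming pattern as the example, which this README does not confirm.

```python
import shlex


def pointformer_commands(model, gpus=8):
    """Build the training and testing command lines for one model,
    following the pattern of the votenet_ptr_sunrgbd-3d-10class example."""
    config = f"configs/pointformer/{model}.py"
    checkpoint = f"checkpoints/{model}.pth"
    train = ["bash", "tools/dist_train.sh", config, str(gpus)]
    test = ["bash", "tools/dist_test.sh", config, checkpoint, str(gpus),
            "--eval", "mAP"]
    return shlex.join(train), shlex.join(test)


if __name__ == "__main__":
    train_cmd, test_cmd = pointformer_commands(
        "votenet_ptr_sunrgbd-3d-10class", gpus=8)
    print(train_cmd)
    print(test_cmd)
```

The printed strings can be pasted into a shell from the mmdetection3d root directory.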

Results

SUN RGB-D

classes	AP_0.25	AR_0.25	AP_0.50	AR_0.50
bed	0.8343	0.9515	0.5556	0.7029
table	0.5353	0.8705	0.2344	0.4604
sofa	0.6588	0.9171	0.4979	0.6715
chair	0.7681	0.8700	0.5664	0.6703
toilet	0.9117	0.9931	0.5538	0.7103
desk	0.2458	0.8050	0.0754	0.3395
dresser	0.3626	0.8028	0.2357	0.4908
night_stand	0.6701	0.9020	0.4525	0.6196
bookshelf	0.3383	0.6809	0.0968	0.2624
bathtub	0.7821	0.8980	0.4259	0.5510
Overall	0.6107	0.8691	0.3694	0.5479
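
The Overall row in the SUN RGB-D table above is consistent with an unweighted mean over the ten classes. The short check below recomputes it for the two AP columns, with the values copied from the table. Only the helper name overall is introduced here.

```python
# (AP_0.25, AP_0.50) per class, copied from the SUN RGB-D table.
SUN_RGBD_AP = {
    "bed": (0.8343, 0.5556),
    "table": (0.5353, 0.2344),
    "sofa": (0.6588, 0.4979),
    "chair": (0.7681, 0.5664),
    "toilet": (0.9117, 0.5538),
    "desk": (0.2458, 0.0754),
    "dresser": (0.3626, 0.2357),
    "night_stand": (0.6701, 0.4525),
    "bookshelf": (0.3383, 0.0968),
    "bathtub": (0.7821, 0.4259),
}


def overall(rows, column):
    """Unweighted mean of one column across all classes."""
    return sum(values[column] for values in rows.values()) / len(rows)


# prints 0.6107 0.3694, matching the Overall row
print(f"{overall(SUN_RGBD_AP, 0):.4f} {overall(SUN_RGBD_AP, 1):.4f}")
```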

ScanNet

classes	AP_0.25	AR_0.25	AP_0.50	AR_0.50
cabinet	0.4548	0.7930	0.1757	0.4435
bed	0.8839	0.9506	0.8006	0.8889
chair	0.9011	0.9386	0.7562	0.8136
sofa	0.8915	0.9794	0.6619	0.8041
table	0.6763	0.8714	0.4858	0.6971
door	0.5413	0.7216	0.2107	0.4283
window	0.4821	0.7021	0.1504	0.2979
bookshelf	0.5255	0.8701	0.4422	0.7273
picture	0.1815	0.3649	0.0748	0.1351
counter	0.6210	0.8654	0.2333	0.3846
desk	0.6859	0.9370	0.3774	0.6535
curtain	0.5522	0.7910	0.3156	0.4627
refrigerator	0.5215	0.9649	0.4028	0.7193
showercurtrain	0.6709	0.9643	0.1941	0.5000
toilet	0.9922	1.0000	0.8210	0.8793
sink	0.6361	0.7347	0.4119	0.5000
bathtub	0.8710	0.8710	0.8375	0.8387
garbagebin	0.4762	0.7264	0.2244	0.4604
Overall	0.6425	0.8359	0.4209	0.5908

For more details on the experiments, please refer to the paper.

Acknowledgement

This code is based on MMDetection3D.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Pan_2021_CVPR,
    author    = {Pan, Xuran and Xia, Zhuofan and Song, Shiji and Li, Li Erran and Huang, Gao},
    title     = {3D Object Detection With Pointformer},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {7463-7472}
}

@misc{pan20203d,
  title={3D Object Detection with Pointformer}, 
  author={Xuran Pan and Zhuofan Xia and Shiji Song and Li Erran Li and Gao Huang},
  year={2020},
  eprint={2012.11409},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}


Owner

Zhuofan Xia
