PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, CVPR 2019.

Overview

PointRCNN

PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud

teaser

Code release for the paper PointRCNN:3D Object Proposal Generation and Detection from Point Cloud, CVPR 2019.

Authors: Shaoshuai Shi, Xiaogang Wang, Hongsheng Li.

[arXiv]  [Project Page] 

New: We have provided another implementation of PointRCNN for joint training with multi-class in a general 3D object detection toolbox [OpenPCDet].

Introduction

In this work, we propose the PointRCNN 3D object detector to directly generated accurate 3D box proposals from raw point cloud in a bottom-up manner, which are then refined in the canonical coordinate by the proposed bin-based 3D box regression loss. To the best of our knowledge, PointRCNN is the first two-stage 3D object detector for 3D object detection by using only the raw point cloud as input. PointRCNN is evaluated on the KITTI dataset and achieves state-of-the-art performance on the KITTI 3D object detection leaderboard among all published works at the time of submission.

For more details of PointRCNN, please refer to our paper or project page.

Supported features and ToDo list

  • Multiple GPUs for training
  • GPU version rotated NMS
  • Faster PointNet++ inference and training supported by Pointnet2.PyTorch
  • PyTorch 1.0
  • TensorboardX
  • Still in progress

Installation

Requirements

All the codes are tested in the following environment:

  • Linux (tested on Ubuntu 14.04/16.04)
  • Python 3.6+
  • PyTorch 1.0

Install PointRCNN

a. Clone the PointRCNN repository.

git clone --recursive https://github.com/sshaoshuai/PointRCNN.git

If you forget to add the --recursive parameter, just run the following command to clone the Pointnet2.PyTorch submodule.

git submodule update --init --recursive

b. Install the dependent python libraries like easydict,tqdm, tensorboardX etc.

c. Build and install the pointnet2_lib, iou3d, roipool3d libraries by executing the following command:

sh build_and_install.sh

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

PointRCNN
├── data
│   ├── KITTI
│   │   ├── ImageSets
│   │   ├── object
│   │   │   ├──training
│   │   │      ├──calib & velodyne & label_2 & image_2 & (optional: planes)
│   │   │   ├──testing
│   │   │      ├──calib & velodyne & image_2
├── lib
├── pointnet2_lib
├── tools

Here the images are only used for visualization and the road planes are optional for data augmentation in the training.

Pretrained model

You could download the pretrained model(Car) of PointRCNN from here(~15MB), which is trained on the train split (3712 samples) and evaluated on the val split (3769 samples) and test split (7518 samples). The performance on validation set is as follows:

Car [email protected], 0.70, 0.70:
bbox AP:96.91, 89.53, 88.74
bev  AP:90.21, 87.89, 85.51
3d   AP:89.19, 78.85, 77.91
aos  AP:96.90, 89.41, 88.54

Quick demo

You could run the following command to evaluate the pretrained model (set RPN.LOC_XZ_FINE=False since it is a little different with the default configuration):

python eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt PointRCNN.pth --batch_size 1 --eval_mode rcnn --set RPN.LOC_XZ_FINE False

Inference

  • To evaluate a single checkpoint, run the following command with --ckpt to specify the checkpoint to be evaluated:
python eval_rcnn.py --cfg_file cfgs/default.yaml --ckpt ../output/rpn/ckpt/checkpoint_epoch_200.pth --batch_size 4 --eval_mode rcnn 
  • To evaluate all the checkpoints of a specific training config file, add the --eval_all argument, and run the command as follows:
python eval_rcnn.py --cfg_file cfgs/default.yaml --eval_mode rcnn --eval_all
  • To generate the results on the test split, please modify the TEST.SPLIT=TEST and add the --test argument.

Here you could specify a bigger --batch_size for faster inference based on your GPU memory. Note that the --eval_mode argument should be consistent with the --train_mode used in the training process. If you are using --eval_mode=rcnn_offline, then you should use --rcnn_eval_roi_dir and --rcnn_eval_feature_dir to specify the saved features and proposals of the validation set. Please refer to the training section for more details.

Training

Currently, the two stages of PointRCNN are trained separately. Firstly, to use the ground truth sampling data augmentation for training, we should generate the ground truth database as follows:

python generate_gt_database.py --class_name 'Car' --split train

Training of RPN stage

  • To train the first proposal generation stage of PointRCNN with a single GPU, run the following command:
python train_rcnn.py --cfg_file cfgs/default.yaml --batch_size 16 --train_mode rpn --epochs 200
  • To use mutiple GPUs for training, simply add the --mgpus argument as follows:
CUDA_VISIBLE_DEVICES=0,1 python train_rcnn.py --cfg_file cfgs/default.yaml --batch_size 16 --train_mode rpn --epochs 200 --mgpus

After training, the checkpoints and training logs will be saved to the corresponding directory according to the name of your configuration file. Such as for the default.yaml, you could find the checkpoints and logs in the following directory:

PointRCNN/output/rpn/default/

which will be used for the training of RCNN stage.

Training of RCNN stage

Suppose you have a well-trained RPN model saved at output/rpn/default/ckpt/checkpoint_epoch_200.pth, then there are two strategies to train the second stage of PointRCNN.

(a) Train RCNN network with fixed RPN network to use online GT augmentation: Use --rpn_ckpt to specify the path of a well-trained RPN model and run the command as follows:

python train_rcnn.py --cfg_file cfgs/default.yaml --batch_size 4 --train_mode rcnn --epochs 70  --ckpt_save_interval 2 --rpn_ckpt ../output/rpn/default/ckpt/checkpoint_epoch_200.pth

(b) Train RCNN network with offline GT augmentation:

  1. Generate the augmented offline scenes by running the following command:
python generate_aug_scene.py --class_name Car --split train --aug_times 4
  1. Save the RPN features and proposals by adding --save_rpn_feature:
  • To save features and proposals for the training, we set TEST.RPN_POST_NMS_TOP_N=300 and TEST.RPN_NMS_THRESH=0.85 as follows:
python eval_rcnn.py --cfg_file cfgs/default.yaml --batch_size 4 --eval_mode rpn --ckpt ../output/rpn/default/ckpt/checkpoint_epoch_200.pth --save_rpn_feature --set TEST.SPLIT train_aug TEST.RPN_POST_NMS_TOP_N 300 TEST.RPN_NMS_THRESH 0.85
  • To save features and proposals for the evaluation, we keep TEST.RPN_POST_NMS_TOP_N=100 and TEST.RPN_NMS_THRESH=0.8 as default:
python eval_rcnn.py --cfg_file cfgs/default.yaml --batch_size 4 --eval_mode rpn --ckpt ../output/rpn/default/ckpt/checkpoint_epoch_200.pth --save_rpn_feature
  1. Now we could train our RCNN network. Note that you should modify TRAIN.SPLIT=train_aug to use the augmented scenes for the training, and use --rcnn_training_roi_dir and --rcnn_training_feature_dir to specify the saved features and proposals in the above step:
python train_rcnn.py --cfg_file cfgs/default.yaml --batch_size 4 --train_mode rcnn_offline --epochs 30  --ckpt_save_interval 1 --rcnn_training_roi_dir ../output/rpn/default/eval/epoch_200/train_aug/detections/data --rcnn_training_feature_dir ../output/rpn/default/eval/epoch_200/train_aug/features

For the offline GT sampling augmentation, the default setting to train the RCNN network is RCNN.ROI_SAMPLE_JIT=True, which means that we sample the RoIs and calculate their GTs in the GPU. I also provide the CPU version proposal sampling, which is implemented in the dataloader, and you could enable this feature by setting RCNN.ROI_SAMPLE_JIT=False. Typically the CPU version is faster but costs more CPU resources since they use mutiple workers.

All the codes supported mutiple GPUs, simply add the --mgpus argument as above. And you could also increase the --batch_size by using multiple GPUs for training.

Note:

  • The strategy (a), online augmentation, is more elegant and easy to train.
  • The best model is trained by the offline augmentation strategy with CPU proposal sampling (set RCNN.ROI_SAMPLE_JIT=False).
  • Theoretically, the online augmentation should be better, but currently the online augmentation is a bit lower than the offline augmentation, and I still didn't know why. All discussions are welcomed.
  • I am still working on this codes to make it more stable.

Citation

If you find this work useful in your research, please consider cite:

@InProceedings{Shi_2019_CVPR,
    author = {Shi, Shaoshuai and Wang, Xiaogang and Li, Hongsheng},
    title = {PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2019}
}
Owner
Shaoshuai Shi
Ph.D @ MMLab-CUHK
Shaoshuai Shi
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥

face.evoLVe: High-Performance Face Recognition Library based on PaddlePaddle & PyTorch Evolve to be more comprehensive, effective and efficient for fa

Zhao Jian 3.1k Jan 04, 2023
Demo code for paper "Learning optical flow from still images", CVPR 2021.

Depthstillation Demo code for "Learning optical flow from still images", CVPR 2021. [Project page] - [Paper] - [Supplementary] This code is provided t

130 Dec 25, 2022
Gauge equivariant mesh cnn

Geometric Mesh CNN The code in this repository is an implementation of the Gauge Equivariant Mesh CNN introduced in the paper Gauge Equivariant Mesh C

50 Dec 18, 2022
Pretrained Cost Model for Distributed Constraint Optimization Problems

Pretrained Cost Model for Distributed Constraint Optimization Problems Requirements PyTorch 1.9.0 PyTorch Geometric 1.7.1 Directory structure baseline

2 Aug 28, 2022
Task-related Saliency Network For Few-shot learning

Task-related Saliency Network For Few-shot learning This is an official implementation in Tensorflow of TRSN. Abstract An essential cue of human wisdo

1 Nov 18, 2021
Evaluation toolkit of the informative tracking benchmark comprising 9 scenarios, 180 diverse videos, and new challenges.

Informative-tracking-benchmark Informative tracking benchmark (ITB) higher diversity. It contains 9 representative scenarios and 180 diverse videos. m

Xin Li 15 Nov 26, 2022
Occlusion robust 3D face reconstruction model in CFR-GAN (WACV 2022)

Occlusion Robust 3D face Reconstruction Yeong-Joon Ju, Gun-Hee Lee, Jung-Ho Hong, and Seong-Whan Lee Code for Occlusion Robust 3D Face Reconstruction

Yeongjoon 31 Dec 19, 2022
Moer Grounded Image Captioning by Distilling Image-Text Matching Model

Moer Grounded Image Captioning by Distilling Image-Text Matching Model Requirements Python 3.7 Pytorch 1.2 Prepare data Please use git clone --recurse

YE Zhou 60 Dec 16, 2022
[arXiv'22] Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation

Panoptic NeRF Project Page | Paper | Dataset Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation Xiao Fu*, Shangzhan zhang*,

Xiao Fu 111 Dec 16, 2022
A Moonraker plug-in for real-time compensation of frame thermal expansion

Frame Expansion Compensation A Moonraker plug-in for real-time compensation of frame thermal expansion. Installation Credit to protoloft, from whom I

58 Jan 02, 2023
The description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts.

FMFCC-A This project is the description of FMFCC-A (audio track of FMFCC) dataset and Challenge resluts. The FMFCC-A dataset is shared through BaiduCl

18 Dec 24, 2022
LSTM and QRNN Language Model Toolkit for PyTorch

LSTM and QRNN Language Model Toolkit This repository contains the code used for two Salesforce Research papers: Regularizing and Optimizing LSTM Langu

Salesforce 1.9k Jan 08, 2023
This is Official implementation for "Pose-guided Feature Disentangling for Occluded Person Re-Identification Based on Transformer" in AAAI2022

PFD:Pose-guided Feature Disentangling for Occluded Person Re-identification based on Transformer This repo is the official implementation of "Pose-gui

Tao Wang 93 Dec 18, 2022
Reviving Iterative Training with Mask Guidance for Interactive Segmentation

This repository provides the source code for training and testing state-of-the-art click-based interactive segmentation models with the official PyTorch implementation

Visual Understanding Lab @ Samsung AI Center Moscow 406 Jan 01, 2023
GT4SD, an open-source library to accelerate hypothesis generation in the scientific discovery process.

The GT4SD (Generative Toolkit for Scientific Discovery) is an open-source platform to accelerate hypothesis generation in the scientific discovery process. It provides a library for making state-of-t

Generative Toolkit 4 Scientific Discovery 142 Dec 24, 2022
Classic Papers for Beginners and Impact Scope for Authors.

There have been billions of academic papers around the world. However, maybe only 0.0...01% among them are valuable or are worth reading. Since our limited life has never been forever, TopPaper provi

Qiulin Zhang 228 Dec 18, 2022
Learning Super-Features for Image Retrieval

Learning Super-Features for Image Retrieval This repository contains the code for running our FIRe model presented in our ICLR'22 paper: @inproceeding

NAVER 101 Dec 28, 2022
Code accompanying the paper "How Tight Can PAC-Bayes be in the Small Data Regime?"

How Tight Can PAC-Bayes be in the Small Data Regime? This is the code to reproduce all experiments for the following paper: @inproceedings{Foong:2021:

5 Dec 21, 2021
Rank1 Conversation Emotion Detection Task

Rank1-Conversation_Emotion_Detection_Task accuracy macro-f1 recall 0.826 0.7544 0.719 基于预训练模型和时序预测模型的对话情感探测任务 1 摘要 针对对话情感探测任务,本文将其分为文本分类和时间序列预测两个子任务,分

Yuchen Han 2 Nov 28, 2021
Data for "Driving the Herd: Search Engines as Content Influencers" paper

herding_data Data for "Driving the Herd: Search Engines as Content Influencers" paper Dataset description The collection contains 2250 documents, 30 i

0 Aug 17, 2021