LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

Last update: Dec 22, 2022

LiDAR Distillation

Paper | Model

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection
Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jiwen Lu, Jie Zhou

Introduction

In this paper, we propose LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection. In many real-world applications, the LiDAR sensors used by mass-produced robots and vehicles usually have fewer beams than those used to collect large-scale public datasets. Moreover, as LiDARs are upgraded to other product models with different numbers of beams, it becomes challenging to utilize the labeled data captured by previous versions’ high-resolution sensors. Despite recent progress on domain adaptive 3D detection, most methods struggle to eliminate the beam-induced domain gap.
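To make the beam gap concrete, the following is a minimal sketch of how a high-beam scan could be thinned to imitate a lower-beam sensor. It is an illustration only, not the code in this repository: it assumes beams are evenly spaced in elevation angle, and the field of view (-24.8° to 2.0°), the function names, and the synthetic scan are all hypothetical choices made for the example.

```python
import math

# Assumed vertical field of view in radians (illustrative values, not from this repo).
MIN_ELEV = math.radians(-24.8)
MAX_ELEV = math.radians(2.0)


def beam_index(point, num_beams, min_elev=MIN_ELEV, max_elev=MAX_ELEV):
    """Estimate which beam produced a point by binning its elevation angle.

    Assumes uniformly spaced beams; real sensors may use non-uniform spacing.
    """
    x, y, z = point[:3]
    elevation = math.atan2(z, math.hypot(x, y))
    frac = (elevation - min_elev) / (max_elev - min_elev)
    idx = int(frac * num_beams)
    # Clamp so points slightly outside the assumed field of view still get a bin.
    return min(max(idx, 0), num_beams - 1)


def downsample_beams(points, num_beams, keep_every=2):
    """Keep only points from every `keep_every`-th beam (e.g. 64 -> 32 beams)."""
    return [p for p in points if beam_index(p, num_beams) % keep_every == 0]


def make_synthetic_scan(num_beams, points_per_beam, radius=10.0):
    """Build a toy scan: each beam is a ring of points at a fixed elevation."""
    scan = []
    for b in range(num_beams):
        # Place each ring at the centre of its elevation bin.
        elev = MIN_ELEV + (b + 0.5) / num_beams * (MAX_ELEV - MIN_ELEV)
        for k in range(points_per_beam):
            azim = 2.0 * math.pi * k / points_per_beam
            scan.append((
                radius * math.cos(elev) * math.cos(azim),
                radius * math.cos(elev) * math.sin(azim),
                radius * math.sin(elev),
            ))
    return scan


if __name__ == "__main__":
    scan = make_synthetic_scan(num_beams=64, points_per_beam=8)
    low = downsample_beams(scan, num_beams=64, keep_every=2)
    print(len(scan), len(low))  # 512 256
```

Dropping every other beam halves the vertical resolution while leaving the points within each remaining beam untouched, which is the kind of sparsity difference a detector trained on high-beam data has to cope with.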

Model Zoo

Cross-dataset Adaptation

model	method	AP_BEV	AP_3D
SECOND-IoU	Direct transfer	32.91	17.24
SECOND-IoU	ST3D	35.92	20.19
SECOND-IoU	Ours	40.66	22.86
SECOND-IoU	Ours (w/ ST3D)	42.04	24.50
PV-RCNN	Direct transfer	34.50	21.47
PV-RCNN	ST3D	36.42	22.99
PV-RCNN	Ours	43.31	25.63
PV-RCNN	Ours (w/ ST3D)	44.08	26.37
PointPillars	Direct transfer	27.8	12.1
PointPillars	ST3D	30.6	15.6
PointPillars	Ours	40.23	19.12
PointPillars	Ours (w/ ST3D)	40.83	20.97

Results of cross-dataset adaptation from Waymo to nuScenes. We train on version 1.0 of the Waymo data.

Single-dataset Adaptation

beams	method	AP_BEV	AP_3D
32	Direct transfer	79.81	65.91
32	ST3D	71.29	57.57
32	Ours	82.22	70.15
32*	Direct transfer	73.56	57.77
32*	ST3D	67.08	53.30
32*	Ours	79.47	66.96
16	Direct transfer	64.91	47.48
16	ST3D	57.58	42.40
16	Ours	74.32	59.87
16*	Direct transfer	56.32	38.75
16*	ST3D	55.63	37.02
16*	Ours	70.43	55.24

Results of single-dataset adaptation on the KITTI dataset with PointPillars (moderate difficulty). For SECOND-IoU and PV-RCNN, we find that low-beam data easily triggers CUDA errors, which may be caused by a bug in spconv. Thus, we do not provide these models, but you can still run the experiments with the provided YAML configs.
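As a quick way to read the single-dataset table, the sketch below computes each method's absolute AP change relative to direct transfer for every beam setting. The numbers are copied from the table; the dictionary layout and the function name are illustrative and not part of this repository.

```python
# (beams, method) -> (AP_BEV, AP_3D), copied from the single-dataset table.
RESULTS = {
    ("32", "Direct transfer"): (79.81, 65.91),
    ("32", "ST3D"): (71.29, 57.57),
    ("32", "Ours"): (82.22, 70.15),
    ("32*", "Direct transfer"): (73.56, 57.77),
    ("32*", "ST3D"): (67.08, 53.30),
    ("32*", "Ours"): (79.47, 66.96),
    ("16", "Direct transfer"): (64.91, 47.48),
    ("16", "ST3D"): (57.58, 42.40),
    ("16", "Ours"): (74.32, 59.87),
    ("16*", "Direct transfer"): (56.32, 38.75),
    ("16*", "ST3D"): (55.63, 37.02),
    ("16*", "Ours"): (70.43, 55.24),
}


def gain_over_direct(results, method):
    """Return {beams: (delta AP_BEV, delta AP_3D)} of `method` vs. direct transfer."""
    gains = {}
    for (beams, name), (bev, ap3d) in results.items():
        if name != method:
            continue
        base_bev, base_3d = results[(beams, "Direct transfer")]
        # Round to two decimals to match the precision reported in the table.
        gains[beams] = (round(bev - base_bev, 2), round(ap3d - base_3d, 2))
    return gains


if __name__ == "__main__":
    for method in ("ST3D", "Ours"):
        for beams, (d_bev, d_3d) in gain_over_direct(RESULTS, method).items():
            print(f"{method:>5} {beams:>4}: AP_BEV {d_bev:+.2f}, AP_3D {d_3d:+.2f}")
```

On these numbers, the gain of "Ours" over direct transfer grows from +2.41 AP_BEV at 32 beams to +14.11 at 16*, while ST3D falls below direct transfer in every setting listed.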

Installation

Please refer to INSTALL.md.

Getting Started

Please refer to GETTING_STARTED.md.

License

Our code is released under the Apache 2.0 license.

Acknowledgement

Our code is heavily based on OpenPCDet v0.2 and ST3D. Thanks to the OpenPCDet Development Team for their awesome codebase.

Citation

If you find this project useful in your research, please consider citing:

@article{wei2022lidar,
  title={LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection},
  author={Wei, Yi and Wei, Zibu and Rao, Yongming and Li, Jiaxin and Zhou, Jie and Lu, Jiwen},
  journal={arXiv preprint arXiv:2203.14956},
  year={2022}
}

@misc{openpcdet2020,
    title={OpenPCDet: An Open-source Toolbox for 3D Object Detection from Point Clouds},
    author={OpenPCDet Development Team},
    howpublished = {\url{https://github.com/open-mmlab/OpenPCDet}},
    year={2020}
}
