RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving

Related tags

Deep LearningRTS3D
Overview

RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving (AAAI2021).

RTS3D is efficiency and accuracy stereo 3D object detection method for autonomous driving.

RTS3D

Introduction

RTS3D is the first true real-time system (FPS>24) for stereo image 3D detection meanwhile achieves 10% improvement in average precision comparing with the previous state-of-the-art method. RTS3D only require RGB images without synthetic data, instance segmentation, CAD model, or depth generator.

Highlights

  • Fast: 33 FPS of single image test speed in KITTI benchmark with 384*1280 resolution
  • Accuracy: SOTA on the KITTI benchmark.
  • Anchor Free: No 2D or 3D anchor are reauired
  • Easy to deploy: RTS3D uses conventional convolution operations and MLP, so it is very easy to deploy and accelerate.

RTS3D Baseline and Model Zoo

All experiments are tested with Ubuntu 16.04, Pytorch 1.0.0, CUDA 9.0, Python 3.6, single NVIDIA 2080Ti

IoU Setting 1: Car IoU > 0.5, Pedestrian IoU > 0.25, Cyclist IoU > 0.25

IoU Setting 2: Car IoU > 0.7, Pedestrian IoU > 0.5, Cyclist IoU > 0.5

  • Training on KITTI train split and evaluation on val split.
Class Iteration FPS AP BEV IoU Setting1 AP 3D IoU Setting1 AP BEV IoU Setting2 AP 3D IoU Setting2
- - - Easy / Moderate / Hard Easy / Moderate / Hard Easy / Moderate / Hard Easy / Moderate / Hard
Car- Recall-11 1 90.9 89.83, 77.05, 68.28 89.27, 70.12, 61.17 73.20, 53.62, 46.44 60.87, 42.38, 36.44
Car- Recall-40 1 90.9 92.92, 76.17, 66.62 90.35, 71.37, 63.52 78.12, 54.75, 47.09 60.34, 39.32, 32.97
Car- Recall-11 2 45.5 90.41, 78.70, 70.03 90.26, 77.23, 68.28 76.56, 56.46, 48.20 63.65, 44.50, 37.48
Car- Recall-40 2 45.5 95.75, 79.61, 69.69 93.57, 76.64, 66.72 78.12, 54.75, 47.09 63.99, 41.78, 34.96
  • Training on KITTI train split and evaluation on val split.
    • FCE Space Resolution: 10 * 10 * 10
    • Recall split: 11
    • Iteration: 2
    • Model: (Google Drive), (Baidu Cloud 提取码:4t4u)
Class AP BEV IoU Setting1 AP 3D IoU Setting1 AP BEV IoU Setting2 AP 3D IoU Setting2
- Easy / Moderate / Hard Easy / Moderate / Hard Easy / Moderate / Hard Easy / Moderate / Hard
Car 90.18, 78.46, 69.76 89.88, 76.64, 67.86 74.95, 54.07, 46.78 58.50, 39.74, 34.83
Pedestrian 57.12, 48.82, 40.88 56.36, 48.29, 40.22 32.16, 26.31, 21.28 26.95, 20.77, 19.74
Cyclist 54.48, 35.78, 30.80 53.86, 30.90, 30.52 33.59, 20.80, 20.14 31.05, 20.26, 18.93

Installation

Please refer to INSTALL.md

Dataset preparation

Please download the official KITTI 3D object detection dataset and organize the downloaded files as follows:

KM3DNet
├── kitti_format
│   ├── data
│   │   ├── kitti
│   │   |   ├── annotations
│   │   │   ├── calib /000000.txt .....
│   │   │   ├── image(left[0-7480] right[7481-14961] input augmentatiom)
│   │   │   ├── label /000000.txt .....
|   |   |   ├── train.txt val.txt trainval.txt
│   │   │   ├── mono_results /000000.txt .....
├── src
├── demo_kitti_format
├── readme
├── requirements.txt

Getting Started

Please refer to GETTING_STARTED.md to learn more usage about this project.

Acknowledgement

License

RTS3D is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from, CenterNet, iou3d and kitti_eval (KITTI dataset evaluation). Please refer to the original License of these projects (See NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@misc{2012.15072,
Author = {Peixuan Li, Shun Su, Huaici Zhao},
Title = {RTS3D: Real-time Stereo 3D Detection from 4D Feature-Consistency Embedding Space for Autonomous Driving},
Year = {2020},
Eprint = {arXiv:2012.15072},
}
SIR model parameter estimation using a novel algorithm for differentiated uniformization.

TenSIR Parameter estimation on epidemic data under the SIR model using a novel algorithm for differentiated uniformization of Markov transition rate m

The Spang Lab 4 Nov 30, 2022
Facial Expression Detection In The Realtime

The human's facial expressions is very important to detect thier emotions and sentiment. It can be very efficient to use to make our computers make interviews. Furthermore, we have robots now can det

Adel El-Nabarawy 4 Mar 01, 2022
In Search of Probeable Generalization Measures

In Search of Probeable Generalization Measures Exciting News! In Search of Probeable Generalization Measures has been accepted to the International Co

Mahdi S. Hosseini 6 Sep 11, 2022
High-fidelity 3D Model Compression based on Key Spheres

High-fidelity 3D Model Compression based on Key Spheres This repository contains the implementation of the paper: Yuanzhan Li, Yuqi Liu, Yujie Lu, Siy

5 Oct 11, 2022
VOLO: Vision Outlooker for Visual Recognition

VOLO: Vision Outlooker for Visual Recognition, arxiv This is a PyTorch implementation of our paper. We present Vision Outlooker (VOLO). We show that o

Sea AI Lab 876 Dec 09, 2022
Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper

Divide and Remaster Utility Tools Utility tools for the "Divide and Remaster" dataset, introduced as part of the Cocktail Fork problem paper The DnR d

Darius Petermann 46 Dec 11, 2022
Lava-DL, but with PyTorch-Lightning flavour

Deep learning project seed Use this seed to start new deep learning / ML projects. Built in setup.py Built in requirements Examples with MNIST Badges

Sami BARCHID 4 Oct 31, 2022
An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

This repository contains the SystemVerilog RTL, C++, HLS (Intel FPGA OpenCL to wrap RTL code) and Python needed to reproduce the numerical results in

Facebook Research 373 Dec 31, 2022
Code, final versions, and information on the Sparkfun Graphical Datasheets

Graphical Datasheets Code, final versions, and information on the SparkFun Graphical Datasheets. Generated Cells After Running Script Example Complete

SparkFun Electronics 102 Jan 05, 2023
Official implementation of Meta-StyleSpeech and StyleSpeech

Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation Dongchan Min, Dong Bok Lee, Eunho Yang, and Sung Ju Hwang This is an official code

min95 168 Dec 28, 2022
Deeply Supervised, Layer-wise Prediction-aware (DSLP) Transformer for Non-autoregressive Neural Machine Translation

Non-Autoregressive Translation with Layer-Wise Prediction and Deep Supervision Training Efficiency We show the training efficiency of our DSLP model b

Chenyang Huang 36 Oct 31, 2022
Demos of essentia classifiers hosted on replicate.ai

essentia-replicate-demos Demos of Essentia models hosted on replicate.ai's MTG site. The models Check our site for a complete list of the models avail

Music Technology Group - Universitat Pompeu Fabra 12 Nov 14, 2022
Implémentation en pyhton de l'article Depixelizing pixel art de Johannes Kopf et Dani Lischinski

Implémentation en pyhton de l'article Depixelizing pixel art de Johannes Kopf et Dani Lischinski

TableauBits 3 May 29, 2022
The Official Repository for "Generalized OOD Detection: A Survey"

Generalized Out-of-Distribution Detection: A Survey 1. Overview This repository is with our survey paper: Title: Generalized Out-of-Distribution Detec

Jingkang Yang 338 Jan 03, 2023
Styled Augmented Translation

SAT Style Augmented Translation Introduction By collecting high-quality data, we were able to train a model that outperforms Google Translate on 6 dif

139 Dec 29, 2022
Code for our paper "Interactive Analysis of CNN Robustness"

Perturber Code for our paper "Interactive Analysis of CNN Robustness" Datasets Feature visualizations: Google Drive Fine-tuning checkpoints as saved m

Stefan Sietzen 0 Aug 17, 2021
Official repository for "Intriguing Properties of Vision Transformers" (2021)

Intriguing Properties of Vision Transformers Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, & Ming-Hsuan Yang P

Muzammal Naseer 155 Dec 27, 2022
Bio-OFC gym implementation and Gym-Fly environment

Bio-OFC gym implementation and Gym-Fly environment This repository includes the gym compatible implementation of the Bio-OFC algorithm from the paper

Siavash Golkar 1 Nov 16, 2021
CVPR 2021

Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-image Translation [Paper] | [Poster] | [Codes] Yahui Liu1,3, Enver Sangineto1,

Yahui Liu 37 Sep 12, 2022
TrackFormer: Multi-Object Tracking with Transformers

TrackFormer: Multi-Object Tracking with Transformers This repository provides the official implementation of the TrackFormer: Multi-Object Tracking wi

Tim Meinhardt 321 Dec 29, 2022