PyTorch Implementation of PIXOR: Real-time 3D Object Detection from Point Clouds

Last update: Dec 14, 2022

Overview

PIXOR: Real-time 3D Object Detection from Point Clouds

This is a custom implementation of the paper from Uber ATG using PyTorch 1.0. It represents the driving scene using lidar data in the bird's-eye view (BEV) and uses a single-stage object detector to predict the poses of road objects with respect to the car.
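
To make the BEV representation concrete, here is a minimal NumPy sketch of turning a lidar point cloud into a binary occupancy grid. It is not the C++ preprocessing shipped in this repository: the function name `lidar_to_bev`, the default ranges and the 0.1 m resolution are assumptions chosen for illustration, the axis convention (x forward, y left, z up) is assumed from the KITTI Velodyne frame, and reflectance is ignored.

```python
import numpy as np


def lidar_to_bev(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                 z_range=(-2.5, 1.0), resolution=0.1):
    """Voxelize lidar points of shape (N, >=3) into a binary BEV occupancy grid.

    Returns a float32 array of shape (H, W, C): H indexes y, W indexes x
    and C indexes height slices along z. All defaults are illustrative.
    """
    points = np.asarray(points, dtype=np.float64)

    # Grid axes are ordered (y, x, z), so reorder the point columns to match.
    ranges = (y_range, x_range, z_range)
    coords = points[:, [1, 0, 2]]
    lows = np.array([lo for lo, _ in ranges])
    highs = np.array([hi for _, hi in ranges])

    # Number of cells along each axis.
    shape = tuple(int(round((hi - lo) / resolution)) for lo, hi in ranges)

    # Keep only points inside the region of interest.
    inside = np.all((coords >= lows) & (coords < highs), axis=1)

    # Convert metric coordinates to integer cell indices.
    idx = np.floor((coords[inside] - lows) / resolution).astype(np.int64)
    # Guard against float rounding at the upper edge of the grid.
    idx = np.minimum(idx, np.array(shape) - 1)

    # Mark every cell that contains at least one point.
    bev = np.zeros(shape, dtype=np.float32)
    bev[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return bev
```

A grid like this can be fed to a 2D convolutional detector by treating the height slices as input channels, which is what makes a single-stage BEV detector cheap compared with full 3D convolutions.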

Highlights

PyTorch 1.0: Reproduced and trained from scratch using the KITTI dataset
Fast: Custom LiDAR preprocessing using C++
Multi-GPU: Multi-GPU training, plus the PyTorch multiprocessing package to speed up non-maximum suppression during evaluation
Tensorboard: Visualize training progress using Tensorboard
KITTI and ROSBAG Demo: Scripts that support running inference directly on raw KITTI data or custom rosbags.
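
As a rough illustration of the multiprocessing point above, the sketch below runs greedy non-maximum suppression on each frame in a separate worker process. It is a simplified stand-in, not this repository's evaluation code: it uses the standard-library `multiprocessing.Pool` (`torch.multiprocessing` wraps the same interface), axis-aligned boxes instead of rotated BEV boxes, and the names `iou`, `nms` and `nms_per_frame` are hypothetical.

```python
from multiprocessing import Pool


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over a list of (score, box); returns the kept detections."""
    kept = []
    # Visit detections from highest to lowest score.
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


def nms_per_frame(frames, processes=2):
    """Run NMS on every frame's detections, one frame per worker task."""
    with Pool(processes=processes) as pool:
        return pool.map(nms, frames)
```

Frames are independent during evaluation, so parallelizing over frames needs no shared state. When calling `nms_per_frame` from a script, place the call under `if __name__ == "__main__":` so it also works on platforms that start workers by spawning a fresh interpreter.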

Install

Dependencies:

Python 3.5 or 3.6
PyTorch (follow the official installation guideline)
TensorFlow (see the TensorFlow website)
NumPy, Matplotlib, OpenCV 3
PyKitti (for running on KITTI raw dataset)
gcc

pip install shapely numpy matplotlib
git clone https://github.com/philip-huang/PIXOR
cd PIXOR/srcs/preprocess
make

(Optional) To run this project on a custom rosbag containing Velodyne HDL64 scans, the system must be Linux with ROS Kinetic installed. You also need to install the Velodyne driver into the velodyne_ws folder.

Set up the Velodyne workspace by running ./velodyne_setup.bash, pressing Ctrl-C as necessary.

Demo

A helper class is provided in run_kitti.py to simplify writing inference pipelines using pre-trained models. The example below shows one way to use it. Run it from the srcs folder, assuming the KITTI raw data has already been downloaded and extracted.

import os

import pykitti

from run_kitti import *  # provides run()

def make_kitti_video():
    # Location of the extracted KITTI raw data; adjust to your setup.
    basedir = '/mnt/ssd2/od/KITTI/raw'
    date = '2011_09_26'
    drive = '0035'
    dataset = pykitti.raw(basedir, date, drive)

    # Path of the output video, stored next to the synced drive data.
    videoname = "detection_{}_{}.avi".format(date, drive)
    save_path = os.path.join(basedir, date, "{}_drive_{}_sync".format(date, drive), videoname)
    run(dataset, save_path)

make_kitti_video()

Training and Evaluation

Our Training Result (as of Dec 2018)

All configuration (hyperparameters, GPUs, etc.) should be put in a config.json file saved in the directory srcs/experiments/$exp_name$.

To train

python srcs/main.py train (--name=$exp_name$)
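
For orientation, the config.json layout described above could be created and read back with a few lines of Python. This is only a sketch: the helper names `config_path`, `save_config` and `load_config` are hypothetical, and so are any keys you pass in (for example `learning_rate`, `batch_size` or `gpu_ids`); check the experiment folders in the repository for the fields main.py actually expects.

```python
import json
import os


def config_path(exp_name, root="srcs/experiments"):
    """Return the path of an experiment's config file: <root>/<exp_name>/config.json."""
    return os.path.join(root, exp_name, "config.json")


def save_config(config, exp_name, root="srcs/experiments"):
    """Write a config dict to the experiment directory, creating it if needed."""
    path = config_path(exp_name, root)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(config, f, indent=4)
    return path


def load_config(exp_name, root="srcs/experiments"):
    """Read an experiment's config.json back into a dict."""
    with open(config_path(exp_name, root)) as f:
        return json.load(f)
```

Keeping one directory per experiment name means the same `--name` value selects the configuration for training, validation and testing.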

To evaluate an experiment

python srcs/main.py val (--name=$exp_name$)

To display a sample result

python srcs/main.py test --name=$exp_name$

To view tensorboard

tensorboard --logdir=srcs/logs/$exp_name$

TODO

Improve training accuracy on KITTI dataset
Data augmentation
Generalization gap on custom driving sequences
Data Collection
Improve model (possible idea: use map as a prior)

Credits

Project Contributors

Philip Huang
Allan Liu

Paper citation:

@inproceedings{yang2018pixor,
  title={PIXOR: Real-Time 3D Object Detection From Point Clouds},
  author={Yang, Bin and Luo, Wenjie and Urtasun, Raquel}
}

We would like to thank aUToronto for generously sponsoring GPUs for this project.
