A project for 3D multi-object tracking

Last update: Jan 04, 2023

Overview

3D Multi-Object Tracker

This project is developed for tracking multiple objects in 3D scenes. The visualization code is from here.

Features

Fast: the code can currently reach 700 FPS on CPU only (not including detection and data operations), and can track all Kitti val sequences in several seconds.
Supports both online and global implementations. The overall framework design is shown below:

Kitti Results

Results on the Kitti tracking val sequences [1,6,8,10,12,13,14,15,16,18,19] using second-iou and point-rcnn detections. We follow the HOTA metric, and tuned the parameters in this code primarily for HOTA performance.

Detector	HOTA	DetA	AssA	DetRe	DetPr	AssRe	AssPr	LocA	MOTA
second-iou	78.787	74.482	83.611	80.665	84.72	89.022	88.575	88.63	85.129
point-rcnn	78.91	75.814	82.406	83.489	82.185	87.209	87.586	87.308	88.412
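
As a rough sanity check on the table, recall that HOTA at a single localization threshold is the geometric mean of the detection and association accuracies at that threshold, and the final score averages over thresholds. The sketch below assumes the DetA and AssA columns are threshold-averaged values, as in the standard HOTA evaluation, so sqrt(DetA * AssA) computed from the table should sit slightly above the reported HOTA rather than match it exactly. The helper name `geometric_mean` is only illustrative and is not part of this project.

```python
import math

# Values copied from the results table above: (HOTA, DetA, AssA).
reported = {
    "second-iou": (78.787, 74.482, 83.611),
    "point-rcnn": (78.91, 75.814, 82.406),
}


def geometric_mean(det_a, ass_a):
    """Geometric mean of the detection and association accuracies."""
    return math.sqrt(det_a * ass_a)


for name, (hota, det_a, ass_a) in reported.items():
    bound = geometric_mean(det_a, ass_a)
    # The averaged HOTA should be close to, and not above, this value.
    print(f"{name}: reported HOTA = {hota:.3f}, sqrt(DetA * AssA) = {bound:.3f}")
```

For both detectors the two numbers differ by well under one point, which is consistent with the columns coming from the same evaluation run.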

Prepare data

You can download the Kitti tracking pose data from here, and the point-rcnn and second-iou detections from here.

To run this code, organize the Kitti tracking dataset as below:

# Kitti Tracking Dataset       
└── kitti_tracking
       ├── testing 
       |      ├──calib
       |      |    ├──0000.txt
       |      |    ├──....txt
       |      |    └──0028.txt
       |      ├──image_02
       |      |    ├──0000
       |      |    ├──....
       |      |    └──0028
       |      ├──pose
       |      |    ├──0000
       |      |    |    └──pose.txt
       |      |    ├──....
       |      |    └──0028
       |      |         └──pose.txt
       |      ├──label_02
       |      |    ├──0000.txt
       |      |    ├──....txt
       |      |    └──0028.txt
       |      └──velodyne
       |           ├──0000
       |           ├──....
       |           └──0028      
       └── training # same structure as the testing set
              ├──calib
              ├──image_02
              ├──pose
              ├──label_02
              └──velodyne
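
Before running the tracker, it can help to confirm that the folders in the tree above are in place. The following is a minimal sketch, not part of this project: the function name `missing_kitti_dirs` and the example root `kitti_tracking` are placeholders, and it only checks that the five sub-folders exist under each split, not their contents.

```python
from pathlib import Path

# Sub-folders shown in the tree above, expected under both splits.
EXPECTED_SUBDIRS = ("calib", "image_02", "pose", "label_02", "velodyne")


def missing_kitti_dirs(root):
    """Return the expected folders that do not exist under `root`."""
    root = Path(root)
    missing = []
    for split in ("training", "testing"):
        for sub in EXPECTED_SUBDIRS:
            folder = root / split / sub
            if not folder.is_dir():
                missing.append(folder)
    return missing


# "kitti_tracking" is a placeholder; point it at your own dataset root.
for folder in missing_kitti_dirs("kitti_tracking"):
    print("missing:", folder)
```

An empty result means every expected folder was found; anything printed is a path to create or fix before editing the yaml config.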

Detections

└── point-rcnn
       ├── training
       |      ├──0000
       |      |    ├──000001.txt
       |      |    ├──....txt
       |      |    └──000153.txt
       |      ├──...
       |      └──0020
       └──testing
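
The detection layout above (one folder per sequence, one text file per frame) can be inspected with a short script. This is a sketch under the assumption that frame files are named by their zero-padded frame number, as in the tree; `list_detection_frames` is a hypothetical helper, and the format of each detection file is not parsed here.

```python
from pathlib import Path


def list_detection_frames(det_root, split="training"):
    """Map each sequence folder (e.g. "0000") to its sorted frame numbers."""
    split_dir = Path(det_root) / split
    frames = {}
    if not split_dir.is_dir():
        return frames
    for seq_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        # Frame files are named like 000001.txt; keep only numeric stems.
        ids = sorted(int(f.stem) for f in seq_dir.glob("*.txt") if f.stem.isdigit())
        frames[seq_dir.name] = ids
    return frames


# "point-rcnn" is the detections folder from the tree above; adjust the path as needed.
for seq, ids in list_detection_frames("point-rcnn").items():
    print(f"sequence {seq}: {len(ids)} frames with detections")
```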

Requirements

python3
numpy
opencv
yaml
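
A quick way to check that the listed packages are available is to look up their import names. The sketch below assumes the usual import names `numpy`, `cv2` (for opencv) and `yaml`; on pip these are commonly installed as numpy, opencv-python and pyyaml, though your environment may use other distributions. The helper `missing_modules` is illustrative only.

```python
import importlib.util

# Import names for the listed requirements (opencv is imported as cv2).
REQUIRED_MODULES = ("numpy", "cv2", "yaml")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("missing modules:", missing_modules() or "none")
```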

Quick start

Modify the dataset path and detections path in the yaml file to point to your own data.
Then run python3 kitti_3DMOT.py config/point_rcnn_mot.yaml
The results are automatically saved to evaluation\results\sha_key\data and evaluated with the HOTA metrics.

Notes

The evaluation code is copied from Kitti.
