Official implementation of deep-multi-trajectory-based single object tracking (IEEE T-CSVT 2021).

Last update: Dec 03, 2022

Overview

DeepMTA_PyTorch

Official PyTorch implementation of "Dynamic Attention-guided Multi-Trajectory Analysis for Single Object Tracking", Xiao Wang, Zhe Chen, Jin Tang, Bin Luo, Yaowei Wang, Yonghong Tian, Feng Wu, IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT 2021) [Paper] [Project]

Abstract:

Most existing single object trackers track the target in a unitary local search window, making them particularly vulnerable to challenging factors such as heavy occlusions and out-of-view movements. Despite attempts to further incorporate global search, prevailing mechanisms that combine local and global search are relatively static and thus remain sub-optimal for improving tracking performance. By further studying the local and global search results, we raise a question: can we allow more dynamics when combining both results? In this paper, we propose to introduce more dynamics by devising a dynamic attention-guided multi-trajectory tracking strategy. In particular, we construct a dynamic appearance model that contains multiple target templates, each of which provides its own attention for locating the target in the new frame. Guided by different attention, we maintain diversified tracking results for the target to build a multi-trajectory tracking history, allowing more candidates to represent the true target trajectory. After spanning the whole sequence, we introduce a multi-trajectory selection network to find the best trajectory, which delivers improved tracking performance. Extensive experimental results show that our proposed tracking strategy achieves compelling performance on various large-scale tracking benchmarks.

Our Proposed Approach:
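
The strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the code in this repository: `track_all`, `select_best`, `toy_locate` and `toy_score` are hypothetical names, and keeping exactly one trajectory per template is a simplifying assumption. In the real tracker the locating step is attention-guided search and the scoring step is the trained trajectory selection network.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Trajectory:
    """Tracking history produced under one template's attention."""
    template_id: int
    boxes: List[Box] = field(default_factory=list)


def track_all(frames: Sequence, templates: Sequence,
              locate: Callable[[object, object], Box]) -> List[Trajectory]:
    """Keep one trajectory per template (assumption made for this sketch)."""
    trajectories = [Trajectory(i) for i in range(len(templates))]
    for frame in frames:
        for traj, template in zip(trajectories, templates):
            # Each template yields its own result for the current frame.
            traj.boxes.append(locate(frame, template))
    return trajectories


def select_best(trajectories: List[Trajectory],
                score: Callable[[Trajectory], float]) -> Trajectory:
    """After the whole sequence, keep the highest-scoring trajectory."""
    return max(trajectories, key=score)


# Toy stand-ins so the sketch runs without any model (hypothetical).
def toy_locate(frame: float, template: float) -> Box:
    return (frame * template, 0.0, 10.0, 10.0)


def toy_score(traj: Trajectory) -> float:
    return sum(box[0] for box in traj.boxes)


frames = [0.0, 1.0, 2.0]
templates = [1.0, 2.0]
trajectories = track_all(frames, templates, toy_locate)
best = select_best(trajectories, toy_score)
print(best.template_id, best.boxes[-1])
```

Selection happens only after the full sequence has been processed, which mirrors the abstract's "after spanning the whole sequence" and is what lets a trajectory recover from a locally poor frame.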

Install:

git clone https://github.com/wangxiao5791509/DeepMTA_PyTorch
cd DeepMTA_TCSVT_project

# create the conda environment
conda env create -f environment.yml
conda activate deepmta

# build the vot toolkits
bash benchmark/make_toolkits.sh

Download Dataset and Model:

Download the pre-trained Traj-Evaluation-Network [Onedrive] and Dynamic-TANet-Model [Onedrive].

Get the OTB2015, GOT-10k, LaSOT, UAV123, UAV20L, and OxUvA datasets from [List].

Download the TNL2K dataset (published at CVPR 2021; 1300/700 sequences for the train/test subsets) from: https://sites.google.com/view/langtrackbenchmark/

Train:

You can directly use the pre-trained tracking model of THOR [github].
Train the Dynamic Target-aware Attention module:

cd ~/DeepMTA_TCSVT_project/trackers/dcynet_modules_adaptis/ 
python train.py

Train the Trajectory Evaluation Network:

python train_traj_measure_net.py

Tracking:

Taking the GOT-10k and LaSOT datasets as examples:

python testing.py -d GOT10k -t SiamRPN --lb_type ensemble

python testing.py -d LaSOT -t SiamRPN --lb_type ensemble
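
To run both evaluations in one go, the two commands above can be generated from a small Python helper. This is a convenience sketch rather than part of the repository: `build_test_command` is a hypothetical name, and only the flags and dataset names shown above are used.

```python
import shlex
from typing import List


def build_test_command(dataset: str, tracker: str = "SiamRPN",
                       lb_type: str = "ensemble") -> List[str]:
    """Assemble the argument list for one testing.py run (hypothetical helper)."""
    return ["python", "testing.py", "-d", dataset, "-t", tracker,
            "--lb_type", lb_type]


# Only the two datasets shown in this README are listed here.
for dataset in ("GOT10k", "LaSOT"):
    print(shlex.join(build_test_command(dataset)))
```

To actually launch a run, pass the list to `subprocess.run(cmd, check=True)` from the directory that contains `testing.py`.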

Benchmark Results:

Experimental results on the compared tracking benchmarks

[OTB2015] [LaSOT] [OxUvA] [GOT-10k] [UAV123] [TNL2K]

Tracking Results:

Tracking results on LaSOT dataset.

Tracking results on TNL2K dataset.

Attention prediction and tracking results.

Acknowledgement:

Our tracker is developed based on THOR, which was published at BMVC 2019 [Paper] [Code]

Other related works:

MTP: Multi-hypothesis Tracking and Prediction for Reduced Error Propagation, Xinshuo Weng, Boris Ivanovic, and Marco Pavone [Paper] [Code]
D.-Y. Lee, J.-Y. Sim, and C.-S. Kim, “Multihypothesis trajectory analysis for robust visual tracking,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 5088–5096. [Paper]
C. Kim, F. Li, A. Ciptadi, and J. M. Rehg, “Multiple hypothesis tracking revisited,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4696–4704. [Paper]

Citation:

If you find this paper useful for your research, please consider citing it:

@article{wang2021deepmta,
 title={Dynamic Attention-guided Multi-Trajectory Analysis for Single Object Tracking},
 author={Wang, Xiao and Chen, Zhe and Tang, Jin and Luo, Bin and Wang, Yaowei and Tian, Yonghong and Wu, Feng},
 journal={IEEE Transactions on Circuits and Systems for Video Technology},
 doi={10.1109/TCSVT.2021.3056684},
 year={2021}
}

If you have any questions about this work, please contact me via: [email protected] or [email protected]

Owner

Xiao Wang（王逍）
