Single object tracking and segmentation.

Last update: Jan 02, 2023

Overview

Single/Multiple Object Tracking and Segmentation

Code and comparisons for recent single- and multi-object tracking and segmentation methods.

News

💥 AutoMatch has been accepted to ICCV 2021. The training and testing code has been released in this codebase.

💥 CSTrack ranks 5/4000 at the Tianchi Global AI Competition.

💥 Ocean has been accepted to ECCV 2020. OceanPlus has been accepted by IEEE TIP.

💥 SiamDW has been accepted to CVPR 2019 and selected for an oral presentation.

Supported Trackers (SOT and MOT)

Single-Object Tracking (SOT)

Multi-Object Tracking (MOT)

CSTrack

Results Comparison

Comparison

Branches

main: for our SOT trackers
MOT: for our MOT trackers
v0: old codebase supporting OceanPlus and TensorRT testing.

Please clone the branch that fits your needs.

Structure

experiments: training and testing settings
demo: figures for readme
dataset: testing dataset
data: training dataset
lib: core scripts for all trackers
snapshot: pre-trained models
pretrain: models trained on ImageNet (for training)
tracking: training and testing interface

$SOTS
|—— experiments
|—— lib
|—— snapshot
  |—— xxx.model
|—— dataset
  |—— VOT2019.json 
  |—— VOT2019
     |—— ants1...
  |—— VOT2020
     |—— ants1...
|—— ...
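
Before training or testing, it can help to confirm that the working directory matches the layout above. The following is a minimal sketch, not part of the codebase: the helper name `missing_dirs` is hypothetical, and treating these four directories as required is an assumption based on the tree shown here.

```python
from pathlib import Path

# Directory names taken from the structure listing above. Treating all of
# them as required is an assumption made for this sketch.
EXPECTED_DIRS = ("experiments", "lib", "snapshot", "dataset")


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dirs(".")` from the project root returns an empty list when every directory is present, and otherwise lists the ones that still need to be created or downloaded.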

Tracker Details

AutoMatch [ICCV2021]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
AutoMatch replaces the core of Siamese tracking, i.e. cross-correlation and its variants, with a learnable matching network. The underlying motivation is that heuristic matching network design relies heavily on expert experience. Moreover, we find experimentally that a single matching operator can hardly guarantee stable tracking in all challenging environments. In this work, we introduce six novel matching operators from the perspective of feature fusion instead of explicit similarity learning, namely Concatenation, Pointwise-Addition, Pairwise-Relation, FiLM, Simple-Transformer and Transductive-Guidance, to widen the range of matching operators to choose from. Our analyses reveal that these operators adapt selectively to different types of environmental degradation, which inspires us to combine them to exploit complementary features. We propose binary channel manipulation (BCM) to search for the optimal combination of these operators.
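
For context, the fixed operator that AutoMatch moves away from can be sketched in a few lines. This is a naive single-channel cross-correlation in plain Python, written only to illustrate the idea; it is not the code used in this repository, and the function names `cross_correlate` and `peak` are hypothetical.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` and return the response map.

    Both inputs are 2-D lists of floats (a single feature channel).
    No padding is applied, so the output has size (H - h + 1, W - w + 1).
    """
    H, W = len(search), len(search[0])
    h, w = len(template), len(template[0])
    response = []
    for i in range(H - h + 1):
        row = []
        for j in range(W - w + 1):
            # Inner product between the template and the window at (i, j).
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(h)
                for dj in range(w)
            )
            row.append(score)
        response.append(row)
    return response


def peak(response):
    """Return the (row, col) position of the highest response."""
    best = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return best[1], best[2]
```

The peak of the response map marks where the search region most resembles the template. The learnable operators listed above replace this hand-designed similarity with fused features.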

Ocean [ECCV2020]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]

Ocean proposes a general anchor-free tracking framework. It includes a pixel-based anchor-free regression network to solve the weak rectification problem of RPN, and an object-aware classification network to learn robust target-related representations. Moreover, we introduce an effective multi-scale feature combination module to replace the heavy result fusion mechanism in recent Siamese trackers. This work also serves as the baseline model of OceanPlus. An additional TensorRT toy demo is provided in this repo.
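
To make the idea of pixel-based anchor-free regression concrete, here is a minimal sketch. It assumes the distance parameterization common to anchor-free detectors, where each pixel predicts its distances to the left, top, right and bottom sides of the box. The function names are hypothetical and the exact formulation in Ocean may differ.

```python
def decode_box(x, y, left, top, right, bottom):
    """Turn per-pixel distances to the four box sides into corner coordinates."""
    return (x - left, y - top, x + right, y + bottom)


def encode_box(x, y, box):
    """Inverse of decode_box: distances from pixel (x, y) to the sides of `box`.

    `box` is (x1, y1, x2, y2). All four distances are positive only when the
    pixel lies strictly inside the box.
    """
    x1, y1, x2, y2 = box
    return (x - x1, y - y1, x2 - x, y2 - y)
```

Because every pixel regresses the box directly, no predefined anchor shapes are needed, and a pixel that is off-center can still produce the correct box.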

SiamDW [CVPR2019]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
SiamDW is one of the pioneering works to use deep backbone networks in the Siamese tracking framework. Based on a thorough analysis of network depth, output size, receptive field and padding mode, we propose guidelines for building backbone networks for Siamese trackers. Several deeper and wider networks are built following these guidelines with the proposed CIR module.
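
The quantities analysed above (output size, receptive field, stride and padding) follow from standard convolution arithmetic. The sketch below computes them for a stack of square convolutions. The helper name `stack_geometry` and the example layer settings are illustrative assumptions, not values taken from the SiamDW backbones.

```python
def stack_geometry(input_size, layers):
    """Return (output_size, receptive_field, total_stride) for a conv stack.

    `layers` is a sequence of (kernel, stride, padding) tuples applied in order.
    """
    size, rf, jump = input_size, 1, 1
    for kernel, stride, padding in layers:
        # Standard output-size formula for a convolution.
        size = (size + 2 * padding - kernel) // stride + 1
        # Each layer widens the receptive field by (kernel - 1) input steps,
        # scaled by the stride accumulated so far.
        rf += (kernel - 1) * jump
        jump *= stride
    return size, rf, jump


# Illustrative stack: a 7x7 stride-2 layer, a 3x3 stride-2 layer, a 3x3 layer.
unpadded = stack_geometry(127, [(7, 2, 0), (3, 2, 0), (3, 1, 0)])
padded = stack_geometry(127, [(7, 2, 0), (3, 2, 0), (3, 1, 1)])
```

In this example `unpadded` is `(28, 19, 4)` and `padded` is `(30, 19, 4)`: padding the last layer preserves the feature-map size without changing the receptive field or stride.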

OceanPlus [IEEE TIP]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
Official implementation of the OceanPlus tracker. It proposes an attention retrieval network (ARN) to impose soft spatial constraints on backbone features. Concretely, we first build a look-up table (LUT) with the ground-truth mask in the starting frame, and then query the LUT to obtain a target-aware attention map that suppresses the negative influence of background clutter. Furthermore, we introduce a multi-resolution multi-stage segmentation network (MMS) to further weaken responses to background clutter by reusing the predicted mask to filter backbone features.

CSTrack [Arxiv now]

[Paper] [Training and Testing Tutorial] [Demo]
CSTrack proposes a strong ReID-based one-shot MOT framework. It includes a novel cross-correlation network that effectively pushes the separate branches to learn task-dependent representations, and a scale-aware attention network that learns discriminative embeddings to improve the ReID capability. This work also provides an analysis of the weak data association ability of one-shot MOT methods. Our improvements make the data association ability of our one-shot model comparable to that of two-stage methods while running faster.

This version can achieve the performance described in the paper (70.7 MOTA on MOT16, 70.6 MOTA on MOT17). The new version will be released soon. If you are interested in our work or have any questions, please contact me at [email protected].
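
For readers unfamiliar with the metric quoted above, MOTA follows the standard CLEAR MOT definition: one minus the ratio of all errors (misses, false positives and identity switches) to the number of ground-truth objects. The sketch below is only an illustration of that formula; the counts in the usage note are made up and are not results from CSTrack.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy, as a fraction (multiply by 100 for %).

    All counts are summed over every frame of the sequence(s) evaluated.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt
```

With hypothetical counts of 200 misses, 80 false positives and 20 identity switches over 1000 ground-truth boxes, `mota(200, 80, 20, 1000)` gives 0.7, i.e. 70.0 MOTA. The value can be negative when the errors outnumber the ground-truth objects.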

Other trackers, coming soon ...

☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️

References

https://github.com/StrangerZhang/pysot-toolkit
...

Contributors

Zhipeng Zhang
Chao Liang


Owner

ZP ZHANG
