Single object tracking and segmentation.

Single/Multiple Object Tracking and Segmentation

Code and comparisons for recent single/multiple object tracking and segmentation methods.

News

💥 AutoMatch is accepted by ICCV2021. The training and testing code has been released in this codebase.

💥 CSTrack ranks 5/4000 at Tianchi Global AI Competition.

💥 Ocean is accepted by ECCV2020. [OceanPlus] is accepted by IEEE TIP.

💥 SiamDW is accepted by CVPR2019 and selected as oral presentation.

Supported Trackers (SOT and MOT)

Single-Object Tracking (SOT)

Multi-Object Tracking (MOT)

Results Comparison

Branches

  • main: for our SOT trackers
  • MOT: for our MOT trackers
  • v0: old codebase supporting OceanPlus and TensorRT testing.

Please clone the branch that fits your needs.

Structure

  • experiments: training and testing settings
  • demo: figures for readme
  • dataset: testing dataset
  • data: training dataset
  • lib: core scripts for all trackers
  • snapshot: pre-trained models
  • pretrain: models trained on ImageNet (for training)
  • tracking: training and testing interface
$SOTS
|—— experiments
|—— lib
|—— snapshot
  |—— xxx.model
|—— dataset
  |—— VOT2019.json
  |—— VOT2019
     |—— ants1...
  |—— VOT2020
     |—— ants1...
|—— ...
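
To confirm that a checkout matches the layout above, one option is to check for the expected top-level folders. This is a minimal sketch only: the helper name `missing_dirs` is hypothetical and not part of the codebase, and the folder names are taken from the tree above.

```python
from pathlib import Path
from typing import List

# Top-level folders shown in the directory tree above.
EXPECTED_DIRS = ["experiments", "lib", "snapshot", "dataset"]


def missing_dirs(root: str, expected: List[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected folders that are absent under `root` (hypothetical helper)."""
    base = Path(root)
    return [name for name in expected if not (base / name).is_dir()]


if __name__ == "__main__":
    # "." stands in for $SOTS; point it at your own checkout.
    print(missing_dirs("."))
```

An empty list means every folder in the tree was found.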

Tracker Details

AutoMatch [ICCV2021]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
AutoMatch replaces the essence of Siamese tracking, i.e. the cross-correlation and its variants, with a learnable matching network. The underlying motivation is that heuristic matching network design relies heavily on expert experience. Moreover, we experimentally find that a single matching operator can hardly guarantee stable tracking in all challenging environments. In this work, we introduce six novel matching operators from the perspective of feature fusion instead of explicit similarity learning, namely Concatenation, Pointwise-Addition, Pairwise-Relation, FiLM, Simple-Transformer and Transductive-Guidance, to explore more possibilities in matching operator selection. The analyses reveal these operators' selective adaptability to different types of environmental degradation, which inspires us to combine them to exploit complementary features. We propose binary channel manipulation (BCM) to search for the optimal combination of these operators.
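
To give a feel for what "matching as feature fusion" can mean, the toy sketch below implements three of the operators named above on a single channel vector. It is not the AutoMatch implementation: the function names are hypothetical, the inputs are plain Python lists rather than feature maps, and the FiLM parameters `gamma` and `beta` are passed in directly, whereas a real model would presumably predict them from the template with learned layers.

```python
from typing import List

Vector = List[float]


def concatenation(search: Vector, template: Vector) -> Vector:
    """Stack the two channel vectors; a learned layer would normally follow."""
    return search + template


def pointwise_addition(search: Vector, template: Vector) -> Vector:
    """Add template and search features channel by channel."""
    return [s + t for s, t in zip(search, template)]


def film(search: Vector, gamma: Vector, beta: Vector) -> Vector:
    """FiLM-style modulation: scale and shift each search channel.

    gamma and beta are given here for illustration (an assumption);
    they are not derived from a template in this sketch.
    """
    return [g * s + b for s, g, b in zip(search, gamma, beta)]


search = [1.0, 2.0]
template = [0.5, -1.0]
print(concatenation(search, template))                  # [1.0, 2.0, 0.5, -1.0]
print(pointwise_addition(search, template))             # [1.5, 1.0]
print(film(search, gamma=[2.0, 0.0], beta=[0.0, 1.0]))  # [2.0, 1.0]
```

None of these operators computes an explicit similarity score; each one only merges template information into the search features and leaves the interpretation to later layers.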

Ocean [ECCV2020]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]

Ocean proposes a general anchor-free tracking framework. It includes a pixel-based anchor-free regression network to solve the weak rectification problem of RPN, and an object-aware classification network to learn robust target-related representations. Moreover, we introduce an effective multi-scale feature combination module to replace the heavy result fusion mechanism in recent Siamese trackers. This work also serves as the baseline model of OceanPlus. An additional TensorRT toy demo is provided in this repo.
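
As a rough illustration of pixel-based anchor-free regression, the sketch below assumes the common formulation in which each pixel predicts its distances to the four sides of the target box. The function name `decode_box` and the (left, top, right, bottom) ordering are assumptions made for this example and are not taken from the Ocean code.

```python
from typing import Tuple


def decode_box(x: float, y: float, l: float, t: float, r: float, b: float
               ) -> Tuple[float, float, float, float]:
    """Turn per-pixel side distances into corner coordinates (x1, y1, x2, y2)."""
    return (x - l, y - t, x + r, y + b)


# A pixel at (50, 40) that predicts distances 10, 5, 20, 15 to the box sides.
print(decode_box(50.0, 40.0, 10.0, 5.0, 20.0, 15.0))  # (40.0, 35.0, 70.0, 55.0)
```

Since every pixel regresses the box directly, no predefined anchor shapes are needed.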

SiamDW [CVPR2019]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
SiamDW is one of the pioneering works using deep backbone networks for the Siamese tracking framework. Based on a thorough analysis of network depth, output size, receptive field and padding mode, we propose guidelines for building backbone networks for Siamese trackers. Several deeper and wider networks are built following the guidelines with the proposed CIR module.
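
Two of the quantities analysed above, receptive field and output size, can be computed with standard convolution arithmetic. The sketch below is a generic calculator under stated assumptions: the layer stack `toy` is hypothetical and is not a SiamDW backbone, padding is assumed to be zero, and 127 is only an example input size.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int]]) -> int:
    """Receptive field of stacked conv layers given (kernel, stride) pairs."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= stride             # strides compound the step between neighbouring outputs
    return rf


def output_size(input_size: int, layers: List[Tuple[int, int]]) -> int:
    """Spatial output size when no padding is used."""
    size = input_size
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1
    return size


toy = [(7, 2), (3, 2), (3, 1)]  # hypothetical stack, for illustration only
print(receptive_field(toy))   # 19
print(output_size(127, toy))  # 28
```

Adding layers or increasing stride grows the receptive field and shrinks the output map, which is the kind of trade-off the guidelines address.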

OceanPlus [IEEE TIP]

[Paper] [Raw Results] [Training and Testing Tutorial] [Demo]
Official implementation of the OceanPlus tracker. It proposes an attention retrieval network (ARN) to impose soft spatial constraints on backbone features. Concretely, we first build a look-up table (LUT) with the ground-truth mask in the starting frame, and then query the LUT to obtain a target-aware attention map for suppressing the negative influence of background clutter. Furthermore, we introduce a multi-resolution multi-stage segmentation network (MMS) to further weaken responses of background clutter by reusing the predicted mask to filter backbone features.
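
The build-then-retrieve idea can be pictured with a generic attention-style lookup. The sketch below is an assumption-laden toy, not the ARN from the paper: features are short Python lists, similarity is a plain dot product, and the names `build_lut` and `retrieve_attention` are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]


def build_lut(features: List[Vector], mask: List[int]) -> List[Tuple[Vector, int]]:
    """Pair each first-frame feature with its ground-truth mask label (1 = target)."""
    return list(zip(features, mask))


def retrieve_attention(query: Vector, lut: List[Tuple[Vector, int]]) -> float:
    """Softmax-weighted average of stored mask labels, weighted by dot-product similarity."""
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in lut]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return sum(w * label for w, (_, label) in zip(weights, lut)) / total


# One target feature and one background feature from the starting frame.
lut = build_lut([[1.0, 0.0], [0.0, 1.0]], [1, 0])
print(retrieve_attention([3.0, 0.0], lut) > 0.5)  # True: query resembles the target
print(retrieve_attention([0.0, 3.0], lut) > 0.5)  # False: query resembles background
```

Running the lookup for every location in a later frame yields a map that is high where features resemble the first-frame target and low on background.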

CSTrack [arXiv now]

[Paper] [Training and Testing Tutorial] [Demo]
CSTrack proposes a strong ReID-based one-shot MOT framework. It includes a novel cross-correlation network that effectively drives the separate branches to learn task-dependent representations, and a scale-aware attention network that learns discriminative embeddings to improve the ReID capability. This work also provides an analysis of the weak data association ability in one-shot MOT methods. Our improvements make the data association ability of our one-shot model comparable to two-stage methods while running faster.
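
To make "data association" with ReID embeddings concrete, the sketch below matches existing tracks to new detections by cosine similarity. It is a deliberately simplified stand-in and not the CSTrack association step: it uses greedy matching, ignores motion cues, and the function names and the 0.5 threshold are assumptions chosen for illustration.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embeddings (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def greedy_associate(tracks: List[Vector], detections: List[Vector],
                     threshold: float = 0.5) -> Dict[int, int]:
    """Map track index -> detection index, taking the most similar pairs first."""
    pairs = sorted(
        ((cosine(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, ti, di in pairs:
        if score < threshold:
            break  # remaining pairs are even less similar
        if ti not in matches and di not in used:
            matches[ti] = di
            used.add(di)
    return matches


tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1]]
print(sorted(greedy_associate(tracks, detections).items()))  # [(0, 1), (1, 0)]
```

The more discriminative the embeddings, the larger the gap between correct and incorrect pairs, and the less often identities get swapped.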

This version can achieve the performance described in the paper (70.7 MOTA on MOT16, 70.6 MOTA on MOT17). The new version will be released soon. If you are interested in our work or have any questions, please contact me at [email protected].

Other trackers, coming soon ...

☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️ ☁️

References

https://github.com/StrangerZhang/pysot-toolkit
...

Contributors

Owner
ZP ZHANG
NLPR, CASIA. Ph.D. candidate