Learning Spatio-Temporal Transformer for Visual Tracking

Related tags

Text Data & NLPStark
Overview

STARK

The official implementation of the paper Learning Spatio-Temporal Transformer for Visual Tracking STARK_Framework

Highlights

The strongest performances

Tracker LaSOT (AUC) GOT-10K (AO) TrackingNet (AUC)
STARK 67.1 68.8 82.0
TransT 64.9 67.1 81.4
TrDiMP 63.7 67.1 78.4
Siam R-CNN 64.8 64.9 81.2

Real-Time Speed

STARK-ST50 and STARK-ST101 run at 40FPS and 30FPS respectively on a Tesla V100 GPU.

End-to-End, Post-processing Free

STARK is an end-to-end tracking approach, which directly predicts one accurate bounding box as the tracking result.
Besides, STARK does not use any hyperparameters-sensitive post-processing, leading to stable performances.

Purely PyTorch-based Code

STARK is implemented purely based on the PyTorch.

Install the environment

Option1: Use the Anaconda

conda create -n stark python=3.6
conda activate stark
bash install.sh

Option2: Use the docker file

We provide the complete docker at here

Data Preparation

Put the tracking datasets in ./data. It should look like:

${STARK_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST

Run the following command to set paths for this project

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also modify paths by editing these two files

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing

Train STARK

Training with multiple GPUs using DDP

# STARK-S50
python tracking/train.py --script stark_s --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-S50
# STARK-ST50
python tracking/train.py --script stark_st1 --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST50 Stage1
python tracking/train.py --script stark_st2 --config baseline --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline  # STARK-ST50 Stage2
# STARK-ST101
python tracking/train.py --script stark_st1 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST101 Stage1
python tracking/train.py --script stark_st2 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline_R101  # STARK-ST101 Stage2

(Optionally) Debugging training with a single GPU

python tracking/train.py --script stark_s --config baseline --save_dir . --mode single

Test and evaluate STARK on benchmarks

  • LaSOT
python tracking/test.py stark_st baseline --dataset lasot --threads 32
python tracking/analysis_results.py # need to modify tracker configs and names
  • GOT10K-test
python tracking/test.py stark_st baseline_got10k_only --dataset got10k_test --threads 32
python lib/test/utils/transform_got10k.py --tracker_name stark_st --cfg_name baseline_got10k_only
  • TrackingNet
python tracking/test.py stark_st baseline --dataset trackingnet --threads 32
python lib/test/utils/transform_trackingnet.py --tracker_name stark_st --cfg_name baseline
  • VOT2020
    Before evaluating "STARK+AR" on VOT2020, please install some extra packages following external/AR/README.md
cd external/vot20/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
  • VOT2020-LT
cd external/vot20_lt/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh

Test FLOPs, Params, and Speed

# Profiling STARK-S50 model
python tracking/profile_model.py --script stark_s --config baseline
# Profiling STARK-ST50 model
python tracking/profile_model.py --script stark_st2 --config baseline
# Profiling STARK-ST101 model
python tracking/profile_model.py --script stark_st2 --config baseline_R101

Model Zoo

The trained models, the training logs, and the raw tracking results are provided in the model zoo

Acknowledgments

Owner
Multimedia Research
Multimedia Research at Microsoft Research Asia
Multimedia Research
This github repo is for Neurips 2021 paper, NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

NORESQA: Speech Quality Assessment using Non-Matching References This is a Pytorch implementation for using NORESQA. It contains minimal code to predi

Meta Research 36 Dec 08, 2022
中文生成式预训练模型

T5 PEGASUS 中文生成式预训练模型,以mT5为基础架构和初始权重,通过类似PEGASUS的方式进行预训练。 详情可见:https://kexue.fm/archives/8209 Tokenizer 我们将T5 PEGASUS的Tokenizer换成了BERT的Tokenizer,它对中文更

410 Jan 03, 2023
A Facebook Messenger Chatbot using NLP

A Facebook Messenger Chatbot using NLP This project is about creating a messenger chatbot using basic NLP techniques and models like Logistic Regressi

6 Nov 20, 2022
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

ALBERT ***************New March 28, 2020 *************** Add a colab tutorial to run fine-tuning for GLUE datasets. ***************New January 7, 2020

Google Research 3k Dec 26, 2022
Beyond Paragraphs: NLP for Long Sequences

Beyond Paragraphs: NLP for Long Sequences

AI2 338 Dec 02, 2022
Negative sampling for solving the unlabeled entity problem in NER. ICLR-2021 paper: Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition.

Negative Sampling for NER Unlabeled entity problem is prevalent in many NER scenarios (e.g., weakly supervised NER). Our paper in ICLR-2021 proposes u

Yangming Li 128 Dec 29, 2022
Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics.

Simple tool/toolkit for evaluating NLG (Natural Language Generation) offering various automated metrics. Jury offers a smooth and easy-to-use interface. It uses datasets for underlying metric computa

Open Business Software Solutions 129 Jan 06, 2023
nlp-tutorial is a tutorial for who is studying NLP(Natural Language Processing) using Pytorch

nlp-tutorial is a tutorial for who is studying NLP(Natural Language Processing) using Pytorch. Most of the models in NLP were implemented with less than 100 lines of code.(except comments or blank li

Tae-Hwan Jung 11.9k Jan 08, 2023
GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning

GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning GrammarTagger is an open-source toolkit for grammatical profiling for lan

Octanove Labs 27 Jan 05, 2023
Pytorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation This repository is the pytorch implementation of our paper: Hierarchical Cr

44 Jan 06, 2023
A python wrapper around the ZPar parser for English.

NOTE This project is no longer under active development since there are now really nice pure Python parsers such as Stanza and Spacy. The repository w

ETS 49 Sep 12, 2022
Script and models for clustering LAION-400m CLIP embeddings.

clustering-laion400m Script and models for clustering LAION-400m CLIP embeddings. Models were fit on the first million or so image embeddings. A subje

Peter Baylies 22 Oct 04, 2022
Learn meanings behind words is a key element in NLP. This project concentrates on the disambiguation of preposition senses. Therefore, we train a bert-transformer model and surpass the state-of-the-art.

New State-of-the-Art in Preposition Sense Disambiguation Supervisor: Prof. Dr. Alexander Mehler Alexander Henlein Institutions: Goethe University TTLa

Dirk Neuhäuser 4 Apr 06, 2022
A fast and lightweight python-based CTC beam search decoder for speech recognition.

pyctcdecode A fast and feature-rich CTC beam search decoder for speech recognition written in Python, providing n-gram (kenlm) language model support

Kensho 315 Dec 21, 2022
Deal or No Deal? End-to-End Learning for Negotiation Dialogues

Introduction This is a PyTorch implementation of the following research papers: (1) Hierarchical Text Generation and Planning for Strategic Dialogue (

Facebook Research 1.4k Dec 29, 2022
vits chinese, tts chinese, tts mandarin

vits chinese, tts chinese, tts mandarin 史上训练最简单,音质最好的语音合成系统

AmorTX 12 Dec 14, 2022
Part of Speech Tagging using Hidden Markov Model (HMM) POS Tagger and Brill Tagger

Part of Speech Tagging using Hidden Markov Model (HMM) POS Tagger and Brill Tagger In this project, our aim is to tune, compare, and contrast the perf

Chirag Daryani 0 Dec 25, 2021
Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense.

PythonTextObfuscator Takes a string and puts it through different languages in Google Translate a requested amount of times, returning nonsense. Requi

2 Aug 29, 2022
Uses Google's gTTS module to easily create robo text readin' on command.

Tool to convert text to speech, creating files for later use. TTRS uses Google's gTTS module to easily create robo text readin' on command.

0 Jun 20, 2021
Tokenizer - Module python d'analyse syntaxique et de grammaire, tokenization

Tokenizer Le Tokenizer est un analyseur lexicale, il permet, comme Flex and Yacc par exemple, de tokenizer du code, c'est à dire transformer du code e

Manolo 1 Aug 15, 2022