PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)

Last update: Dec 13, 2022

Overview

PSTR (CVPR2022)

This code is an official implementation of "PSTR: End-to-End One-Step Person Search With Transformers (CVPR2022)".
PSTR performs end-to-end one-step person search with Transformers and does not require NMS post-processing.
Pre-trained models are provided with ResNet50, ResNet50-DCN, and PVTv2-B2 backbones.
plot_cuhk.py plots the curves of different methods on CUHK-SYSU under different gallery sizes. If you want to add new results, please feel free to contact us.

Installation

We installed this project with CUDA 11.1 and PyTorch 1.8.0 (or PyTorch 1.9.0) as follows.

# Download this project
git clone https://github.com/JialeCao001/PSTR.git

# Create a new conda environment for PSTR
conda create -n pstr python=3.7 -y
conda activate pstr
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
#conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge

# Compile mmcv, which is included in this project
cd PSTR/mmcv
MMCV_WITH_OPS=1 pip install -e .

# Compile this project
cd ..  # return to the PSTR root from PSTR/mmcv
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"
pip install scikit-learn  # the "sklearn" name on PyPI is deprecated

If you hit the error "local variable 'beta1' referenced before assignment" with PyTorch 1.8, add one tab of indentation to line 110 of optim/adamw.py.

Train and Inference

Datasets and Annotations

Download PRW and CUHK-SYSU datasets.
Download the json annotations provided by AlignPS.

Train with a single GPU

python tools/train.py ${CONFIG_FILE} --no-validate

The PSTR config files (CONFIG_FILE) are in configs/PSTR.
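
As a convenience, the training command can be assembled for each config found in configs/PSTR. This is only a minimal sketch: the helper names list_configs and build_train_command are hypothetical and not part of this repository, and it assumes the configs are .py files, as is usual for mmdetection-based projects.

```python
from pathlib import Path
from typing import List, Union


def list_configs(config_dir: str = "configs/PSTR") -> List[Path]:
    """Return the config files in config_dir, sorted by path.

    Assumption: configs are Python files (*.py), as in mmdetection.
    """
    return sorted(Path(config_dir).glob("*.py"))


def build_train_command(config_file: Union[str, Path]) -> List[str]:
    """Build the single-GPU training command for one config file."""
    return ["python", "tools/train.py", str(config_file), "--no-validate"]
```

Run from the project root, " ".join(build_train_command(cfg)) for each cfg in list_configs() gives one command line per config, which can be pasted into a shell or passed to subprocess.run.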

Test with a single GPU

PRW: sh run_test_prw.sh 
CUHK: sh run_test_cuhk.sh

To output the results of a different model, change CONFIGPATH, MODELPATH, and OUTPATH in the corresponding script.

Results

We provide models with different backbones and their results on the PRW and CUHK-SYSU datasets. The results differ slightly from the CVPR version due to jitter.

name	dataset	backbone	mAP	top-1	mAP+	top-1+	download
PSTR	PRW	PVTv2-B2	57.46	90.57	58.07	92.03	model
PSTR	PRW	ResNet50	50.03	88.04	50.64	89.94	model
PSTR	PRW	ResNet50-DCN	51.09	88.33	51.62	90.13	model
PSTR	CUHK-SYSU	PVTv2-B2	95.31	96.28	95.78	96.83	model
PSTR	CUHK-SYSU	ResNet50	93.55	94.93	94.16	95.48	model
PSTR	CUHK-SYSU	ResNet50-DCN	94.22	95.28	94.90	95.97	model

All the models are based on multi-scale training and all the results are based on single-scale inference.
+ indicates adding a re-scoring module during evaluation, where the final matching score is a weighted combination of the CBGM score and the original matching score.
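
The re-scoring idea can be illustrated with a small sketch. The README does not give the weights or the exact formula, so the convex combination below, the function name rescore, and the parameter weight are all assumptions made for illustration, not the repository's implementation.

```python
from typing import List, Sequence


def rescore(original: Sequence[float],
            cbgm: Sequence[float],
            weight: float = 0.5) -> List[float]:
    """Blend CBGM scores with the original matching scores.

    Assumption: a convex combination, with `weight` on the CBGM score and
    (1 - weight) on the original score. The value used by PSTR is not
    stated in this README.
    """
    if len(original) != len(cbgm):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * c + (1.0 - weight) * o for o, c in zip(original, cbgm)]
```

With weight=0.0 the original matching scores are returned unchanged, and with weight=1.0 only the CBGM scores are used; intermediate values trade off the two.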

Citation

If the project helps your research, please cite this paper.

@article{Cao_PSTR_CVPR_2022,
  author =       {Jiale Cao and Yanwei Pang and Rao Muhammad Anwer and Hisham Cholakkal and Jin Xie and Mubarak Shah and Fahad Shahbaz Khan},
  title =        {PSTR: End-to-End One-Step Person Search With Transformers},
  journal =      {Proc. IEEE Conference on Computer Vision and Pattern Recognition},
  year =         {2022}
}

Acknowledgement

Many thanks to the open-source projects mmdetection, AlignPS, and SeqNet.
