[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021

Last update: Jan 05, 2023

Overview

Pedestron

Pedestron is an MMDetection-based repository that focuses on advancing research on pedestrian detection. We provide a list of detectors, both general-purpose and pedestrian-specific, to train and test. We also provide pre-trained models and benchmarks of several detectors on different pedestrian detection datasets, along with processed annotations and scripts for processing the annotations of different pedestrian detection benchmarks. If you use Pedestron, please cite us (see the end of this document) and the other respective sources.

🔥 Updates 🔥

🧨 We have released PedesFormer, a Transformer-based pedestrian detection repo (particularly Swin Transformer), along with pre-trained models. Stay tuned for updates. 🧨

YouTube demo

Caltech and EuroCity Persons. Pre-Trained model available.

Leaderboards

Installation

For installation and the list of dependencies, we refer to the installation file. Clone this repo and follow the installation instructions. Alternatively, follow the step-by-step Google Colab instructions (please download the pre-trained models from the table in readme.md; the pre-trained model link on Google Colab is broken). Additionally, you can refer to the Google Doc file for step-by-step installation. To run a Docker image, please see the installation file.

List of detectors

Currently we provide configurations for the following detectors, with different backbones:

Cascade Mask-R-CNN
Faster R-CNN
RetinaNet
RetinaNet with Guided Anchoring
Hybrid Task Cascade (HTC)
MGAN
CSP

Following datasets are currently supported

Datasets Preparation

We refer to the Datasets preparation file for detailed instructions.

Benchmarking

Benchmarking of pre-trained models on pedestrian detection datasets (autonomous driving)

Detector	Dataset	Backbone	Reasonable	Heavy
Cascade Mask R-CNN	CityPersons	HRNet	7.5	28.0
Cascade Mask R-CNN	CityPersons	MobileNet	10.2	37.3
Faster R-CNN	CityPersons	HRNet	10.2	36.2
RetinaNet	CityPersons	ResNeXt	14.6	39.5
RetinaNet with Guided Anchoring	CityPersons	ResNeXt	11.7	41.5
Hybrid Task Cascade (HTC)	CityPersons	ResNeXt	9.5	35.8
MGAN	CityPersons	VGG	11.2	52.5
CSP	CityPersons	ResNet-50	10.9	41.3
Cascade Mask R-CNN	Caltech	HRNet	1.7	25.7
Cascade Mask R-CNN	EuroCity Persons	HRNet	4.4	21.3
Faster R-CNN	EuroCity Persons	HRNet	6.1	27.0

Benchmarking of pre-trained models on general human/person detection datasets

Detector	Dataset	Backbone	AP
Cascade Mask R-CNN	CrowdHuman	HRNet	84.1

Getting Started

Running a demo using a pre-trained model on a few images

A pre-trained model can be evaluated on sample images in the following way:

python tools/demo.py config checkpoint input_dir output_dir

Download one of our provided pre-trained models and place it in the models_pretrained folder. The demo can then be run using the following command:

python tools/demo.py configs/elephant/cityperson/cascade_hrnet.py ./models_pretrained/epoch_5.pth.stu demo/ result_demo/
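Before launching the demo, it can help to confirm that the three inputs it reads actually exist. The following is a minimal sketch, not part of Pedestron: `check_demo_inputs` is a hypothetical helper, and the paths are simply the ones from the command above.

```python
from pathlib import Path


def check_demo_inputs(config, checkpoint, input_dir):
    """Return a list of problems; an empty list means every input was found.

    Hypothetical helper for illustration only, not a Pedestron function.
    """
    problems = []
    if not Path(config).is_file():
        problems.append(f"config file not found: {config}")
    if not Path(checkpoint).is_file():
        problems.append(f"checkpoint not found: {checkpoint}")
    if not Path(input_dir).is_dir():
        problems.append(f"input directory not found: {input_dir}")
    return problems


# Paths taken from the demo command above; run this from the repository root.
for problem in check_demo_inputs(
    "configs/elephant/cityperson/cascade_hrnet.py",
    "./models_pretrained/epoch_5.pth.stu",
    "demo/",
):
    print(problem)
```

If nothing is printed, the config, checkpoint, and input directory were all found.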

See Google Colab demo.

Training

single GPU training
multiple GPU training

Train with single GPU

python tools/train.py ${CONFIG_FILE}

Train with multiple GPUs

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

For instance, training on CityPersons using a single GPU:

python tools/train.py configs/elephant/cityperson/cascade_hrnet.py

Training on CityPersons using multiple (7 in this case) GPUs:

./tools/dist_train.sh configs/elephant/cityperson/cascade_hrnet.py 7
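The choice between the two training commands above depends only on the number of GPUs. As a rough sketch, the selection could be wrapped as follows; `build_train_command` is a hypothetical helper written for this README, not something shipped with Pedestron, and it only assembles the command rather than running it.

```python
import shlex


def build_train_command(config_file, gpu_num=1):
    """Assemble the training command shown above for the given GPU count.

    Hypothetical helper: one GPU uses tools/train.py, several GPUs use
    tools/dist_train.sh with the GPU count as the second argument.
    """
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    if gpu_num == 1:
        return ["python", "tools/train.py", config_file]
    return ["./tools/dist_train.sh", config_file, str(gpu_num)]


cmd = build_train_command("configs/elephant/cityperson/cascade_hrnet.py", gpu_num=7)
print(shlex.join(cmd))
# ./tools/dist_train.sh configs/elephant/cityperson/cascade_hrnet.py 7
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root if you want to launch training from Python.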

Testing

single GPU testing
multiple GPU testing

Tests can be run using the following command:

python ./tools/TEST_SCRIPT_TO_RUN.py PATH_TO_CONFIG_FILE ./models_pretrained/epoch_ start end\
 --out Output_filename --mean_teacher

For example, inference on CityPersons can be done in the following way:

Download the pretrained CityPersons model and place it in the folder "models_pretrained/".
Run the following command:

python ./tools/test_city_person.py configs/elephant/cityperson/cascade_hrnet.py ./models_pretrained/epoch_ 5 6\
 --out result_citypersons.json --mean_teacher

Alternatively, for EuroCity Persons:

python ./tools/test_euroCity.py configs/elephant/eurocity/cascade_hrnet.py ./models_pretrained/epoch_ 147 148 --mean_teacher

Or, without the mean_teacher flag, for MGAN:

python ./tools/test_city_person.py configs/elephant/cityperson/mgan_vgg.py ./models_pretrained/epoch_ 1 2\
 --out result_citypersons.json
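The test commands above pass a checkpoint prefix (`./models_pretrained/epoch_`) followed by a start and an end epoch instead of a single checkpoint file. Judging from these examples and from the demo checkpoint name `epoch_5.pth.stu`, the scripts appear to evaluate the epochs in the half-open range from start up to (but not including) end, and to load the `.pth.stu` file when `--mean_teacher` is set. The sketch below illustrates that assumed naming scheme; `expand_checkpoints` is a hypothetical helper, and the actual behaviour should be confirmed against the test scripts.

```python
def expand_checkpoints(prefix, start, end, mean_teacher=False):
    """List the checkpoint files a test run is assumed to load.

    Assumption (inferred from the README commands, not verified against the
    scripts): epochs start..end-1 are evaluated, and the mean-teacher weights
    are stored with a '.pth.stu' suffix instead of '.pth'.
    """
    suffix = ".pth.stu" if mean_teacher else ".pth"
    return [f"{prefix}{epoch}{suffix}" for epoch in range(start, end)]


# CityPersons example from above: epochs "5 6" with --mean_teacher.
print(expand_checkpoints("./models_pretrained/epoch_", 5, 6, mean_teacher=True))
# ['./models_pretrained/epoch_5.pth.stu']

# MGAN example from above: epochs "1 2" without --mean_teacher.
print(expand_checkpoints("./models_pretrained/epoch_", 1, 2))
# ['./models_pretrained/epoch_1.pth']
```

Under this reading, passing `5 6` evaluates only epoch 5, which matches the single pre-trained checkpoint placed in models_pretrained/.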

Testing with multiple GPUs on CrowdHuman

./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

./tools/dist_test.sh configs/elephant/crowdhuman/cascade_hrnet.py ./models_pretrained/epoch_19.pth.stu 8 --out CrowdHuman12.pkl --eval bbox

Similarly, change the respective paths for EuroCity Persons.
For Caltech, refer to the Datasets preparation file.

Please cite the following work

CVPR2021

@InProceedings{Hasan_2021_CVPR,
    author    = {Hasan, Irtiza and Liao, Shengcai and Li, Jinpeng and Akram, Saad Ullah and Shao, Ling},
    title     = {Generalizable Pedestrian Detection: The Elephant in the Room},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {11328-11337}
}


Owner

Irtiza Hasan
