[ACM MM 2021] Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)

Last update: Dec 13, 2022

Related tags

Overview

Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation) [arXiv] [paper]

@inproceedings{hou2021multiview,
  title={Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)},
  author={Hou, Yunzhong and Zheng, Liang},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia (MM ’21)},
  year={2021}
}

Overview

We release the PyTorch code for MVDeTr, a state-of-the-art multiview pedestrian detector. Its superior performance should be credited to transformer architectures, updated loss terms, and view-coherent data augmentations. Moreover, MVDeTr is also very efficient and can be trained on a single RTX 2080TI. This repo also includes a simplified version of MVDet, which also runs on a single RTX 2080TI.

MVDeTr Code

This repo is dedicated to the code for MVDeTr.

Dependencies

This code uses the following libraries

python
pytorch & tochvision
numpy
matplotlib
pillow
opencv-python
kornia

Data Preparation

By default, all datasets are in ~/Data/. We use MultiviewX and Wildtrack in this project.

Your ~/Data/ folder should look like this

Data
├── MultiviewX/
│   └── ...
└── Wildtrack/ 
    └── ...

Code Preparation

Before running the code, one should go to multiview_detector/models/ops and run bash mask.sh to build the deformable transformer (forked from Deformable DETR).

Training

In order to train classifiers, please run the following,

python main.py -d wildtrack
python main.py -d multiviewx

This should automatically return evaluation results similar to the reported 91.5% MODA on Wildtrack dataset and 93.7% MODA on MultiviewX dataset.

Architectures

This repo supports multiple architecture variants. For MVDeTr, please specify --world_feat deform_trans; for a similar fully convolutional architecture like MVDet, please specify --world_feat conv.

Loss terms

This repo supports multiple loss terms. For the focal loss variant as in MVDeTr, please specify --use_mse 0; for the MSE loss as in MVDet, please specify ----use_mse 1.

Augmentations

This repo includes support for view coherent data augmentation, which applies affine transformations onto the per-view inputs, and then invert the per-view feature maps to maintain multiview coherency.

Pre-trained models

You can download the checkpoints at this link.

[ACM MM 2021] Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation)

Related tags

Overview

Multiview Detection with Shadow Transformer (and View-Coherent Data Augmentation) [arXiv] [paper]

Overview

Content

MVDeTr Code

Dependencies

Data Preparation

Code Preparation

Training

Architectures

Loss terms

Augmentations

Pre-trained models

Owner

Yunzhong Hou

Original Implementation of Prompt Tuning from Lester, et al, 2021

Контрольная работа по математическим методам машинного обучения

Codes accompanying the paper "Learning Nearly Decomposable Value Functions with Communication Minimization" (ICLR 2020)

An implementation on "Curved-Voxel Clustering for Accurate Segmentation of 3D LiDAR Point Clouds with Real-Time Performance"

Deep Face Recognition in PyTorch

Implementation of " SESS: Self-Ensembling Semi-Supervised 3D Object Detection" (CVPR2020 Oral)

Learning-Augmented Dynamic Power Management

A library for answering questions using data you cannot see

Style transfer, deep learning, feature transform

Annotate with anyone, anywhere.

The world's simplest facial recognition api for Python and the command line

Virtual hand gesture mouse using a webcam

Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

Code repo for "Transformer on a Diet" paper

quantize aware training package for NCNN on pytorch

Adaptive Prototype Learning and Allocation for Few-Shot Segmentation (CVPR 2021)

HashNeRF-pytorch - Pure PyTorch Implementation of NVIDIA paper on Instant Training of Neural Graphics primitives

CasualHealthcare's Pneumonia detection with Artificial Intelligence (Convolutional Neural Network)

ElegantRL is featured with lightweight, efficient and stable, for researchers and practitioners.

Code for Environment Dynamics Decomposition (ED2).