[ICCV 2021 Oral] Deep Evidential Action Recognition

Overview

DEAR (Deep Evidential Action Recognition)

Project | Paper & Supp

Wentao Bao, Qi Yu, Yu Kong

International Conference on Computer Vision (ICCV Oral), 2021.

Table of Contents

  1. Introduction
  2. Installation
  3. Datasets
  4. Testing
  5. Training
  6. Model Zoo
  7. Citation

Introduction

We propose the Deep Evidential Action Recognition (DEAR) method to recognize actions in an open world. Specifically, we formulate the action recognition problem from the evidential deep learning (EDL) perspective and propose a novel model calibration method to regularize the EDL training. Besides, to mitigate the static bias of video representation, we propose a plug-and-play module to debias the learned representation through contrastive learning. Our DEAR model trained on UCF-101 dataset achieves significant and consistent performance gains based on multiple action recognition models, i.e., I3D, TSM, SlowFast, TPN, with HMDB-51 or MiT-v2 dataset as the unknown.

Demo

The following figures show the inference results by the SlowFast + DEAR model trained on UCF-101 dataset.

UCF-101
(Known)

1 2 3 4

HMDB-51
(Unknown)

6 7 8 10

Installation

This repo is developed from MMAction2 codebase. Since MMAction2 is updated in a fast pace, most of the requirements and installation steps are similar to the version MMAction2 v0.9.0.

Requirements and Dependencies

Here we only list our used requirements and dependencies. It would be great if you can work around with the latest versions of the listed softwares and hardwares on the latest MMAction2 codebase.

  • Linux: Ubuntu 18.04 LTS
  • GPU: GeForce RTX 3090, A100-SXM4
  • CUDA: 11.0
  • GCC: 7.5
  • Python: 3.7.9
  • Anaconda: 4.9.2
  • PyTorch: 1.7.1+cu110
  • TorchVision: 0.8.2+cu110
  • OpenCV: 4.4.0
  • MMCV: 1.2.1
  • MMAction2: 0.9.0

Installation Steps

The following steps are modified from MMAction2 (v0.9.0) installation document. If you encountered problems, you may refer to more details in the official document, or raise an issue in this repo.

a. Create a conda virtual environment of this repo, and activate it:

conda create -n mmaction python=3.7 -y
conda activate mmaction

b. Install PyTorch and TorchVision following the official instructions, e.g.,

conda install pytorch=1.7.1 cudatoolkit=11.0 torchvision=0.8.2 -c pytorch

c. Install mmcv, we recommend you to install the pre-build mmcv as below.

pip install mmcv-full==1.2.1 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html

Important: If you have already installed mmcv and try to install mmcv-full, you have to uninstall mmcv first by running pip uninstall mmcv. Otherwise, there will be ModuleNotFoundError.

d. Clone the source code of this repo:

git clone https://github.com/Cogito2012/DEAR.git mmaction2
cd mmaction2

e. Install build requirements and then install DEAR.

pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

If no error appears in your installation steps, then you are all set!

Datasets

This repo uses standard video action datasets, i.e., UCF-101 for closed set training, and HMDB-51 and MiT-v2 test sets as two different unknowns. Please refer to the default MMAction2 dataset setup steps to setup these three datasets correctly.

Note: You can just ignore the Step 3. Extract RGB and Flow in the referred setup steps since all codes related to our paper do not rely on extracted frames and optical flow. This will save you large amount of disk space!

Testing

To test our pre-trained models (see the Model Zoo), you need to download a model file and unzip it under work_dir. Let's take the I3D-based DEAR model as an example. First, download the pre-trained I3D-based models, where the full DEAR model is saved in the folder finetune_ucf101_i3d_edlnokl_avuc_debias. The following directory tree is for your reference to place the downloaded files.

work_dirs    
├── i3d
│    ├── finetune_ucf101_i3d_bnn
│    │   └── latest.pth
│    ├── finetune_ucf101_i3d_dnn
│    │   └── latest.pth
│    ├── finetune_ucf101_i3d_edlnokl
│    │   └── latest.pth
│    ├── finetune_ucf101_i3d_edlnokl_avuc_ced
│    │   └── latest.pth
│    ├── finetune_ucf101_i3d_edlnokl_avuc_debias
│    │   └── latest.pth
│    └── finetune_ucf101_i3d_rpl
│        └── latest.pth
├── slowfast
├── tpn_slowonly
└── tsm

a. Closed Set Evaluation.

Top-K accuracy and mean class accuracy will be reported.

cd experiments/i3d
bash evaluate_i3d_edlnokl_avuc_debias_ucf101.sh

b. Get Uncertainty Threshold.

The threshold value of one model will be reported.

cd experiments/i3d
# run the thresholding with BATCH_SIZE=2 on GPU_ID=0
bash run_get_threshold.sh 0 edlnokl_avuc_debias 2

c. Open Set Evaluation and Comparison.

The open set evaluation metrics and openness curves will be reported.

Note: Make sure the threshold values of different models are from the reported results in step b.

cd experiments/i3d
bash run_openness.sh HMDB  # use HMDB-51 test set as the Unknown
bash run_openness.sh MiT  # use MiT-v2 test set as the Unknown

d. Out-of-Distribution Detection.

The uncertainty distribution figure of a specified model will be reported.

cd experiments/i3d
bash run_ood_detection.sh 0 HMDB edlnokl_avuc_debias

e. Draw Open Set Confusion Matrix

The confusion matrix with unknown dataset used will be reported.

cd experiments/i3d
bash run_draw_confmat.sh HMDB  # or MiT

Training

Let's still take the I3D-based DEAR model as an example.

cd experiments/i3d
bash finetune_i3d_edlnokl_avuc_debias_ucf101.sh 0

Since model training is time consuming, we strongly recommend you to run the above training script in a backend way if you are using SSH remote connection.

nohup bash finetune_i3d_edlnokl_avuc_debias_ucf101.sh 0 >train.log 2>&1 &
# monitoring the training status whenever you open a new terminal
tail -f train.log

Visualizing the training curves (losses, accuracies, etc.) on TensorBoard:

cd work_dirs/i3d/finetune_ucf101_i3d_edlnokl_avuc_debias/tf_logs
tensorboard --logdir=./ --port 6008

Then, you will see the generated url address http://localhost:6008. Open this address with your Internet Browser (such as Chrome), you will monitoring the status of training.

If you are using SSH connection to a remote server without monitor, tensorboard visualization can be done on your local machine by manually mapping the SSH port number:

ssh -L 16008:localhost:6008 {your_remote_name}@{your_remote_ip}

Then, you can monitor the tensorboard by the port number 16008 by typing http://localhost:16008 in your browser.

Model Zoo

The pre-trained weights (checkpoints) are available below.

Model Checkpoint Train Config Test Config Open maF1 (%) Open Set AUC (%) Closed Set ACC (%)
I3D + DEAR ckpt train test 77.24 / 69.98 77.08 / 81.54 93.89
TSM + DEAR ckpt train test 84.69 / 70.15 78.65 / 83.92 94.48
TPN + DEAR ckpt train test 81.79 / 71.18 79.23 / 81.80 96.30
SlowFast + DEAR ckpt train test 85.48 / 77.28 82.94 / 86.99 96.48

For other checkpoints of the compared baseline models, please download them in the Google Drive.

Citation

If you find the code useful in your research, please cite:

@inproceedings{BaoICCV2021DEAR,
  author = "Bao, Wentao and Yu, Qi and Kong, Yu",
  title = "Evidential Deep Learning for Open Set Action Recognition",
  booktitle = "International Conference on Computer Vision (ICCV)",
  year = "2021"
}

License

See Apache-2.0 License

Acknowledgement

In addition to the MMAction2 codebase, this repo contains modified codes from:

We sincerely thank the owners of all these great repos!

Owner
Wentao Bao
Ph.D. Student
Wentao Bao
Sudoku solver - A sudoku solver with python

sudoku_solver A sudoku solver What is Sudoku? Sudoku (Japanese: 数独, romanized: s

Sikai Lu 0 May 22, 2022
Code for the paper "On the Power of Edge Independent Graph Models"

Edge Independent Graph Models Code for the paper: "On the Power of Edge Independent Graph Models" Sudhanshu Chanpuriya, Cameron Musco, Konstantinos So

Konstantinos Sotiropoulos 0 Oct 26, 2021
Implement of "Training deep neural networks via direct loss minimization" in PyTorch for 0-1 loss

This is the implementation of "Training deep neural networks via direct loss minimization" published at ICML 2016 in PyTorch. The implementation targe

Cuong Nguyen 1 Jan 18, 2022
利用Tensorflow实现基于CNN的中文短文本分类

Text Classification with CNN 使用卷积神经网络进行中文文本分类 CNN做句子分类的论文可以参看: Convolutional Neural Networks for Sentence Classification 还可以去读dennybritz大牛的博客:Implemen

Jeremiah 4 Nov 08, 2022
wgan, wgan2(improved, gp), infogan, and dcgan implementation in lasagne, keras, pytorch

Generative Adversarial Notebooks Collection of my Generative Adversarial Network implementations Most codes are for python3, most notebooks works on C

tjwei 1.5k Dec 16, 2022
Evaluating deep transfer learning for whole-brain cognitive decoding

Evaluating deep transfer learning for whole-brain cognitive decoding This README file contains the following sections: Project description Repository

Armin Thomas 5 Oct 31, 2022
I3-master-layout - Simple master and stack layout script

Simple master and stack layout script | ------ | ----- | | | | | Ma

Tobias S 18 Dec 05, 2022
Unofficial JAX implementations of Deep Learning models

JAX Models Table of Contents About The Project Getting Started Prerequisites Installation Usage Contributing License Contact About The Project The JAX

107 Jan 05, 2023
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Introduction English | 简体中文 MMAction2 is an open-source toolbox for video understanding based on PyTorch. It is a part of the OpenMMLab project. The m

OpenMMLab 2.7k Jan 07, 2023
Anchor-free Oriented Proposal Generator for Object Detection

Anchor-free Oriented Proposal Generator for Object Detection Gong Cheng, Jiabao Wang, Ke Li, Xingxing Xie, Chunbo Lang, Yanqing Yao, Junwei Han, Intro

jbwang1997 56 Nov 15, 2022
Code for Motion Representations for Articulated Animation paper

Motion Representations for Articulated Animation This repository contains the source code for the CVPR'2021 paper Motion Representations for Articulat

Snap Research 851 Jan 09, 2023
PyTorch implementation of Pointnet2/Pointnet++

Pointnet2/Pointnet++ PyTorch Project Status: Unmaintained. Due to finite time, I have no plans to update this code and I will not be responding to iss

Erik Wijmans 1.2k Dec 29, 2022
Model of an AI powered sign language interpreter.

TEXT AND SPEECH TO SIGN LANGUAGE. A web application which takes in text or live audio speech recording as input, converts and displays the relevant Si

Mark Gatere 4 Mar 30, 2022
A Python package for faster, safer, and simpler ML processes

Bender 🤖 A Python package for faster, safer, and simpler ML processes. Why use bender? Bender will make your machine learning processes, faster, safe

Otovo 6 Dec 13, 2022
Py4fi2nd - Jupyter Notebooks and code for Python for Finance (2nd ed., O'Reilly) by Yves Hilpisch.

Python for Finance (2nd ed., O'Reilly) This repository provides all Python codes and Jupyter Notebooks of the book Python for Finance -- Mastering Dat

Yves Hilpisch 1k Jan 05, 2023
Keras attention models including botnet,CoaT,CoAtNet,CMT,cotnet,halonet,resnest,resnext,resnetd,volo,mlp-mixer,resmlp,gmlp,levit

Keras_cv_attention_models Keras_cv_attention_models Usage Basic Usage Layers Model surgery AotNet ResNetD ResNeXt ResNetQ BotNet VOLO ResNeSt HaloNet

319 Dec 28, 2022
PyTorch implementation of "Image-to-Image Translation Using Conditional Adversarial Networks".

pix2pix-pytorch PyTorch implementation of Image-to-Image Translation Using Conditional Adversarial Networks. Based on pix2pix by Phillip Isola et al.

mrzhu 383 Dec 17, 2022
URIE: Universal Image Enhancementfor Visual Recognition in the Wild

URIE: Universal Image Enhancementfor Visual Recognition in the Wild This is the implementation of the paper "URIE: Universal Image Enhancement for Vis

Taeyoung Son 43 Sep 12, 2022
The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing".

BMC The code for the NSDI'21 paper "BMC: Accelerating Memcached using Safe In-kernel Caching and Pre-stack Processing". BibTex entry available here. B

Orange 383 Dec 16, 2022
An Unbiased Learning To Rank Algorithms (ULTRA) toolbox

Unbiased Learning to Rank Algorithms (ULTRA) This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiment

back 3 Nov 18, 2022