A Comparative Review of Recent Kinect-Based Action Recognition Algorithms (TIP2020, Matlab codes)

Related tags

Deep LearningHDG
Overview

A Comparative Review of Recent Kinect-Based Action Recognition Algorithms

This repo contains:

  • the HDG implementation (Matlab codes) for 'Analysis and Evaluation of Kinect-based Action Recognition Algorithms', and
  • provides the links (google drive) for downloading the algorithms evaluated in our TIP journal and
  • provides direct links (google drive) to download 5 smaller datasets for action recognition research.

1 Introduction

This repository contains the implementation of HDG presented in the following paper:

[1] Lei Wang, 2017. Analysis and Evaluation of Kinect-based Action Recognition Algorithms. Master's thesis. School of Computer Science and Software Engineering, The University of Western Australia. [ArXiv] [BibTex]

[2] Lei Wang, Du Q. Huynh, and Piotr Koniusz. A Comparative Review of Recent Kinect-Based Action Recognition Algorithms. IEEE Transactions on Image Processing, 29: 15-28, 2020. [ArXiv] [BibTex]

We also provide the links for downloading the algorithms/datasets used in our TIP paper.

2 Other algorithms compared in TIP paper

You can download other algorithms we evaluated in TIP paper from the following links:

3 Datasets used in TIP paper

3.1 Five Smaller datasets

3.1.1 Depth+Skeleton

You can directly download the depth+skeleton sequences for the following smaller datasets here:

The above 5 downloaded datasets contain depth + skeleton data, which you can directly use for HDG algorithm in this repo:

  • unzip a dataset, and
  • put the Dataset folder into HDG folder, then
  • extract the features (refer to following sections for more details).

3.1.2 Depth video only

For downloading the UWA3DActivity+UWA3D Multiview Activity II depth only, you can use this link(extraction code: 172h).

For downloading the CAD-60 depth only, please use this link (extraction code: 36wt)

3.2 Big datasets (NTU RGB+D)

For big datasets such as NTU-60 and NTU-120, please refer to this link for the request to download.

4 Run the codes of HDG

This is an implementation based on Rahmani et al.’s paper ‘Real Time Action Recognition Using Histograms of Depth Gradients and Random Decision Forests’ (WACV2014).

To run our new HDG algorithm (which is analysed and compared in our TIP2020 paper):

4.0 A glance of skeleton configuration

To know more detailed information about the skeleton configuration/graph, please refer to the pdf file attached in this repo.

UWAS denotes the skeleton configuration for UWA3D Activity, and UWAW is for UWA3D Multiview Activity II.

4.1 Data preparation

  • Go to the 'Dataset' folder, then go to the 'depth' folder and copy all depth sequence in this folder (should be .mat format and the internal data has the same name 'inDepthVideo').

  • After that go to the 'skeleton' folder, copy all skeleton sequence (the skeleton sequence should also be .mat format and each skeleton sequence has the following dimension: #jointsx3x#frames, here 3 represents x, y and d respectively), the internal data has the same name 'skeletonsequence'.

4.2 Feature extraction and concatenation

  • Go to the 'MATLAB_Codes' folder, run each 'main' in each algorithm folder(in the order of 00, 01, 02 and 03), and then run 'main' in 'feature_concatenating'. You can also run '02' and '03' first and then run '00' and '01', since '00' may need more time for segmenting the foreground (around 6 hours) and '01' is based on the results of '00'.

  • For UWAMultiview dataset, remember to change the video sequence from uint16 to double using im2double before running each main in 00 and 01: in both 00 and 01 folders, in main function line 33 & 17, change depthsequence=actionvolume; to depthsequence=im2double(actionvolume);.

  • For feature concatenating, you can select different combinations of features for classification. There are four features, which are:

    • hod(histogram of depth),
    • hodg(histogram of depth gradients),
    • jmv(joint movement volume features) and
    • jpd(joint position differences features).
  • Remember to change the number of joints and the torso joint ID in the 'main' of '02' and '03' since different datasets have different number of joints and torso joint IDs (refer to the pdf attached in this repo for the skeleton configuration).

    • MSRPairs (3D Action Pairs): 20 joints, torso joint ID is '2';
    • MSRAction3D: 20 joints, torso joint ID is '4';
    • CAD-60: 15 joints, torso joint ID is '3';
    • UWA3D single view dataset (UWA3D Activity): 15 joints, torso joint ID is '9';
    • UWA3D multi view dataset (UWA3D Multiview Activity II): 15 joints, torso joint ID is '3';

4.3 Classification

  • Run 'main' of random decision forests (Lei uses different 'main' for different datasets since different datasets should have different training and testing datasets). In Lei's implementation, half of data are used for training and the remaining half for testing.

    • MSRPairs (3D Action Pairs): msrpairsmain.m
    • MSRAction3D: msr3dmain.m
    • CAD-60: cadmain.m
    • UWA3D single view (UWA3D Activity): uwasinglemain.m
    • UWA3D multi view (UWA3D Multiview Activity II): uwamultimain.m

4.4 Visualization (i.e., confusion matrix)

  • The results of the confusion matrix will be saved in the 'Results' folder, and the confusion matrix will be displayed. Moreover, the total accuracy will appear in the workspace of the MATLAB.

4.4.1 Save figures to pdf format

  • saveTightFigure function is downloaded from online resource, which can be used to save the confusion matrix plot as pdf files. The use of this function is, for example: saveTightFigure(gcf, 'uwamultiview.pdf');

Codes for parameters evaluation, and running over all possible combinations of selecting half subjects (for training) are not provided in this repo.

For more information, please refer to my research report and our journal paper, or contact me.

5 Citations

You can cite the following papers for the use of this work:

@mastersthesis{lei_thesis_2017,
  author       = {Lei Wang}, 
  title        = {Analysis and Evaluation of {K}inect-based Action Recognition Algorithms},
  school       = {School of the Computer Science and Software Engineering, The University of Western Australia},
  year         = 2017,
  month        = {Nov}
}
@article{lei_tip_2019,
author={Lei Wang and Du Q. Huynh and Piotr Koniusz},
journal={IEEE Transactions on Image Processing},
title={A Comparative Review of Recent Kinect-Based Action Recognition Algorithms},
year={2020},
volume={29},
number={},
pages={15-28},
doi={10.1109/TIP.2019.2925285},
ISSN={1941-0042},
month={},}

Acknowledgments

I am grateful to Associate Professor Du Huynh for her valuable suggestions and discussions. We would like to thank the authors of HON4D, HOPC, LARP-SO, HPM+TM, IndRNN and ST-GCN for making their codes publicly available. We thank the ROSE Lab of Nanyang Technological University(NTU), Singapore, for making the NTU RGB+D dataset freely accessible.

Owner
Lei Wang
PhD student, Machine Learning/Computer Vision Researcher
Lei Wang
Image-to-Image Translation in PyTorch

CycleGAN and pix2pix in PyTorch New: Please check out contrastive-unpaired-translation (CUT), our new unpaired image-to-image translation model that e

Jun-Yan Zhu 19k Jan 07, 2023
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. It is a large, multi-sourced, diverse dataset for product attr

Google Research Datasets 89 Jan 08, 2023
PyTorch implementation of spectral graph ConvNets, NIPS’16

Graph ConvNets in PyTorch October 15, 2017 Xavier Bresson http://www.ntu.edu.sg/home/xbresson https://github.com/xbresson https://twitter.com/xbresson

Xavier Bresson 287 Jan 04, 2023
Code for "PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation" CVPR 2019 oral

Good news! We release a clean version of PVNet: clean-pvnet, including how to train the PVNet on the custom dataset. Use PVNet with a detector. The tr

ZJU3DV 722 Dec 27, 2022
COIN the currently largest dataset for comprehensive instruction video analysis.

COIN Dataset COIN is the currently largest dataset for comprehensive instruction video analysis. It contains 11,827 videos of 180 different tasks (i.e

86 Dec 28, 2022
Semantic Segmentation in Pytorch. Network include: FCN、FCN_ResNet、SegNet、UNet、BiSeNet、BiSeNetV2、PSPNet、DeepLabv3_plus、 HRNet、DDRNet

🚀 If it helps you, click a star! ⭐ Update log 2020.12.10 Project structure adjustment, the previous code has been deleted, the adjustment will be re-

Deeachain 269 Jan 04, 2023
Removing Inter-Experimental Variability from Functional Data in Systems Neuroscience

Removing Inter-Experimental Variability from Functional Data in Systems Neuroscience This repository is the official implementation of [https://www.bi

Eulerlab 6 Oct 09, 2022
FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows

FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows.

Meta Incubator 272 Jan 02, 2023
Improving Contrastive Learning by Visualizing Feature Transformation, ICCV 2021 Oral

Improving Contrastive Learning by Visualizing Feature Transformation This project hosts the codes, models and visualization tools for the paper: Impro

Bingchen Zhao 83 Dec 15, 2022
Introduction to CPM

CPM CPM is an open-source program on large-scale pre-trained models, which is conducted by Beijing Academy of Artificial Intelligence and Tsinghua Uni

Tsinghua AI 136 Dec 23, 2022
Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

CSRA This is the official code of ICCV 2021 paper: Residual Attention: A Simple But Effective Method for Multi-Label Recoginition Demo, Train and Vali

163 Dec 22, 2022
Spatial Action Maps for Mobile Manipulation (RSS 2020)

spatial-action-maps Update: Please see our new spatial-intention-maps repository, which extends this work to multi-agent settings. It contains many ne

Jimmy Wu 27 Nov 30, 2022
A python module for scientific analysis of 3D objects based on VTK and Numpy

A lightweight and powerful python module for scientific analysis and visualization of 3d objects.

Marco Musy 1.5k Jan 06, 2023
Near-Duplicate Video Retrieval with Deep Metric Learning

Near-Duplicate Video Retrieval with Deep Metric Learning This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retr

2 Jan 24, 2022
Mouse Brain in the Model Zoo

Deep Neural Mouse Brain Modeling This is the repository for the ongoing deep neural mouse modeling project, an attempt to characterize the representat

Colin Conwell 15 Aug 22, 2022
Music library streaming app written in Flask & VueJS

djtaytay This is a little toy app made to explore Vue, brush up on my Python, and make a remote music collection accessable through a web interface. I

Ryan Tasson 6 May 27, 2022
PyTorch code for 'Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning'

Efficient Single Image Super-Resolution Using Dual Path Connections with Multiple Scale Learning This repository is for EMSRDPN introduced in the foll

7 Feb 10, 2022
Code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning".

0. Introduction This repository contains the source code for our SIGCOMM'21 paper "Network Planning with Deep Reinforcement Learning". Notes The netwo

NetX Group 68 Nov 24, 2022
Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track (SIGIR 2021 Full Paper).

Optimizing Dense Retrieval Model Training with Hard Negatives Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma This repo provi

Jingtao Zhan 99 Dec 27, 2022
Lightweight mmm - Lightweight (Bayesian) Media Mix Model

Lightweight (Bayesian) Media Mix Model This is not an official Google product. L

Google 342 Jan 03, 2023