CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

Overview

CZU-MHAD: A multimodal dataset for human action recognition utilizing a depth camera and 10 wearable inertial sensors

  In order to facilitate the research of multi-modal sensor fusion for human action recognition, this paper provides a multi-modal human action dataset using Kinect depth camera and multile wearable sensors, which is called Changzhou University multi-modal human action dataset (CZU-MHAD). Our dataset contains more wearable sensors, which aims to obtain the position data of human skeleton joints, as well as 3-axis acceleration and 3-axis angular velocity data of corresponding joints. Our dataset provides time synchronous depth video, skeleton joint position, 3-axis acceleration and 3-axis angular velocity data to describe a complete human action.

1. Sensors

  The CZU-MHAD uses 1 Microsoft Kinect V2 and 10 wearable sensors MPU9250. These two kinds of sensors are widely used, which have the characteristics of low power consumption, low cost and simple operation. In addition, it does not require too much computing power to process the data collected by the two kind sensors in real time.

1.1 Kinect v2

  The above picture is the Microsoft Kinect V2, which can collect both color and depth images at a sampling frequency of 30 frames per second. Kinect SDK is a software package provided by Microsoft, which can be used to track 25 skeleton joint points and their 3D spatial positions. You can download the Kinect SDK in https://www.microsoft.com/en-us/download/details.aspx?id=44561.

  The above image shows 25 skeleton joint points of the human body that Kinect V2 can track.

1.2 MPU9250

  The MPU9250 can capture 3-axis acceleration, 3-axis angular velocity and 3-axis magnetic intensity.

  • The measurement range of MPU9250:
    • the measurement range of accelerometer is ±16g;
    • the measurement range of angular velocity of the gyroscope is ±2000 degrees/second.

  CZU-MHAD uses Raspberry PI to interact with MPU9250 through the integrated circuit bus (IIC) interface, realizing the functions of reading, saving and uploading MPU9250 sensor data to the server.The connection between Raspberry PI and MPU9250 is shown in picture.

  You can visit https://projects.raspberrypi.org/en/projects/raspberry-pi-setting-up to learn more about Raspberry PI.

2. Data Acquisition System Architecture

  This section introduces the data acquisition system of CZU-MHAD dataset. CZU-MHAD uses Kinect V2 sensor to collect depth image and joint position data, and uses MPU9250 sensor to collect 3-axis acceleration data and 3-axis angular velocity data. In order to collect the 3-axis acceleration data and the 3-axis angular velocity data of the whole body, a motion data acquisition system including 10 MPU9250 sensors is built-in this paper. The sampling system architecture is shown in following picture.

  The MPU9250 sensor is controlled by Raspberry PI, Kinect V2 is controlled by a notebook computer, and time synchronization with a NTP server is carried out every time data is collected. After considering the sampling scheme of MHAD and UTD-MHAD, the position of wearable sensors is determined as shown in the following picture.

  The points marked in red in the figure are the positions of inertial sensors, the left in the figure is the left side of the human body, and the right in the figure is the right side of the human body.

3. Information for "CZU-MHAD" dataset.

  The CZU-MHAD dataset contains 22 actions performed by 5 subjects (5 males). Each subject repeated each action >8 times. The CZU-MHAD dataset contains a total of >880 samples. The 22 actions performed are listed in Table. It can be seen that CZU-MHAD includes common gestures (such as Draw fork, Draw circle),daily activities (such as Sur Place, Clap, Bend down), and training actions (such as Left body turning movement, Left lateral movement).

Describe different actions in English:

ID Action name ID Action name ID Action name ID Action name
1 Right high wave 7 Draw fork with right hand 13 Right foot kick side 19 Left body turning movement
2 Left high wave 8 Draw fork with left hand 14 Left foot kick side 20 Right body turning movement
3 Right horizontal wave 9 Draw circle with right hand 15 Clap 21 Left lateral movement
4 Left horizontal wave 10 Draw circle with left hand 16 Bend down 22 Right lateral movement
5 Hammer with right hand 11 Right foot kick foward 17 Wave up and down
6 Grasp with right hand 12 Left foot kick foward 18 Sur Place

Describe different actions in Chinese::

ID Action name ID Action name ID Action name ID Action name
1 右高挥手 7 右手画× 13 右脚侧踢 19 左体转
2 左高挥手 8 左手画× 14 左脚侧踢 20 右体转
3 右水平挥手 9 右手画○ 15 拍手 21 左体侧
4 左水平挥手 10 左手画○ 16 弯腰 22 右体侧
5 锤(右手) 11 右脚前踢 17 上下挥手
6 抓(右手) 12 左脚前踢 18 原地踏步

4. How to download the dataset

   We offer one way to download our CZU-MHAD dataset:

  1. BaiduDisk(百度网盘)

    (Link) 链接:https://pan.baidu.com/s/1SBy0D2f1ZoX_mDyd3YEp2Q
    (Code) 提取码:qsq1

  In the CZU-MHAD, you will see three subfolders:

  • depth_mat

       The depth_mat contains the depth images captured by Kinect V2. In this folder, each file represents an action sample. Each file is named by the subject's name, the category label of the action and the time of each action of each subject. Take cyy_a1_t1.mat as an example, cyy is the subject's name, a1 is the name of the action, t1 stands the first time to perform this action. How to read data is shown in our sample code.

  • sensors_mat

       The sensors_mat contains the data of 3-axis acceleration and 3-axis angular velocity captured by MPU9250. In this folder, each file represents an action sample. Each file is named by the subject's name, the category label of the action and the time of each action of each subject. Take cyy_a1_t1.mat as an example, cyy is the subject's name, a1 is the name of the action, t1 stands the first time to perform this action. How to read data is shown in our sample code.

  • skeleton_mat

       The skeleton_mat contains the position data of skeleton joint points captured by Kinect V2. In this folder, each file represents an action sample. Each file is named by the subject's name, the category label of the action and the time of each action of each subject. Take cyy_a1_t1.mat as an example, cyy is the subject's name, a1 is the name of the action, t1 stands the first time to perform this action. How to read data is shown in our sample code.

5. Sample codes

  1. BaiduDisk(百度网盘)

    (Link) 链接:https://pan.baidu.com/s/1bWq7ypygjTffkor1GAExMQ

    (Code) 提取码:limf

6. Citation

To use our dataset, please refer to the following paper:

  • Mo Yujian, Hou Zhenjie, Chang Xingzhi, Liang Jiuzhen, Chen Chen, Huan Juan. Structural feature representation and fusion of behavior recognition oriented human spatial cooperative motion[J]. Journal of Beijing University of Aeronautics and Astronautics,2019,(12):2495-2505.

7. Mailing List

  If you are interested to recieve news, updates, and future events about this dataset, please email me.

#. Thanks(致谢)

  1. Cui Yaoyao(崔瑶瑶)
  2. Chao Xin(巢新)
  3. Qin Yinhua(秦银华)
  4. Zhang Yuheng(张宇恒)
  5. Mo Yujian(莫宇剑)

#. Gao Liang(高亮)

#. Shi Yuhang(石宇航)

  The subjects marked with '#' also participated in our data collection process. However, due to the unstable power supply and abnormal heat dissipation of Raspberry PI, their behavior data is abnormal. Therefore, we do not provide their data.

You might also like...
Official PyTorch implementation of
Official PyTorch implementation of "IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos", CVPRW 2021

IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos Introduction This repo is official PyTorch implementatio

Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX

ONNX msg_chn_wacv20 depth completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20 model in

Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in Tensorflow Lite.
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in Tensorflow Lite.

TFLite-msg_chn_wacv20-depth-completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model

Info and sample codes for "NTU RGB+D Action Recognition Dataset"

"NTU RGB+D" Action Recognition Dataset "NTU RGB+D 120" Action Recognition Dataset "NTU RGB+D" is a large-scale dataset for human action recognition. I

LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping
LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping

LVI-SAM This repository contains code for a lidar-visual-inertial odometry and mapping system, which combines the advantages of LIO-SAM and Vins-Mono

A real-time motion capture system that estimates poses and global translations using only 6 inertial measurement units
A real-time motion capture system that estimates poses and global translations using only 6 inertial measurement units

TransPose Code for our SIGGRAPH 2021 paper "TransPose: Real-time 3D Human Translation and Pose Estimation with Six Inertial Sensors". This repository

 COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping
COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping

COVINS -- A Framework for Collaborative Visual-Inertial SLAM and Multi-Agent 3D Mapping Version 1.0 COVINS is an accurate, scalable, and versatile vis

Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models
Monocular Depth Estimation - Weighted-average prediction from multiple pre-trained depth estimation models

merged_depth runs (1) AdaBins, (2) DiverseDepth, (3) MiDaS, (4) SGDepth, and (5) Monodepth2, and calculates a weighted-average per-pixel absolute dept

The implemention of Video Depth Estimation by Fusing Flow-to-Depth Proposals

Flow-to-depth (FDNet) video-depth-estimation This is the implementation of paper Video Depth Estimation by Fusing Flow-to-Depth Proposals Jiaxin Xie,

Releases(skeleton)
Owner
yujmo
帅气,阳光,灿烂,美丽,大方
yujmo
PyTorch implementation of Constrained Policy Optimization

PyTorch implementation of Constrained Policy Optimization (CPO) This repository has a simple to understand and use implementation of CPO in PyTorch. A

Sapana Chaudhary 25 Dec 08, 2022
ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis

ImageBART NeurIPS 2021 Patrick Esser*, Robin Rombach*, Andreas Blattmann*, Björn Ommer * equal contribution arXiv | BibTeX | Poster Requirements A sui

CompVis Heidelberg 110 Jan 01, 2023
ETMO: Evolutionary Transfer Multiobjective Optimization

ETMO: Evolutionary Transfer Multiobjective Optimization To promote the research on ETMO, benchmark problems are of great importance to ETMO algorithm

Songbai Liu 0 Mar 16, 2021
An image processing project uses Viola-jones technique to detect faces and then use SIFT algorithm for recognition.

Attendance_System An image processing project uses Viola-jones technique to detect faces and then use LPB algorithm for recognition. Face Detection Us

8 Jan 11, 2022
KwaiRec: A Fully-observed Dataset for Recommender Systems (Density: Almost 100%)

KuaiRec: A Fully-observed Dataset for Recommender Systems (Density: Almost 100%) KuaiRec is a real-world dataset collected from the recommendation log

Chongming GAO (高崇铭) 70 Dec 28, 2022
Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script.

clip-text-decoder Generate text captions for images from their CLIP embeddings. Includes PyTorch model code and example training script. Example Predi

Frank Odom 36 Dec 21, 2022
Find the Heart simple Python Game

This is a simple Python game for finding a heart emoji. There is a 3 x 3 matrix in which a heart emoji resides. The location of the heart is randomized and is not revealed. The player must guess the

p.katekomol 1 Jan 24, 2022
Management Dashboard for Torchserve

Torchserve Dashboard Torchserve Dashboard using Streamlit Related blog post Usage Additional Requirement: torchserve (recommended:v0.5.2) Simply run:

Ceyda Cinarel 103 Dec 10, 2022
OpenMMLab 3D Human Parametric Model Toolbox and Benchmark

Introduction English | 简体中文 MMHuman3D is an open source PyTorch-based codebase for the use of 3D human parametric models in computer vision and comput

OpenMMLab 782 Jan 04, 2023
Deep Learning for Human Part Discovery in Images - Chainer implementation

Deep Learning for Human Part Discovery in Images - Chainer implementation NOTE: This is not official implementation. Original paper is Deep Learning f

Shintaro Shiba 63 Sep 25, 2022
Dynamic Slimmable Network (CVPR 2021, Oral)

Dynamic Slimmable Network (DS-Net) This repository contains PyTorch code of our paper: Dynamic Slimmable Network (CVPR 2021 Oral). Architecture of DS-

Changlin Li 197 Dec 09, 2022
PyTorch code for Composing Partial Differential Equations with Physics-Aware Neural Networks

FInite volume Neural Network (FINN) This repository contains the PyTorch code for models, training, and testing, and Python code for data generation t

Cognitive Modeling 20 Dec 18, 2022
A Pytorch implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE

SMU_pytorch A Pytorch Implementation of SMU: SMOOTH ACTIVATION FUNCTION FOR DEEP NETWORKS USING SMOOTHING MAXIMUM TECHNIQUE arXiv https://arxiv.org/ab

Fuhang 36 Dec 24, 2022
Galileo library for large scale graph training by JD

近年来,图计算在搜索、推荐和风控等场景中获得显著的效果,但也面临超大规模异构图训练,与现有的深度学习框架Tensorflow和PyTorch结合等难题。 Galileo(伽利略)是一个图深度学习框架,具备超大规模、易使用、易扩展、高性能、双后端等优点,旨在解决超大规模图算法在工业级场景的落地难题,提

JD Galileo Team 128 Nov 29, 2022
EMNLP 2020 - Summarizing Text on Any Aspects

Summarizing Text on Any Aspects This repo contains preliminary code of the following paper: Summarizing Text on Any Aspects: A Knowledge-Informed Weak

Bowen Tan 35 Nov 14, 2022
Residual Dense Net De-Interlace Filter (RDNDIF)

Residual Dense Net De-Interlace Filter (RDNDIF) Work in progress deep de-interlacer filter. It is based on the architecture proposed by Bernasconi et

Louis 7 Feb 15, 2022
Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021]

Moiré Attack (MA): A New Potential Risk of Screen Photos [NeurIPS 2021] This repository is the official implementation of Moiré Attack (MA): A New Pot

Dantong Niu 22 Dec 24, 2022
Open source code for the paper of Neural Sparse Voxel Fields.

Neural Sparse Voxel Fields (NSVF) Project Page | Video | Paper | Data Photo-realistic free-viewpoint rendering of real-world scenes using classical co

Meta Research 647 Dec 27, 2022
AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models

AI-UPV at IberLEF-2021 EXIST task: Sexism Prediction in Spanish and English Tweets Using Monolingual and Multilingual BERT and Ensemble Models Descrip

Angel de Paula 1 Jun 08, 2022
Official PyTorch Implementation of "Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs". NeurIPS 2020.

Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs This repository is the implementation of SELAR. Dasol Hwang* , Jinyoung Pa

MLV Lab (Machine Learning and Vision Lab at Korea University) 48 Nov 09, 2022