Multi-Stage Spatial-Temporal Convolutional Neural Network (MS-GCN)

Last update: Nov 29, 2022

This code implements MS-GCN, the skeleton-based action segmentation model from "Automated freezing of gait assessment with marker-based motion capture and multi-stage spatial-temporal graph convolutional neural networks" and "Skeleton-based action segmentation with multi-stage spatial-temporal graph convolutional neural networks" (arXiv 2022, in review).

It was originally developed for freezing of gait (FOG) assessment on a proprietary dataset. Recently, we have also achieved high skeleton-based action segmentation performance on public datasets, e.g. HuGaDB, LARa, PKU-MMD v2, TUG.

Requirements

Tested on Ubuntu 16.04 with PyTorch 1.10.1. Models were trained on an NVIDIA Tesla K80.

The c3d data preparation script requires Biomechanical-Toolkit. For installation instructions, please refer to the following issue.

Content

data_prep/ -- Data preparation scripts.
main.py -- Main script. I suggest running it interactively in an IDE. Provide the dataset and the train/predict action as arguments, e.g. --dataset=fog_example --action=train.
batch_gen.py -- Batch loader.
label_eval.py -- Compute metrics and save prediction results.
model.py -- Train/predict script.
models/ -- Location for saving the trained models.
models/ms_gcn.py -- The MS-GCN model.
models/net_utils/ -- Scripts to partition the graph for the various datasets. For more information about the partitioning, please refer to the section Graph representations. For more information about spatial-temporal graphs, please refer to ST-GCN.
data/ -- Location for the processed datasets. For more information, please refer to the 'FOG' example.
data/signals. -- Scripts for computing the feature representations. Used for datasets that provide spatial features per joint, e.g. FOG, TUG, and PKU-MMD v2. For more information, please refer to the section Graph representations.
results/ -- Location for saving the results.
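The two arguments mentioned for main.py can be illustrated with a small argparse sketch. This is not the repository's parser: only the flag names --dataset and --action and the values train/predict come from this README, and the function name build_parser and the help strings are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the flag names --dataset and --action
    # (with the values train/predict) are taken from the README.
    parser = argparse.ArgumentParser(description="MS-GCN entry point (sketch)")
    parser.add_argument("--dataset", required=True,
                        help="name of a processed dataset, e.g. fog_example")
    parser.add_argument("--action", required=True, choices=["train", "predict"],
                        help="train a model or generate predictions")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run from an interactive session.
args = build_parser().parse_args(["--dataset=fog_example", "--action=train"])
print(args.dataset, args.action)
```

Restricting --action with choices makes a mistyped action fail at parse time rather than partway through a run.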

Data

After processing the dataset (scripts are dataset specific), each processed dataset should be placed in the data folder. We provide an example for a motion capture dataset that is in c3d format. For this particular example, we extract 9 joints in 3D:

data_prep/read_frame.py -- Import the joints and action labels from the c3d and save both in a separate csv.
data_prep/gen_data/ -- Import the csv, construct the input, and save to npy for training. For more information about the input and label shape, please refer to the section Problem statement.

Please refer to the example in data/example/ for more information on how to structure the files for training/prediction.
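As a rough illustration of the csv-to-array step, the sketch below groups each csv row into 9 joints with 3 coordinates. The column layout (one frame per row, x, y, z for each joint in order) and the function name rows_to_frames are assumptions, not the repository's format. The actual scripts save npy files, and the final input shape is defined in the section Problem statement.

```python
import csv
import io

NUM_JOINTS = 9   # from the README: 9 joints in 3D
NUM_COORDS = 3


def rows_to_frames(csv_text):
    """Turn csv rows of 27 floats into a frames x joints x coords nested list.

    Assumed layout (hypothetical): one frame per row, with x, y, z for
    joint 0, then x, y, z for joint 1, and so on.
    """
    frames = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        expected = NUM_JOINTS * NUM_COORDS
        if len(values) != expected:
            raise ValueError(f"expected {expected} values per frame, got {len(values)}")
        frames.append([values[j * NUM_COORDS:(j + 1) * NUM_COORDS]
                       for j in range(NUM_JOINTS)])
    return frames


# Two synthetic frames, 27 values each.
csv_text = "\n".join(",".join(str(f * 100 + i) for i in range(27)) for f in range(2))
frames = rows_to_frames(csv_text)
print(len(frames), len(frames[0]), len(frames[0][0]))
```

A nested list like this could then be converted with numpy.array and written with numpy.save, after rearranging the axes to whatever layout the model expects.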

Pre-trained models

Pre-trained models are provided for HuGaDB, PKU-MMD, and LARa. To reproduce the results from the paper:

Download each dataset from its respective repository.
See the "Data" section for more information on how to prepare the datasets.
Place the pre-trained models in models/, e.g. models/hugadb.
Ensure that the correct graph representation is chosen in models/ms_gcn.py.
Comment out features = get_features(features) in model.py (only for lara and hugadb).
Specify the correct sampling rate, e.g. a downsampling factor of 4 for lara.
Run main.py with the proper arguments to generate the per-sample predictions, e.g. --dataset=hugadb --action=predict.
Run label_eval.py with the proper arguments, e.g. --dataset=hugadb.
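To make the sampling-rate step concrete, here is a minimal sketch of what a downsampling factor of 4 could mean for a sequence of frames, and how coarse predictions might be mapped back to the original length. It assumes plain strided subsampling and repetition. The function names are hypothetical, and the repository may handle this differently.

```python
def downsample(frames, rate):
    """Keep every `rate`-th frame, starting from the first."""
    return frames[::rate]


def upsample_predictions(predictions, rate, num_frames):
    """Repeat each prediction `rate` times and trim to the original length."""
    full = [label for label in predictions for _ in range(rate)]
    return full[:num_frames]


# Ten frames of per-frame labels, downsampled by a factor of 4.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
coarse = downsample(labels, 4)          # keeps frames 0, 4, and 8
restored = upsample_predictions(coarse, 4, len(labels))
print(coarse, restored)
```

Note that action boundaries in the restored sequence can shift by up to rate - 1 frames relative to the original labels, which is one reason the sampling rate has to match the one used for the pre-trained model.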

Acknowledgements

The MS-GCN model and code are heavily based on ST-GCN and MS-TCN. We thank the authors for publicly releasing their code.

License

MIT

Owner

Benjamin Filtjens
