HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR

Overview

HAR-stacked-residual-bidir-LSTM

The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (HAR) using stacked residual bidirectional-LSTM cells (RNN) with TensorFlow.

It resembles to the architecture used in "Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation" without an attention mechanism and with just the encoder part. In fact, we started coding while thinking about applying residual connections to LSTMs - and it is only afterwards that we saw that such a deep LSTM architecture was already being used.

Here, we improve accuracy on the previously used dataset from 91% to 94% and we push the subject further by trying our architecture on another dataset.

Our neural network has been coded to be easy to adapt to new datasets (assuming it is given a fixed, non-dynamic, window of signal for every prediction) and to use different breadth, depth and length by using a new configuration file.

Here is a simplified overview of our architecture:

Simplified view of a "2x2" architecture. We obtain best results with a "3x3" architecture (details below figure).

Bear in mind that the time steps expands to the left for the whole sequence length and that this architecture example is what we call a "2x2" architecture: 2 residual cells as a block stacked 2 times for a total of 4 bidirectional cells, which is in reality 8 unidirectional LSTM cells. We obtain best results with a 3x3 architecture, consisting of 18 LSTM cells.

Neural network's architecture

Mainly, the number of stacked and residual layers can be parametrized easily as well as whether or not bidirectional LSTM cells are to be used. Input data needs to be windowed to an array with one more dimension: the training and testing is never done on full signal lengths and use shuffling with resets of the hidden cells' states.

We are using a deep neural network with stacked LSTM cells as well as residual (highway) LSTM cells for every stacked layer, a little bit like in ResNet, but for RNNs.

Our LSTM cells are also bidirectional in term of how they pass trough the time axis, but differ from classic bidirectional LSTMs by the fact we concatenate their output features rather than adding them in an element-wise fashion. A simple hidden ReLU layer then lowers the dimension of those concatenated features for sending them to the next stacked layer. Bidirectionality can be disabled easily.

Setup

We used TensorFlow 0.11 and Python 2. Sklearn is also used.

The two datasets can be loaded by running python download_datasets.py in the data/ folder.

To preprocess the second dataset (opportunity challenge dataset), the signal submodule of scipy is needed, as well as pandas.

Results using the previous public domain HAR dataset

This dataset named A Public Domain Dataset for Human Activity Recognition Using Smartphones is about classifying the type of movement amongst six categories: (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).

The bests results for a test accuracy of 94% are achieved with the 3x3 bidirectional architecture with a learning rate of 0.001 and an L2 regularization multiplier (weight decay) of 0.005, as seen in the 3x3_result_HAR_6.txt file.

Training and testing can be launched by running the config: python config_dataset_HAR_6_classes.py.

Results from the Opportunity dataset

The neural network has also been tried on the Opportunity dataset to see if the architecture could be easily adapted to a similar task.

Don't miss out this nice video that offers a nice overview and understanding of the dataset.

We obtain a test F1-score of 0.893. Our results can be compared to the state of the art DeepConvLSTM that is used on the same dataset and achieving a test F1-score of 0.9157.

We only used a subset of the full dataset as done in other research in order to simulate the conditions of the competition, using 113 sensor channels and classifying on the 17 categories output (and with the NULL class for a total of 18 classes). The windowing of the series for feeding in our neural network is also the same 24 time steps per classification, on a 30 Hz signal. However, we observed that there was no significant difference between using 128 time steps or 24 time steps (0.891 vs 0.893 F1-score). Our LSTM cells' inner representation is always reset to 0 between series. We also used mean and standard deviation normalization rather than min to max rescaling to rescale features to a zero mean and a standard deviation of 0.5. More details about preprocessing are explained furthermore in their paper. Other details, such as the fact that the classification output is sampled only at the last timestep for the training of the neural network, can be found in their preprocessing script that we adapted in our repository.

The config file can be runned like this: config_dataset_opportunity_18_classes.py. For best results, it is possible to readjust the learning rate such as in the 3x3_result_opportunity_18.txt file.

Citation

The paper is available on arXiv: https://arxiv.org/abs/1708.08989

Here is the BibTeX citation code:

@article{DBLP:journals/corr/abs-1708-08989,
  author    = {Yu Zhao and
               Rennong Yang and
               Guillaume Chevalier and
               Maoguo Gong},
  title     = {Deep Residual Bidir-LSTM for Human Activity Recognition Using Wearable
               Sensors},
  journal   = {CoRR},
  volume    = {abs/1708.08989},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.08989},
  archivePrefix = {arXiv},
  eprint    = {1708.08989},
  timestamp = {Mon, 13 Aug 2018 16:46:48 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1708-08989},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Collaborate with us on similar research projects

Join the slack workspace for time series processing, where you can:

  • Collaborate with us and other researchers on writing more time series processing papers, in the #research channel;
  • Do business with us and other companies for services and products related to time series processing, in the #business channel;
  • Talk about how to do Clean Machine Learning using Neuraxle, in the #neuraxle channel;

Online Course: Learn Deep Learning and Recurrent Neural Networks (DL&RNN)

We have created a course on Deep Learning and Recurrent Neural Networks (DL&RNN). Request an access to the course here. That is the most richly dense and accelerated course out there on this precise topic of DL&RNN.

We've also created another course on how to do Clean Machine Learning with the right design patterns and the right software architecture for your code to evolve correctly to be useable in production environments.

Owner
Guillaume Chevalier
e^(πi) + 1 = 0
Guillaume Chevalier
Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research

Megaverse Megaverse is a new 3D simulation platform for reinforcement learning and embodied AI research. The efficient design of the engine enables ph

Aleksei Petrenko 191 Dec 23, 2022
Bianace Prediction Pytorch Model

Bianace Prediction Pytorch Model Main Results ETHUSDT from 2021-01-01 00:00:00 t

RoyYang 4 Jul 20, 2022
Repositorio de los Laboratorios de Análisis Numérico / Análisis Numérico I de FAMAF, UNC.

Repositorio de los Laboratorios de Análisis Numérico / Análisis Numérico I de FAMAF, UNC. Para los Laboratorios de la materia, vamos a utilizar el len

Luis Biedma 18 Dec 12, 2022
Repository containing detailed experiments related to the paper "Memotion Analysis through the Lens of Joint Embedding".

Memotion Analysis Through The Lens Of Joint Embedding This repository contains the experiments conducted as described in the paper 'Memotion Analysis

Nethra Gunti 1 Mar 16, 2022
ESP32 python application to read data from a Tilt™ Hydrometer for homebrewing

TitlESP32 ESP32 MicroPython application to read and log data from a Tilt™ Hydrometer. Requirements A board with an ESP32 chip USB cable - USB A / micr

IoBeer 5 Dec 01, 2022
Causal estimators for use with WhyNot

WhyNot Estimators A collection of causal inference estimators implemented in Python and R to pair with the Python causal inference library whynot. For

ZYKLS 8 Apr 06, 2022
Little tool in python to watch anime from the terminal (the better way to watch anime)

ani-cli Script working again :), thanks to the fork by Dink4n for the alternative approach to by pass the captcha on gogoanime A cli to browse and wat

Harshith 4.5k Dec 31, 2022
Get a Grip! - A robotic system for remote clinical environments.

Get a Grip! Within clinical environments, sterilization is an essential procedure for disinfecting surgical and medical instruments. For our engineeri

Jay Sharma 1 Jan 05, 2022
Official repository of "DeepMIH: Deep Invertible Network for Multiple Image Hiding", TPAMI 2022.

DeepMIH: Deep Invertible Network for Multiple Image Hiding (TPAMI 2022) This repo is the official code for DeepMIH: Deep Invertible Network for Multip

Junpeng Jing 67 Nov 22, 2022
Pytorch port of Google Research's LEAF Audio paper

leaf-audio-pytorch Pytorch port of Google Research's LEAF Audio paper published at ICLR 2021. This port is not completely finished, but the Leaf() fro

Dennis Fedorishin 80 Oct 31, 2022
CLIPImageClassifier wraps clip image model from transformers

CLIPImageClassifier CLIPImageClassifier wraps clip image model from transformers. CLIPImageClassifier is initialized with the argument classes, these

Jina AI 6 Sep 12, 2022
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers

hierarchical-transformer-1d Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning using 🤗 transformers In Progress!! 2021.

MyungHoon Jin 7 Nov 06, 2022
Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019)

Adaptive Pyramid Context Network for Semantic Segmentation (APCNet CVPR'2019) Introduction Official implementation of Adaptive Pyramid Context Network

21 Nov 09, 2022
Official PyTorch implementation of MX-Font (Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts)

Introduction Pytorch implementation of Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Expert. | paper Song Park1

Clova AI Research 97 Dec 23, 2022
PSANet: Point-wise Spatial Attention Network for Scene Parsing, ECCV2018.

PSANet: Point-wise Spatial Attention Network for Scene Parsing (in construction) by Hengshuang Zhao*, Yi Zhang*, Shu Liu, Jianping Shi, Chen Change Lo

Hengshuang Zhao 217 Oct 30, 2022
Segmentation models with pretrained backbones. PyTorch.

Python library with Neural Networks for Image Segmentation based on PyTorch. The main features of this library are: High level API (just two lines to

Pavel Yakubovskiy 6.6k Jan 06, 2023
PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

PSGAN running with ncnn⚡妆容迁移/仿妆⚡Imitation Makeup/Makeup Transfer⚡

WuJinxuan 144 Dec 26, 2022
DIR-GNN - Discovering Invariant Rationales for Graph Neural Networks

DIR-GNN "Discovering Invariant Rationales for Graph Neural Networks" (ICLR 2022)

Ying-Xin (Shirley) Wu 70 Nov 13, 2022
Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation

Implicit Internal Video Inpainting Implementation for our ICCV2021 paper: Internal Video Inpainting by Implicit Long-range Propagation paper | project

202 Dec 30, 2022
Secure Distributed Training at Scale

Secure Distributed Training at Scale This repository contains the implementation of experiments from the paper "Secure Distributed Training at Scale"

Yandex Research 9 Jul 11, 2022