HAR-stacked-residual-bidir-LSTMs - Deep stacked residual bidirectional LSTMs for HAR

Overview

HAR-stacked-residual-bidir-LSTM

The project is based on this repository which is presented as a tutorial. It consists of Human Activity Recognition (HAR) using stacked residual bidirectional-LSTM cells (RNN) with TensorFlow.

It resembles to the architecture used in "Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation" without an attention mechanism and with just the encoder part. In fact, we started coding while thinking about applying residual connections to LSTMs - and it is only afterwards that we saw that such a deep LSTM architecture was already being used.

Here, we improve accuracy on the previously used dataset from 91% to 94% and we push the subject further by trying our architecture on another dataset.

Our neural network has been coded to be easy to adapt to new datasets (assuming it is given a fixed, non-dynamic, window of signal for every prediction) and to use different breadth, depth and length by using a new configuration file.

Here is a simplified overview of our architecture:

Simplified view of a "2x2" architecture. We obtain best results with a "3x3" architecture (details below figure).

Bear in mind that the time steps expands to the left for the whole sequence length and that this architecture example is what we call a "2x2" architecture: 2 residual cells as a block stacked 2 times for a total of 4 bidirectional cells, which is in reality 8 unidirectional LSTM cells. We obtain best results with a 3x3 architecture, consisting of 18 LSTM cells.

Neural network's architecture

Mainly, the number of stacked and residual layers can be parametrized easily as well as whether or not bidirectional LSTM cells are to be used. Input data needs to be windowed to an array with one more dimension: the training and testing is never done on full signal lengths and use shuffling with resets of the hidden cells' states.

We are using a deep neural network with stacked LSTM cells as well as residual (highway) LSTM cells for every stacked layer, a little bit like in ResNet, but for RNNs.

Our LSTM cells are also bidirectional in term of how they pass trough the time axis, but differ from classic bidirectional LSTMs by the fact we concatenate their output features rather than adding them in an element-wise fashion. A simple hidden ReLU layer then lowers the dimension of those concatenated features for sending them to the next stacked layer. Bidirectionality can be disabled easily.

Setup

We used TensorFlow 0.11 and Python 2. Sklearn is also used.

The two datasets can be loaded by running python download_datasets.py in the data/ folder.

To preprocess the second dataset (opportunity challenge dataset), the signal submodule of scipy is needed, as well as pandas.

Results using the previous public domain HAR dataset

This dataset named A Public Domain Dataset for Human Activity Recognition Using Smartphones is about classifying the type of movement amongst six categories: (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING).

The bests results for a test accuracy of 94% are achieved with the 3x3 bidirectional architecture with a learning rate of 0.001 and an L2 regularization multiplier (weight decay) of 0.005, as seen in the 3x3_result_HAR_6.txt file.

Training and testing can be launched by running the config: python config_dataset_HAR_6_classes.py.

Results from the Opportunity dataset

The neural network has also been tried on the Opportunity dataset to see if the architecture could be easily adapted to a similar task.

Don't miss out this nice video that offers a nice overview and understanding of the dataset.

We obtain a test F1-score of 0.893. Our results can be compared to the state of the art DeepConvLSTM that is used on the same dataset and achieving a test F1-score of 0.9157.

We only used a subset of the full dataset as done in other research in order to simulate the conditions of the competition, using 113 sensor channels and classifying on the 17 categories output (and with the NULL class for a total of 18 classes). The windowing of the series for feeding in our neural network is also the same 24 time steps per classification, on a 30 Hz signal. However, we observed that there was no significant difference between using 128 time steps or 24 time steps (0.891 vs 0.893 F1-score). Our LSTM cells' inner representation is always reset to 0 between series. We also used mean and standard deviation normalization rather than min to max rescaling to rescale features to a zero mean and a standard deviation of 0.5. More details about preprocessing are explained furthermore in their paper. Other details, such as the fact that the classification output is sampled only at the last timestep for the training of the neural network, can be found in their preprocessing script that we adapted in our repository.

The config file can be runned like this: config_dataset_opportunity_18_classes.py. For best results, it is possible to readjust the learning rate such as in the 3x3_result_opportunity_18.txt file.

Citation

The paper is available on arXiv: https://arxiv.org/abs/1708.08989

Here is the BibTeX citation code:

@article{DBLP:journals/corr/abs-1708-08989,
  author    = {Yu Zhao and
               Rennong Yang and
               Guillaume Chevalier and
               Maoguo Gong},
  title     = {Deep Residual Bidir-LSTM for Human Activity Recognition Using Wearable
               Sensors},
  journal   = {CoRR},
  volume    = {abs/1708.08989},
  year      = {2017},
  url       = {http://arxiv.org/abs/1708.08989},
  archivePrefix = {arXiv},
  eprint    = {1708.08989},
  timestamp = {Mon, 13 Aug 2018 16:46:48 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1708-08989},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Collaborate with us on similar research projects

Join the slack workspace for time series processing, where you can:

  • Collaborate with us and other researchers on writing more time series processing papers, in the #research channel;
  • Do business with us and other companies for services and products related to time series processing, in the #business channel;
  • Talk about how to do Clean Machine Learning using Neuraxle, in the #neuraxle channel;

Online Course: Learn Deep Learning and Recurrent Neural Networks (DL&RNN)

We have created a course on Deep Learning and Recurrent Neural Networks (DL&RNN). Request an access to the course here. That is the most richly dense and accelerated course out there on this precise topic of DL&RNN.

We've also created another course on how to do Clean Machine Learning with the right design patterns and the right software architecture for your code to evolve correctly to be useable in production environments.

Owner
Guillaume Chevalier
e^(πi) + 1 = 0
Guillaume Chevalier
Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals.

Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals This repo contains the Pytorch implementation of our paper: Unsupervised Seman

Wouter Van Gansbeke 335 Dec 28, 2022
PAIRED in PyTorch 🔥

PAIRED This codebase provides a PyTorch implementation of Protagonist Antagonist Induced Regret Environment Design (PAIRED), which was first introduce

UCL DARK Lab 46 Dec 12, 2022
CVPR2021 Content-Aware GAN Compression

Content-Aware GAN Compression [ArXiv] Paper accepted to CVPR2021. @inproceedings{liu2021content, title = {Content-Aware GAN Compression}, auth

52 Nov 06, 2022
Examples of using f2py to get high-speed Fortran integrated with Python easily

f2py Examples Simple examples of using f2py to get high-speed Fortran integrated with Python easily. These examples are also useful to troubleshoot pr

Michael 35 Aug 21, 2022
A framework for analyzing computer vision models with simulated data

3DB: A framework for analyzing computer vision models with simulated data Paper Quickstart guide Blog post Installation Follow instructions on: https:

3DB 112 Jan 01, 2023
Vehicle detection using machine learning and computer vision techniques for Udacity's Self-Driving Car Engineer Nanodegree.

Vehicle Detection Video demo Overview Vehicle detection using these machine learning and computer vision techniques. Linear SVM HOG(Histogram of Orien

hata 1.1k Dec 18, 2022
Neural Network Libraries

Neural Network Libraries Neural Network Libraries is a deep learning framework that is intended to be used for research, development and production. W

Sony 2.6k Dec 30, 2022
Meaningful titles for tabs and PDF downloads! Also supports tab search.

arxiv-utils If you are a researcher that reads a lot on ArXiv, you'll benefit a lot from this web extension. Renames the title of PDF page to the pape

Johnson 174 Dec 20, 2022
Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling"

Official Code Release for "TIP-Adapter: Training-free clIP-Adapter for Better Vision-Language Modeling" Pipeline of Tip-Adapter Tip-Adapter can provid

peng gao 187 Dec 28, 2022
Simulation of moving particles under microscopic imaging

Simulation of moving particles under microscopic imaging Install scipy numpy scikit-image tiffile Run python simulation.py Read result https://imagej

Zehao Wang 2 Dec 14, 2021
A universal memory dumper using Frida

Fridump Fridump (v0.1) is an open source memory dumping tool, primarily aimed to penetration testers and developers. Fridump is using the Frida framew

551 Jan 07, 2023
Official PyTorch implementation of the paper "TEMOS: Generating diverse human motions from textual descriptions"

TEMOS: TExt to MOtionS Generating diverse human motions from textual descriptions Description Official PyTorch implementation of the paper "TEMOS: Gen

Mathis Petrovich 187 Dec 27, 2022
Unofficial implementation (replicates paper results!) of MINER: Multiscale Implicit Neural Representations in pytorch-lightning

MINER_pl Unofficial implementation of MINER: Multiscale Implicit Neural Representations in pytorch-lightning. 📖 Ref readings Laplacian pyramid explan

AI葵 51 Nov 28, 2022
Anomaly detection related books, papers, videos, and toolboxes

Anomaly Detection Learning Resources Outlier Detection (also known as Anomaly Detection) is an exciting yet challenging field, which aims to identify

Yue Zhao 6.7k Dec 31, 2022
This repository is an implementation of our NeurIPS 2021 paper (Stylized Dialogue Generation with Multi-Pass Dual Learning) in PyTorch.

MPDL---TODO This repository is an implementation of our NeurIPS 2021 paper (Stylized Dialogue Generation with Multi-Pass Dual Learning) in PyTorch. Ci

CodebaseLi 3 Nov 27, 2022
ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

ActNN : Activation Compressed Training This is the official project repository for ActNN: Reducing Training Memory Footprint via 2-Bit Activation Comp

UC Berkeley RISE 178 Jan 05, 2023
A DCGAN to generate anime faces using custom mined dataset

Anime-Face-GAN-Keras A DCGAN to generate anime faces using custom dataset in Keras. Dataset The dataset is created by crawling anime database websites

Pavitrakumar P 190 Jan 03, 2023
PyTorch implementation for MINE: Continuous-Depth MPI with Neural Radiance Fields

MINE: Continuous-Depth MPI with Neural Radiance Fields Project Page | Video PyTorch implementation for our ICCV 2021 paper. MINE: Towards Continuous D

Zijian Feng 325 Dec 29, 2022
Elevation Mapping on GPU.

Elevation Mapping cupy Overview This is a ros package of elevation mapping on GPU. Code are written in python and uses cupy for GPU calculation. * pla

Robotic Systems Lab - Legged Robotics at ETH Zürich 183 Dec 19, 2022
Parasite: a tool allowing you to compress and decompress files, to reduce their size

🦠 Parasite 🦠 Parasite is a tool written in Python3 allowing you to "compress" any file, reducing its size. ⭐ Features ⭐ + Fast + Good optimization,

Billy 30 Nov 25, 2022