Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition.

Overview

Sign Language Recognition Service

This is a Sign Language Recognition service utilizing a deep learning model with Long Short-Term Memory to perform sign language recognition. The service was developed as a part of a bachelor project at Aalborg University.

alt text

Requirements

  • Python 3.7
  • OpenPose 1.6.0
  • CUDA 10.0
  • cuDNN 7.5.0
  • Numpy 1.18.5
  • OpenCV 4.5.1.48
  • Flask 1.1.2
  • Tensorflow 2.0.0
  • Pandas 1.1.5
  • Tensorboard
  • Matplotlib
  • Seaborn
  • Scikit-Learn

How to use

Installing OpenPose

  1. Please install OpenPose 1.6.0 for Python by following the official guide. Note that the newest release on the OpenPose github is 1.7.0 - for this service to work, 1.6.0 must be used.

    A few things to note when installing OpenPose:

    • When cloning the OpenPose repository, use the following git command to get version 1.6.0:
      git clone --depth 1 --branch v1.6.0 https://github.com/CMU-Perceptual-Computing-Lab/openpose
      
    • Remember to run the following command on the newly cloned repository:
      git submodule update --init --recursive --remote
      
    • Use Visual Studio Enterprise 2017 to build the required files. Install this first if you do not already have it.
    • Install CUDA 10.0 and cuDNN 7.5.0 for CUDA 10.0 after installing Visual Studio Enterprise 2017.
    • When generating the files using CMake, make sure that the BUILD_PYTHON flag is enabled, and that the Python version is set to 3.7. Also make sure that the detected CUDA version is 10.0.
    • After building with Visual Studio Enterprise 2017, make sure that all necessary files have been generated.
      • There should be a openpose.dll in /x64/Release/
      • There should be a openpose.exp and openpose.lib in /src/openpose/Release/
      • There should be a pyopenpose.cp37-win_amd64.pyd in /python/openpose/Release/
  2. Install requirements from requirements.txt

  3. Change the path in main/openpose/paths.py to the path of your OpenPose installation:

    # Change this path so it points to your OpenPose path relative to this file
    OPEN_POSE_PATH = get_relative_path(__file__, '../../../../openpose')
    
  4. If you get any errors related to OpenPose when running the service, please go back and make sure that all instructions have been followed - be particularly careful to install the correct CUDA/cuDNN versions, make sure that the BUILD_PYTHON flag was enabled and that Python 3.7 was used when generating the files.

When OpenPose is successfully installed, you can either use the existing model trained on our dataset, or you can choose to make your own dataset and train a model on this instead.

alt text

Using the service

A singular endpoint '/recognize' has been created in order to perform recognition, which allows for POST requests to be made. The endpoint expects a sequence of base64 images, which will get converted into a suitable format recognizable by the classifier.

alt text

alt text

Creating a custom dataset

In order to create a custom dataset, you can access the file create_dataset.py and change the following constant:

DATASET_NAME = 'dsl_dataset'

Such that the path in the constant DATASET_DIR points to a folder where the dataset is located. This folder should contain another folder called 'src', which contains folders for all the desired labels in the dataset. Each of these folders should contain videos of the corresponding sign.

Before running the script, the following constants can be tweaked based on the desired settings:

WINDOW_LENGTH = 60
STRIDE = 5
BATCH_SIZE = 512
VAL_SPLIT = 0.2
TEST_SPLIT = 0.1

Finally, the following constant can be changed:

CREATE_RAW_DATA = True

This is because initial feature extraction by OpenPose can be a fairly lengthy process. This allows for the tweaking of the dataset after features have been extracted, by setting this to False. Note that the raw OpenPose data must be created before the actual dataset can be created, so it is necessary to do this at least once.

Training a custom model

In order to train a custom model you can make use of the train_models.py file. Here, the constant DATASET_NAME can be changed to reflect the name of the dataset you wish to use, such that the DATASET_DIR points to the correct folder. Furthermore, you can specify a tensorboard directory:

DATASET_NAME = 'dsl_dataset'
DATASET_DIR = f'.\\main\\algorithm\\datasets\\{DATASET_NAME}'
MODELS_DIR = f'.\\main\\algorithm\\models\\{DATASET_NAME}'
TENSORBOARD_DIR = f'{MODELS_DIR}\\logs'

Before running the script, you can tweak various training settings as well as the hyper parameters of the model by changing the following constants:

MODEL_NAME = "model"
EPOCHS = 25
LAYER_SIZES = [64]
DENSE_LAYERS = [0]
DENSE_ACTIVATION = "relu"
LSTM_LAYERS = [2]
LSTM_ACTIVATION = "tanh"
OUTPUT_ACTIVATION = "softmax"

Note that the trainer can train multiple models depending on these settings. Changing the LAYER_SIZES, DENSE_LAYERS and LSTM_LAYERS to contain several values will result in a model being trained for each possible combination.

After training your model, you should change the paths.py located in main/core/ to reflect the path to the new model by changing the constant MODEL_NAME to the name of your model:

MODEL_NAME = 'dsl_lstm.model'

Finally, it also possible to generate a confusion matrix for your model by using the generate_confusion_matrix.py script. Here, you simply change the constants DATASET_NAME and MODEL_NAME such that the DATASET_DIR points to your dataset directory, and MODEL_DIR points to your model directory, respectively:

DATASET_NAME = "dsl_dataset"
MODEL_NAME = "dsl_lstm"
DATASET_DIR = f"./main/algorithm/datasets/{DATASET_NAME}/{DATASET_NAME}.pickle"
MODEL_DIR = f"./main/algorithm/models/{DATASET_NAME}/{MODEL_NAME}"

Happy signing :O)

Authors

  • Adil Cemalovic
  • Martin Lønne
  • Magnus Helleshøj Lund
Owner
Martin Lønne
Full-stack software developer with an interest in Cloud development. Is working most with Javascript, C#, and Python for machine learning.
Martin Lønne
An Implementation of the seglink alogrithm in paper Detecting Oriented Text in Natural Images by Linking Segments

Tips: A more recent scene text detection algorithm: PixelLink, has been implemented here: https://github.com/ZJULearning/pixel_link Contents: Introduc

dengdan 484 Dec 07, 2022
Contextual speed detection for python

Speed Prediction using Optical Flow and 2D CNN About the challenge: Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the s

Mahimana Bhatt 2 Dec 16, 2021
A fastai/PyTorch package for unpaired image-to-image translation.

Unpaired image-to-image translation A fastai/PyTorch package for unpaired image-to-image translation currently with CycleGAN implementation. This is a

Tanishq Abraham 120 Dec 02, 2022
DouZero is a reinforcement learning framework for DouDizhu - 斗地主AI

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI

Kwai 3.1k Jan 05, 2023
A curated list of promising OCR resources

Call for contributor(paper summary,dataset generation,algorithm implementation and any other useful resources) awesome-ocr A curated list of promising

wanghaisheng 1.6k Jan 04, 2023
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
Fast style transfer

faststyle Faststyle aims to provide an easy and modular interface to Image to Image problems based on feature loss. Install Making sure you have a wor

Lucas Vazquez 21 Mar 11, 2022
Code for CVPR'2022 paper ✨ "Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model"

PPE ✨ Repository for our CVPR'2022 paper: Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-

Zipeng Xu 34 Nov 28, 2022
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022
Computer vision applications project (Flask and OpenCV)

Computer Vision Applications Project This project is at it's initial phase. This is all about the implementation of different computer vision techniqu

Suryam Thapa 1 Jan 26, 2022
Code for the paper: Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution

Fusformer Code for the paper: "Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution" Plateform Python 3.8.5 + Pytor

Jin-Fan Hu (胡锦帆) 11 Dec 12, 2022
OCR of Chicago 1909 Renumbering Plan

Requirements: Python 3 (probably at least 3.4) pipenv (pip3 install pipenv) tesseract (brew install tesseract, at least if you have a mac and homebrew

ted whalen 2 Nov 21, 2021
Python Computer Vision Aim Bot for Roblox's Phantom Forces

Python-Phantom-Forces-Aim-Bot Python Computer Vision Aim Bot for Roblox's Phanto

drag0ngam3s 2 Jul 11, 2022
Textboxes_plusplus implementation with Tensorflow (python)

TextBoxes++-TensorFlow TextBoxes++ re-implementation using tensorflow. This project is greatly inspired by slim project And many functions are modifie

81 Dec 07, 2022
Python rubik's cube solver

This program makes a 3D representation of a rubiks cube and solves it step by step.

Pablo QB 4 May 29, 2022
Geometric Augmentation for Text Image

Text Image Augmentation A general geometric augmentation tool for text images in the CVPR 2020 paper "Learn to Augment: Joint Data Augmentation and Ne

Canjie Luo 440 Jan 05, 2023
EAST for ICPR MTWI 2018 Challenge II (Text detection of network images)

EAST_ICPR2018: EAST for ICPR MTWI 2018 Challenge II (Text detection of network images) Introduction This is a repository forked from argman/EAST for t

QichaoWu 49 Dec 24, 2022
CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

CRAFT-Reimplementation Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 . Reimple

453 Dec 28, 2022
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
Volume Control using OpenCV

Gesture-Volume-Control Volume Control using OpenCV Here i made volume control using Python and OpenCV in which we can control the volume of our laptop

Mudit Sinha 3 Oct 10, 2021