PyTorch implementation for Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition.

Overview

Stochastic CSLR

This is the PyTorch implementation for the ECCV 2020 paper: Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition.

Quick Start

1. Installation

pip install git+https://github.com/zheniu/stochastic-cslr

Also, you need to install sclite for evaluation. Take a look at step 2 for instructions.

2. Prepare the dataset

  • Download the RWTH-PHOENIX-2014 dataset here.
  • Unzip it and obtain the path to phoenix-2014-multisigner/ folder for later use.
  • Install sclite for evaluation. Check phoenix-2014-multisigner/evaluation/NIST-sclite_sctk-2.4.0-20091110-0958.tar.bz2 for detail.
  • After installing sclite, put it in your PATH.

3. Run a quick test

You can use the script quick_test.py for a quick test.

python3 quick_test.py --data-root your_path_to/phoenix-2014-multisigner

By specifying the model type --model sfl/dfl, the data split --split dev/test, whether to use a language model--use-lm, you can get the following results:

Model WER (dev) sub/del/ins (dev) WER (test) sub/del/ins (test)
DFL 27.1 12.7/7.4/7.0 27.7 13.8/7.3/6.6
SFL 26.2 12.7/6.9/6.7 26.6 13.7/6.5/6.4
DFL + LM 25.6 11.5/9.2/4.9 26.4 12.4/9.3/4.7
SFL + LM 24.3 11.4/8.5/4.4 25.3 12.4/8.5/4.3

Note that these results are slightly different from the paper as a different random seed is used.

You may also take a look at quick_test.py as it shows how to use the pretrained models.

4. Train your own model

The configuration files for deterministic and stochastic fine-grained labeling are put under config/. The training script is based on a PyTorch experiment runner torchzq, which automatically reads the hyperparameters in the YAML file and passes them to stochastic_cslr/runner.py.

Before running, change the data_root in the YAML configurations to phoenix-2014-multisigner/ first.

Train (for instance, dfl):

tzq config/dfl-fp16.yml train

Test the trained model

tzq config/dfl-fp16.yml test

Citation

You may cite this work by:

@inproceedings{niu2020stochastic,
  title={Stochastic Fine-Grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition},
  author={Niu, Zhe and Mak, Brian},
  booktitle={European Conference on Computer Vision},
  pages={172--186},
  year={2020},
  organization={Springer}
}
Owner
Zhe Niu
PhD Candidate @ CSE, HKUST
Zhe Niu
Deep Inside Convolutional Networks - This is a caffe implementation to visualize the learnt model

Deep Inside Convolutional Networks This is a caffe implementation to visualize the learnt model. Part of a class project at Georgia Tech Problem State

Jigar 61 Apr 15, 2022
A Simulation Environment to train Robots in Large Realistic Interactive Scenes

iGibson: A Simulation Environment to train Robots in Large Realistic Interactive Scenes iGibson is a simulation environment providing fast visual rend

Stanford Vision and Learning Lab 493 Jan 04, 2023
Neural Logic Inductive Learning

Neural Logic Inductive Learning This is the implementation of the Neural Logic Inductive Learning model (NLIL) proposed in the ICLR 2020 paper: Learn

36 Nov 28, 2022
Implementation of FitVid video prediction model in JAX/Flax.

FitVid Video Prediction Model Implementation of FitVid video prediction model in JAX/Flax. If you find this code useful, please cite it in your paper:

Google Research 62 Nov 25, 2022
The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction".

LEAR The implementation our EMNLP 2021 paper "Enhanced Language Representation with Label Knowledge for Span Extraction". **The code is in the "master

杨攀 93 Jan 07, 2023
Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection PyTorch code release of the paper "Attentive Prototypes for Sour

Deepti Hegde 23 Oct 17, 2022
Deep Reinforced Attention Regression for Partial Sketch Based Image Retrieval.

DARP-SBIR Intro This repository contains the source code implementation for ICDM submission paper Deep Reinforced Attention Regression for Partial Ske

2 Jan 09, 2022
Boundary-preserving Mask R-CNN (ECCV 2020)

BMaskR-CNN This code is developed on Detectron2 Boundary-preserving Mask R-CNN ECCV 2020 Tianheng Cheng, Xinggang Wang, Lichao Huang, Wenyu Liu Video

Hust Visual Learning Team 178 Nov 28, 2022
BackgroundRemover lets you Remove Background from images and video with a simple command line interface

BackgroundRemover BackgroundRemover is a command line tool to remove background from video and image, made by nadermx to power https://BackgroundRemov

Johnathan Nader 1.7k Dec 30, 2022
A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset.

A repository for the updated version of CoinRun used to collect MUGEN, a multimodal video-audio-text dataset. This repo contains scripts to train RL agents to navigate the closed world and collect vi

MUGEN 11 Oct 22, 2022
Exponential Graph is Provably Efficient for Decentralized Deep Training

Exponential Graph is Provably Efficient for Decentralized Deep Training This code repository is for the paper Exponential Graph is Provably Efficient

3 Apr 20, 2022
PyTorch implementation of the end-to-end coreference resolution model with different higher-order inference methods.

End-to-End Coreference Resolution with Different Higher-Order Inference Methods This repository contains the implementation of the paper: Revealing th

Liyan 52 Jan 04, 2023
Hierarchical Metadata-Aware Document Categorization under Weak Supervision (WSDM'21)

Hierarchical Metadata-Aware Document Categorization under Weak Supervision This project provides a weakly supervised framework for hierarchical metada

Yu Zhang 53 Sep 17, 2022
This project is based on our SIGGRAPH 2021 paper, ROSEFusion: Random Optimization for Online DenSE Reconstruction under Fast Camera Motion .

ROSEFusion 🌹 This project is based on our SIGGRAPH 2021 paper, ROSEFusion: Random Optimization for Online DenSE Reconstruction under Fast Camera Moti

219 Dec 27, 2022
Use unsupervised and supervised learning to predict stocks

AIAlpha: Multilayer neural network architecture for stock return prediction This project is meant to be an advanced implementation of stacked neural n

Vivek Palaniappan 1.5k Dec 26, 2022
Create time-series datacubes for supervised machine learning with ICEYE SAR images.

ICEcube is a Python library intended to help organize SAR images and annotations for supervised machine learning applications. The library generates m

ICEYE Ltd 65 Jan 03, 2023
Towards uncontrained hand-object reconstruction from RGB videos

Towards uncontrained hand-object reconstruction from RGB videos Yana Hasson, Gül Varol, Ivan Laptev and Cordelia Schmid Project page Paper Table of Co

Yana 69 Dec 27, 2022
CPPE - 5 (Medical Personal Protective Equipment) is a new challenging object detection dataset

CPPE - 5 CPPE - 5 (Medical Personal Protective Equipment) is a new challenging dataset with the goal to allow the study of subordinate categorization

Rishit Dagli 53 Dec 17, 2022
The FIRST GANs-based omics-to-omics translation framework

OmiTrans Please also have a look at our multi-omics multi-task DL freamwork 👀 : OmiEmbed The FIRST GANs-based omics-to-omics translation framework Xi

Xiaoyu Zhang 6 Dec 14, 2022