A Flask application to predict the speech emotion of any .wav file.

Last update: Dec 15, 2021

Overview

This is a speech emotion recognition app. It lets you train a modular MLP model on the RAVDESS dataset, then use that model in a Flask application to predict the speech emotion of any .wav file.

REQS:

To download the RAVDESS speech emotion recognition data, go to: https://drive.google.com/file/d/1wWsrN2Ep7x6lWqOXfr4rpKGYrJhWc8z7/view

To install all dependencies, open a terminal and run:

. ./install_deps.sh

This should create your venv and populate it with all necessary dependencies.

MODEL:

A multilayer perceptron (MLP) model that detects the emotion of .wav files. To create or edit the model, see create_model.py. Once you have adjusted create_model.py to your liking (emotions_to_observe and the path to the sound data), run:

python3 create_model.py

This creates the model.model binary file and reports the test accuracy of your model.
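
For orientation, here is a minimal sketch of how labels could be read from RAVDESS file names and filtered by emotions_to_observe. It is not the code in create_model.py. The emotion codes follow the published RAVDESS naming convention (the third dash-separated field), while the helper names and the example value of emotions_to_observe are assumptions made for illustration.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}

# Hypothetical value; the real variable is set inside create_model.py.
emotions_to_observe = ["calm", "happy", "fearful", "disgust"]


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name, or None."""
    parts = Path(path).stem.split("-")
    if len(parts) != 7:  # RAVDESS names have seven numeric fields
        return None
    return RAVDESS_EMOTIONS.get(parts[2])


def select_files(paths, wanted):
    """Keep only (path, label) pairs whose label is in `wanted`."""
    wanted = set(wanted)
    pairs = []
    for p in paths:
        label = emotion_from_filename(p)
        if label in wanted:
            pairs.append((p, label))
    return pairs
```

Calling select_files(list_of_wav_paths, emotions_to_observe) would then give the subset of recordings a model is trained on, which is why the emotions chosen here determine what the app can predict later.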

APP:

Once the model.model binary is created, you can spin up the Flask application (ToneCheck). To do so, run:

. ./start_flask.sh

By default, the app runs on localhost:5000. The emotions available for prediction correspond to the emotions_to_observe variable you edited in create_model.py (and are therefore available inside the model binary file).
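
Because the app accepts arbitrary .wav files, it can help to check a file before sending it for prediction. The sketch below is not part of ToneCheck. It uses only Python's standard wave module, and the function names describe_wav and make_silent_wav are hypothetical. Note that the wave module reads PCM audio only, so other WAV encodings would be reported as unreadable here.

```python
import io
import wave


def describe_wav(data):
    """Return basic properties of WAV bytes, or None if they cannot be parsed."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            return {
                "channels": wav.getnchannels(),
                "sample_rate": rate,
                "duration_s": frames / rate if rate else 0.0,
            }
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, truncated, or an unsupported encoding.
        return None


def make_silent_wav(seconds=1, rate=16000):
    """Build a mono 16-bit silent WAV in memory (used here only for the demo)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silent_wav(2)))
    print(describe_wav(b"this is not audio"))
```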

Owner

Aryan Vijaywargia
