LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Last update: Dec 03, 2022

LightSpeech

Unofficial PyTorch implementation of LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search. This repo uses the ESPnet FastSpeech 2 implementation as a base. It implements only the final LightSpeech model, not the Neural Architecture Search described in the paper.

So far I have been able to compress the model only about 3x (from 27 M to 7.99 M trainable parameters), not 15x.

Requirements :

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running the following command: nvcc --version

pip install torch torchvision

This repo uses PyTorch 1.6.0 because it relies on torch.bucketize, which is not available in earlier versions of PyTorch.
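Because pip install torch picks whatever version is current, it can help to confirm that the installed PyTorch is at least 1.6.0 before training. The snippet below is a minimal sketch of such a check; the helper names parse_version and supports_bucketize are hypothetical and not part of this repo. It compares versions numerically, since a plain string comparison would wrongly rank "1.10.0" below "1.6.0".

```python
import re


def parse_version(version):
    """Return the leading (major, minor, patch) numbers of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def supports_bucketize(version):
    """True if the version is at least 1.6.0, the release this repo targets."""
    return parse_version(version) >= (1, 6, 0)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if supports_bucketize(torch.__version__) else "too old"
        print("PyTorch", torch.__version__, status)
```

Local build suffixes such as "+cu101" are ignored by the parser, so CUDA-specific wheels are handled the same way as plain releases.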

Installing other requirements :

pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, together with a supported TensorFlow version (1.14.0).

For Preprocessing :

The filelists folder contains LJSpeech dataset files already processed with MFA (Montreal Forced Aligner), so you don't need to align text with audio (to extract durations) for the LJSpeech dataset. For other datasets, follow the instructions here. For the remaining preprocessing, run the following command:

python .\nvidia_preprocessing.py -d path_of_wavs -c configs/default.yaml

To find the min and max of F0 and energy:

python .\compute_statistics.py
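For readers who want to see what such statistics amount to, here is a minimal sketch of taking the global min and max over per-utterance F0 and energy sequences. It is not the code in compute_statistics.py: the function name min_max, the skip_zeros option (which assumes unvoiced frames are stored as F0 = 0), and the sample numbers are all made up for illustration.

```python
import math


def min_max(per_utterance_values, skip_zeros=False):
    """Return the global (min, max) over a list of per-utterance sequences."""
    lo, hi = math.inf, -math.inf
    for values in per_utterance_values:
        for v in values:
            # Assumption: a value of 0 marks an unvoiced frame in F0 tracks.
            if skip_zeros and v == 0:
                continue
            lo = min(lo, v)
            hi = max(hi, v)
    if lo > hi:
        raise ValueError("no usable values found")
    return lo, hi


# Made-up example data: two utterances with three frames each.
f0 = [[0.0, 110.5, 220.0], [0.0, 95.2, 180.3]]
energy = [[0.02, 0.8, 1.5], [0.1, 2.4, 0.6]]

p_min, p_max = min_max(f0, skip_zeros=True)
e_min, e_max = min_max(energy)
print("p_min =", p_min, "p_max =", p_max)
print("e_min =", e_min, "e_max =", e_max)
```

Whether zeros should be skipped depends on how the preprocessing stores unvoiced frames, so check the extracted features before relying on that option.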

Update the following values in hparams.py with the computed min and max of F0 and energy:

p_min = Min F0/pitch
p_max = Max F0
e_min = Min energy
e_max = Max energy
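In FastSpeech 2 style models, min and max values like these typically define the range over which pitch and energy are quantized into bins before an embedding lookup. The following is a pure-Python sketch of that idea, under the assumption that this repo does something similar with torch.bucketize; make_boundaries, quantize, and the bin count are hypothetical names and values. bisect_left is used because it matches the default (right=False) behavior of torch.bucketize.

```python
from bisect import bisect_left


def make_boundaries(v_min, v_max, n_bins):
    """Return n_bins - 1 evenly spaced boundaries that split [v_min, v_max]."""
    step = (v_max - v_min) / n_bins
    return [v_min + step * i for i in range(1, n_bins)]


def quantize(values, boundaries):
    """Map each value to a bin index, like torch.bucketize with right=False."""
    return [bisect_left(boundaries, v) for v in values]


# Made-up range and bin count, chosen so the boundaries are easy to read.
boundaries = make_boundaries(0.0, 8.0, n_bins=4)
bins = quantize([-1.0, 0.0, 2.0, 2.5, 6.0, 7.0, 100.0], boundaries)
print(boundaries)
print(bins)
```

Values below the minimum fall into the first bin and values above the maximum into the last, which is why the min and max in hparams.py should reflect the actual dataset statistics.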

For training

python train_lightspeech.py --outdir etc -c configs/default.yaml -n "name"

For inference

WIP

python .\inference.py -c .\configs\default.yaml -p .\checkpoints\first_1\xyz.pyt --out output --text "ModuleList can be indexed like a regular Python list but modules it contains are properly registered."

For TorchScript Export

python export_torchscript.py -c configs/default.yaml -n fastspeech_scrip --outdir etc

Checkpoint and samples:

WIP

Owner

Rishikesh (ऋषिकेश)
