Fast and Simple Neural Vocoder, the Multiband RNNMS

Last update: Jan 11, 2022

Overview

Multiband RNN_MS

Fast and Simple vocoder, Multiband RNN_MS.

Demo
Quick Training
How to Use
System Details
Results
References

Demo

ToDo: Link a high-quality audio demo.

Quick Training

Jump to ☞ , then Run. That's all!

How to Use

1. Install

# pip install "torch==1.10.0" -q      # Based on your environment (validated with v1.10)
# pip install "torchaudio==0.10.0" -q # Based on your environment
pip install git+https://github.com/tarepan/MultibandRNNMS

2. Data & Preprocessing

"Batteries Included".
RNNMS transparently downloads the corpus and preprocesses it for you 😉

3. Train

python -m mbrnnms.main_train

For arguments, check ./mbrnnms/config.py

Advanced: Other datasets

You can switch the dataset with arguments.
All of speechcorpusy's preset corpora are supported.

# LJSpeech corpus
python -m mbrnnms.main_train data.data_name=LJ
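
The `data.data_name=LJ` argument above is a dotted `key=value` override of a nested config entry. As a rough illustration only, the stdlib-only sketch below shows how such an override can be merged into a nested dictionary. The function name `apply_override` and the default value are hypothetical, and the actual argument handling in ./mbrnnms/config.py may work differently.

```python
from copy import deepcopy


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one `a.b.c=value` override applied.

    Hypothetical helper for illustration; not part of mbrnnms.
    """
    dotted_key, value = override.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    result = deepcopy(config)  # leave the caller's defaults untouched
    node = result
    for name in parents:
        # Walk down the nesting, creating intermediate sections if missing.
        node = node.setdefault(name, {})
    node[leaf] = value
    return result


# Assumed default value, chosen only for this example.
defaults = {"data": {"data_name": "hypothetical_default"}}
overridden = apply_override(defaults, "data.data_name=LJ")
print(overridden["data"]["data_name"])
```

The defaults are copied before modification, so the same base config can be reused for several runs with different overrides.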

Advanced: Custom dataset

Copy mbrnnms.main_train and replace the DataModule with your own.

    # datamodule = LJSpeechDataModule(batch_size, ...)
    datamodule = YourSuperCoolDataModule(batch_size, ...)
    # That's all!

System Details

Model

PreNet: GRU
Upsampler: time-directional nearest interpolation
Decoder: Embedding-auto-regressive generative RNN with 10-bit μ-law encoding
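
To make two of the components above concrete, here is a minimal stdlib-only sketch of 10-bit μ-law quantization (the decoder's output classes) and nearest-neighbor upsampling along the time axis. It uses the textbook μ-law formula with μ = 2^10 - 1 = 1023. The function names and the exact rounding are assumptions for illustration, and the repository's own implementation may differ.

```python
import math


def mulaw_encode(x: float, bits: int = 10) -> int:
    """Map a sample in [-1, 1] to an integer class in [0, 2**bits - 1]."""
    mu = 2 ** bits - 1
    x = max(-1.0, min(1.0, x))  # clip to the valid range
    # Logarithmic compression: small amplitudes get finer resolution.
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift [-1, 1] to [0, mu] and round to the nearest class index.
    return int(math.floor((compressed + 1) / 2 * mu + 0.5))


def mulaw_decode(index: int, bits: int = 10) -> float:
    """Invert mulaw_encode up to quantization error."""
    mu = 2 ** bits - 1
    compressed = 2 * index / mu - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)


def upsample_nearest(frames: list, factor: int) -> list:
    """Repeat each conditioning frame `factor` times along the time axis."""
    return [frame for frame in frames for _ in range(factor)]


if __name__ == "__main__":
    for sample in (-1.0, -0.25, 0.0, 0.5, 1.0):
        index = mulaw_encode(sample)
        print(sample, index, round(mulaw_decode(index), 4))
    print(upsample_nearest([[0.1, 0.2], [0.3, 0.4]], 2))
```

With 10 bits the decoder predicts one of 1024 classes per sample, and nearest-neighbor upsampling simply holds each conditioning frame constant for `factor` consecutive output steps.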

Results

Output Sample

Demo

Performance

X [iter/sec] @ NVIDIA T4 on Google Colaboratory (AMP+, num_workers=8)

Full training takes about Y days.

References

Acknowledgements

: Basic vocoder concept came from this paper.
bshall/UniversalVocoding: The model and hyperparameters are derived from this repository. All code has been rewritten.

Owner

tarepan
