A PyTorch Implementation of ClariNet

Last update: Sep 15, 2022

Overview

ClariNet

A PyTorch implementation of ClariNet (mel spectrogram --> waveform)

Requirements

PyTorch 0.4.1, Python 3.6, and Librosa

Examples

Step 1. Download Dataset

LJSpeech : https://keithito.com/LJ-Speech-Dataset/

Step 2. Preprocessing (Preparing Mel Spectrogram)

python preprocessing.py --in_dir ljspeech --out_dir DATASETS/ljspeech

Step 3. Train Gaussian Autoregressive WaveNet (Teacher)

python train.py --model_name wavenet_gaussian --batch_size 8 --num_blocks 2 --num_layers 10
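
The --num_blocks and --num_layers values determine how much past audio the teacher can see. The sketch below estimates that receptive field, assuming the usual WaveNet layout in which the dilation doubles in every layer of a block and resets at the start of each block. The function name and the kernel_size values are assumptions for illustration; check the model definition for the kernel size actually used.

```python
def receptive_field(num_blocks: int, num_layers: int, kernel_size: int = 2) -> int:
    """Receptive field (in samples) of a stack of dilated causal convolutions.

    Assumes dilations of 1, 2, 4, ... within each block, restarting per block.
    kernel_size is an assumption, not a value taken from this repository.
    """
    dilations = [2 ** layer for _ in range(num_blocks) for layer in range(num_layers)]
    # Each layer widens the receptive field by (kernel_size - 1) * dilation samples.
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    # Values matching the command above: 2 blocks of 10 layers each.
    for k in (2, 3):
        print(f"kernel_size={k}: {receptive_field(2, 10, k)} samples")
```

Under these assumptions, adding a block grows the receptive field linearly, while adding a layer to every block roughly doubles it.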

Step 4. Synthesize (Teacher)

--load_step CHECKPOINT : the global training step of the pre-trained teacher model (also shown in the trained weight file)

python synthesize.py --model_name wavenet_gaussian --num_blocks 2 --num_layers 10 --load_step 10000 --num_samples 5

Step 5. Train Gaussian Inverse Autoregressive Flow (Student)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model (also shown in the trained weight file)

--KL_type qp : Reverse KL divergence KL(q||p), or --KL_type pq : Forward KL divergence KL(p||q)

python train_student.py --model_name wavenet_gaussian_student --teacher_name wavenet_gaussian --teacher_load_step 10000 --batch_size 2 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --KL_type qp
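
To make the two --KL_type options concrete, here is a minimal sketch of the closed-form KL divergence between two univariate Gaussians, with q standing for the student's output distribution and p for the teacher's. The function name and the numeric values are hypothetical, and this is not the loss code from train_student.py, which may add regularization or other terms.

```python
import math


def gaussian_kl(mu_a: float, sigma_a: float, mu_b: float, sigma_b: float) -> float:
    """Closed-form KL(N(mu_a, sigma_a^2) || N(mu_b, sigma_b^2)) for scalars."""
    return (
        math.log(sigma_b / sigma_a)
        + (sigma_a ** 2 + (mu_a - mu_b) ** 2) / (2.0 * sigma_b ** 2)
        - 0.5
    )


# Hypothetical per-sample distributions: q = student, p = teacher.
mu_q, sigma_q = 0.0, 0.5
mu_p, sigma_p = 0.2, 1.0

reverse_kl = gaussian_kl(mu_q, sigma_q, mu_p, sigma_p)  # KL(q || p), as in --KL_type qp
forward_kl = gaussian_kl(mu_p, sigma_p, mu_q, sigma_q)  # KL(p || q), as in --KL_type pq

if __name__ == "__main__":
    print(f"KL(q||p) = {reverse_kl:.4f}")
    print(f"KL(p||q) = {forward_kl:.4f}")
```

The two directions give different values for the same pair of distributions, which is why the choice of --KL_type changes what the student is penalized for.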

Step 6. Synthesize (Student)

--model_name (YOUR STUDENT MODEL'S NAME)

--load_step CHECKPOINT : the global training step of the pre-trained student model (also shown in the trained weight file)

--teacher_name (YOUR TEACHER MODEL'S NAME)

--teacher_load_step CHECKPOINT : the global training step of the pre-trained teacher model (also shown in the trained weight file)

python synthesize_student.py --model_name wavenet_gaussian_student --load_step 10000 --teacher_name wavenet_gaussian --teacher_load_step 10000 --num_blocks_t 2 --num_layers_t 10 --num_layers_s 10 --num_samples 5
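
The student generates audio by pushing Gaussian noise through a stack of affine transforms of the form x = z * sigma + mu. The sketch below shows, for a single time step and made-up (mu, sigma) pairs, that applying the transforms one after another equals a single affine transform with composed parameters, as described in the ClariNet paper. The function names and values are hypothetical; in the real model each mu and sigma is predicted by a network from earlier noise samples and the mel spectrogram.

```python
def apply_flows(z: float, flows) -> float:
    """Apply affine transforms x = z * sigma + mu one flow at a time."""
    x = z
    for mu, sigma in flows:
        x = x * sigma + mu
    return x


def compose_flows(flows):
    """Collapse a stack of affine transforms into a single (mu_tot, sigma_tot)."""
    mu_tot, sigma_tot = 0.0, 1.0
    for mu, sigma in flows:
        # Later flows rescale and shift everything accumulated so far.
        mu_tot = mu_tot * sigma + mu
        sigma_tot = sigma_tot * sigma
    return mu_tot, sigma_tot


if __name__ == "__main__":
    flows = [(0.1, 0.5), (-0.2, 2.0), (0.3, 0.8)]  # hypothetical (mu, sigma) pairs
    z = 1.5  # stands in for one noise sample drawn from N(0, 1)
    mu_tot, sigma_tot = compose_flows(flows)
    print(apply_flows(z, flows), z * sigma_tot + mu_tot)
```

Because the composed output is again Gaussian given the earlier noise, its KL divergence against the teacher's Gaussian output can be evaluated in closed form.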

References

WaveNet vocoder : https://github.com/r9y9/wavenet_vocoder
ClariNet : https://arxiv.org/abs/1807.07281

Owner

Sungwon Kim
