Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech"

Related tags

Deep LearningGradTTS
Overview

GradTTS

Unofficial Pytorch implementation of "Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech" (arxiv)

About this repo

This is an unofficial implementation of GradTTS. We created this project based on GlowTTS (https://github.com/jaywalnut310/glow-tts). We replace the GlowDecoder with DiffusionDecoder which follows the settings of the original paper. In addition, we also replace torch.distributed with horovod for convenience and we don't use fp16 now.

Training and inference

Please go to egs/ folder, and see run.sh and inference_waveglow_vocoder.py for example use. Before training, please download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY. And build Monotonic Alignment Search Code (Cython): cd monotonic_align; python setup.py build_ext --inplace. Before inference, you should download waveglow checkpoint from download_link and put it into the waveglow folder.

Reference Materials

Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech

GlowTTS

Score-Based Generative Modeling through Stochastic Differential Equations

score_sde_pytorch

denoising-diffusion-pytorch

Authors

Heyang Xue(https://github.com/WelkinYang) and Qicong Xie(https://github.com/QicongXie)

Owner
HeyangXue1997
Speech synthesis/Singing voice synthesis/machine learning @ aslp, nwpu, Xi'an, ShannXi, China
HeyangXue1997
LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations

LIMEcraft LIMEcraft: Handcrafted superpixel selectionand inspection for Visual eXplanations The LIMEcraft algorithm is an explanatory method based on

MI^2 DataLab 4 Aug 01, 2022
Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design"

Unofficial implementation of "Coordinate Attention for Efficient Mobile Network Design". CoordAttention tensorflow slim

Billy 9 Aug 22, 2022
AWS provides a Python SDK, "Boto3" ,which can be used to access the AWS-account from the local.

Boto3 - The AWS SDK for Python Boto3 is the Amazon Web Services (AWS) Software Development Kit (SDK) for Python, which allows Python developers to wri

Shreyas Srivastava 1 Oct 25, 2021
🧠 A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation.', ECCV 2016

Deep CORAL A PyTorch implementation of 'Deep CORAL: Correlation Alignment for Deep Domain Adaptation. B Sun, K Saenko, ECCV 2016' Deep CORAL can learn

Andy Hsu 200 Dec 25, 2022
Detecting drunk people through thermal images using Deep Learning (CNN)

Drunk Detection CNN Detecting drunk people through thermal images using Deep Learning (CNN) Dataset We used thermal images provided by Electronics Lab

Giacomo Ferretti 3 Oct 27, 2022
Code for "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds", CVPR 2021

PV-RAFT This repository contains the PyTorch implementation for paper "PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clou

Yi Wei 43 Dec 05, 2022
Convert BART models to ONNX with quantization. 3X reduction in size, and upto 3X boost in inference speed

fast-Bart Reduction of BART model size by 3X, and boost in inference speed up to 3X BART implementation of the fastT5 library (https://github.com/Ki6a

Siddharth Sharma 19 Dec 09, 2022
A Simplied Framework of GAN Inversion

Framework of GAN Inversion Introcuction You can implement your own inversion idea using our repo. We offer a full range of tuning settings (in hparams

Kangneng Zhou 13 Sep 27, 2022
Exploration of some patients clinical variables.

Answer_ALS_clinical_data Exploration of some patients clinical variables. All the clinical / metadata data is available here: https://data.answerals.o

1 Jan 20, 2022
In-place Parallel Super Scalar Samplesort (IPS⁓o)

In-place Parallel Super Scalar Samplesort (IPS⁓o) This is the implementation of the algorithm IPS⁓o presented in the paper Engineering In-place (Share

82 Dec 22, 2022
Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Learning the Beauty in Songs: Neural Singing Voice Beautifier Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao Zhejiang University ACL 2022 Mai

Jinglin Liu 257 Dec 30, 2022
nanodet_plus,yolov5_v6.0

OAK_Detection OAKč®¾å¤‡äøŠé€‚é…nanodet_plus,yolov5_v6.0 Environment pytorch = 1.7.0

ē‚¼äø¹åŽ»äŗ† 1 Feb 18, 2022
A minimalist environment for decision-making in autonomous driving

highway-env A collection of environments for autonomous driving and tactical decision-making tasks An episode of one of the environments available in

Edouard Leurent 1.6k Jan 07, 2023
Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning

Human-Level Control through Deep Reinforcement Learning Tensorflow implementation of Human-Level Control through Deep Reinforcement Learning. This imp

Devsisters Corp. 2.4k Dec 26, 2022
J.A.R.V.I.S is an AI virtual assistant made in python.

J.A.R.V.I.S is an AI virtual assistant made in python. Running JARVIS Without Python To run JARVIS without python: 1. Head over to our installation pa

somePythonProgrammer 16 Dec 29, 2022
Re-implememtation of MAE (Masked Autoencoders Are Scalable Vision Learners) using PyTorch.

mae-repo PyTorch re-implememtation of "masked autoencoders are scalable vision learners". In this repo, it heavily borrows codes from codebase https:/

Peng Qiao 1 Dec 14, 2021
Woosung Choi 63 Nov 14, 2022
This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

Non-autoregressive Deep Learning-Based TTS Template This is a template for the Non-autoregressive TTS model. It contains Data Preprocessing Pipeline D

Keon Lee 13 Dec 05, 2022
Double pendulum simulator using a symplectic Euler's method and Hamiltonian mechanics

Symplectic Double Pendulum Simulator Double pendulum simulator using a symplectic Euler's method. The program calculates the momentum and position of

Scott Marino 1 Jan 12, 2022
STEAL - Learning Semantic Boundaries from Noisy Annotations (CVPR 2019)

STEAL This is the official inference code for: Devil Is in the Edges: Learning Semantic Boundaries from Noisy Annotations David Acuna, Amlan Kar, Sanj

469 Dec 26, 2022