Unofficial PyTorch Implementation of Multi-Singer

Last update: Dec 28, 2022

Overview

Multi-Singer

Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus.

Requirements

See requirements in requirement.txt:

linux
python 3.6
pytorch 1.0+
librosa
json, tqdm, logging
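
One quick way to confirm that an environment matches the list above is to probe for each package before running any script. The sketch below is an illustration only and is not part of the repository: the helper name missing_packages is hypothetical, and it assumes PyTorch is importable as torch. json and logging ship with Python's standard library, so they should never show up as missing.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The requirements list python 3.6; warn on anything older.
    if sys.version_info < (3, 6):
        print("Python 3.6 or newer is expected, found", sys.version.split()[0])
    # PyTorch is imported as `torch`; json and logging are standard-library modules.
    missing = missing_packages(["torch", "librosa", "tqdm", "json", "logging"])
    print("missing packages:", ", ".join(missing) if missing else "none")
```

Only top-level module names are checked; version constraints such as "pytorch 1.0+" still need to be verified by hand.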

TODO

1026: upload code
1024: implement multi-singer & perceptual loss
1023: implement singer encoder

Getting started

Apply recipe to your own dataset

Put your wav files in the data directory
Edit the configuration in config/config.yaml

1. Pretrain

Pretrain the Singer Embedding Extractor using the repository here, then set 'enc_model_fpath' in config/config.yaml to the path of the pretrained model.

Note: Set the parameters to match those in 'encoder/params_data' and 'encoder/params_model'.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file
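
To make the preprocessing step more concrete, here is a minimal sketch of what mel-spectrogram extraction over a folder of wav files can look like with librosa. It is not the repository's preprocess.py. The function names, the .npy output format, and the parameter values (sample rate, FFT size, hop length, number of mel bands) are all assumptions chosen for illustration; the values actually used come from config/config.yaml.

```python
from pathlib import Path


def plan_feature_paths(wav_dir, feature_dir):
    """Map every .wav file under wav_dir to a .npy path inside feature_dir."""
    wav_dir, feature_dir = Path(wav_dir), Path(feature_dir)
    return {wav: feature_dir / (wav.stem + ".npy") for wav in sorted(wav_dir.rglob("*.wav"))}


def extract_mel(wav_path, sr=24000, n_fft=1024, hop_length=256, n_mels=80):
    """Compute a log-mel spectrogram of shape (n_mels, frames).

    The default values are assumed for this sketch, not taken from the repository.
    """
    import librosa  # imported lazily so path planning works without librosa
    import numpy as np

    audio, _ = librosa.load(str(wav_path), sr=sr)  # load and resample to the target rate
    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return np.log(np.clip(mel, 1e-10, None))  # clip to avoid log(0)


def preprocess(wav_dir, feature_dir):
    """Extract features for every wav file and save them as .npy arrays."""
    import numpy as np

    Path(feature_dir).mkdir(parents=True, exist_ok=True)
    for wav, out in plan_feature_paths(wav_dir, feature_dir).items():
        np.save(out, extract_mel(wav))
```

With these assumptions, preprocess("data/wavs", "data/feature") would mirror the -i and -o arguments of the command above.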

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

-c, --config config file

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated audio

-c checkpoint file

-g config file
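
Note that the shell expands checkpoints/*.pkl before inference.py starts, so if the directory holds several checkpoints, more than one path follows the -c flag. Whether inference.py accepts that is not stated here, so it can be safer to pass a single file. The sketch below picks the most recently modified checkpoint; the helper name latest_checkpoint is hypothetical and not part of the repository.

```python
import glob
import os


def latest_checkpoint(checkpoint_dir, pattern="*.pkl"):
    """Return the most recently modified file matching pattern, or None if there is none."""
    candidates = glob.glob(os.path.join(checkpoint_dir, pattern))
    return max(candidates, key=os.path.getmtime) if candidates else None


if __name__ == "__main__":
    # Prints a single path that can be passed to `inference.py -c`.
    print(latest_checkpoint("checkpoints"))
```

Modification time is only a proxy for "latest"; if checkpoint names encode the training step, sorting by that number may be more reliable.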

5. Singing Voice Synthesis

For Singing Voice Synthesis:

Use a modified FastSpeech to synthesize the mel-spectrogram.
Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis.
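
The two stages above amount to a simple function composition, sketched below. Everything in this sketch is a placeholder: acoustic_model stands in for the modified FastSpeech and vocoder for a trained Multi-Singer generator, and neither signature is taken from the repository.

```python
from typing import Callable, List, Sequence

Mel = List[List[float]]   # mel-spectrogram as frames x mel bins
Waveform = List[float]    # audio samples


def sing(
    score: Sequence[str],
    acoustic_model: Callable[[Sequence[str]], Mel],
    vocoder: Callable[[Mel], Waveform],
) -> Waveform:
    """Stage 1 maps the score to a mel-spectrogram; stage 2 maps it to a waveform."""
    mel = acoustic_model(score)  # the modified FastSpeech would go here
    return vocoder(mel)          # Multi-Singer would go here


if __name__ == "__main__":
    # Dummy stand-ins: one 80-bin frame per token, 256 samples per frame.
    dummy_acoustic = lambda tokens: [[0.0] * 80 for _ in tokens]
    dummy_vocoder = lambda mel: [0.0] * (256 * len(mel))
    print(len(sing(["la", "la", "la"], dummy_acoustic, dummy_vocoder)))
```

Keeping the two stages behind plain callables means the acoustic model can be swapped without touching the vocoder, as long as both agree on the mel-spectrogram settings.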

Acknowledgements

Citation

Please cite this repository using the "Cite this repository" option in the About section (top right of the repository's main page).

Question

Feel free to contact me at [email protected]

Owner

SunMail-hub
