A Python wrapper for the high-quality vocoder "World"

Overview

PyWORLD - A Python wrapper of WORLD Vocoder

WORLD Vocoder is a fast and high-quality vocoder which parameterizes speech into three components:

  1. f0: Pitch contour
  2. sp: Harmonic spectral envelope
  3. ap: Aperiodic spectral envelope (relative to the harmonic spectral envelope)

It can also (re)synthesize speech using these features (see examples below).

For more information, please visit Dr. Morise's WORLD repository and the official website of WORLD Vocoder.

APIs

Vocoder Functions

import pyworld as pw

# x: 1-D float64 (double) waveform; fs: sampling rate in Hz
_f0, t = pw.dio(x, fs)             # raw pitch extractor
f0 = pw.stonemask(x, _f0, t, fs)   # pitch refinement
sp = pw.cheaptrick(x, f0, t, fs)   # extract smoothed spectrogram
ap = pw.d4c(x, f0, t, fs)          # extract aperiodicity

y = pw.synthesize(f0, sp, ap, fs)  # synthesize an utterance using the parameters
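
As an illustration of what this decomposition allows, the sketch below changes the pitch of an utterance by scaling f0 before resynthesis. The helper names shift_f0 and pitch_shift are hypothetical (they are not part of pyworld); only dio, stonemask, cheaptrick, d4c and synthesize come from the API above. It assumes x is a mono waveform.

```python
import numpy as np


def shift_f0(f0, semitones):
    """Scale a pitch contour by a number of semitones.

    Unvoiced frames are marked by f0 == 0 and stay at 0 after scaling.
    """
    f0 = np.asarray(f0, dtype=np.float64)
    return f0 * 2.0 ** (semitones / 12.0)


def pitch_shift(x, fs, semitones):
    """Analyze x, shift its pitch, and resynthesize (hypothetical helper)."""
    # Imported here so shift_f0 can be used without pyworld installed.
    import pyworld as pw

    x = np.ascontiguousarray(x, dtype=np.float64)  # pyworld expects doubles
    _f0, t = pw.dio(x, fs)
    f0 = pw.stonemask(x, _f0, t, fs)
    sp = pw.cheaptrick(x, f0, t, fs)
    ap = pw.d4c(x, f0, t, fs)
    return pw.synthesize(shift_f0(f0, semitones), sp, ap, fs)
```

Because sp and ap are left untouched, only the pitch moves; calling pitch_shift(x, fs, 12) should give the same utterance one octave higher.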

Utility

# Convert speech into features (using default arguments)
f0, sp, ap = pw.wav2world(x, fs)

You can also override the function's default arguments; run help(pw.wav2world) for details.
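
For instance, frame_period (in milliseconds) controls how densely the features are sampled in time. The sketch below passes a non-default value to wav2world. The helpers extract_features and expected_num_frames are hypothetical, and the frame-count rule (one frame every frame_period ms, plus one) is an assumption about WORLD's behaviour, so check it against f0.shape on your own data.

```python
def expected_num_frames(num_samples, fs, frame_period=5.0):
    """Assumed WORLD frame count: one frame per frame_period ms, plus one."""
    return int(1000.0 * num_samples / fs / frame_period) + 1


def extract_features(x, fs, frame_period=10.0):
    """Run wav2world with a coarser frame period than the 5 ms default."""
    # Imported here so the helper above works without pyworld installed.
    import pyworld as pw

    f0, sp, ap = pw.wav2world(x, fs, frame_period=frame_period)
    # f0 has one value per frame; sp and ap have one row per frame.
    return f0, sp, ap
```

If you change frame_period here, pass the same value to pw.synthesize so the resynthesized signal keeps its original duration.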

Installation

Using Pip

pip install pyworld

Building from Source

git clone https://github.com/JeremyCCHsu/Python-Wrapper-for-World-Vocoder.git
cd Python-Wrapper-for-World-Vocoder
git submodule update --init
pip install -U pip
pip install -r requirements.txt
pip install .

The git submodule step automatically fetches Morise's World Vocoder (C++ version).
(Installing into a virtualenv or conda environment seems to be the best practice.)

Installation Validation

You can validate the installation by running

cd demo
python demo.py

to see if you get results in the test/ directory. (Please avoid writing and executing code in the Python-Wrapper-for-World-Vocoder folder for now.)

Environment/Dependencies

  • Operating systems
    • Linux Ubuntu 14.04+
    • Windows (thanks to wuaalb)
    • WSL
  • Python
    • 2.7 (Windows is currently not supported)
    • 3.7/3.6/3.5

You can install these dependencies with pip install -r requirements.txt

Notice

  • WORLD vocoder is designed for speech sampled ≥ 16 kHz. Applying WORLD to 8 kHz speech will fail. See a possible workaround here.
  • When the SNR is low, extracting pitch using harvest instead of dio is a better option.
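
The two notes above can be combined into a small guard. This is only a sketch: extract_f0 and its noisy flag are hypothetical names, while dio, stonemask and harvest are pyworld functions. It rejects input sampled below 16 kHz instead of attempting a workaround.

```python
import numpy as np

MIN_FS = 16000  # WORLD is designed for speech sampled at 16 kHz or above


def extract_f0(x, fs, noisy=False):
    """Return (f0, t), using harvest for low-SNR input and dio otherwise."""
    if fs < MIN_FS:
        raise ValueError(
            "fs=%d Hz is below the %d Hz that WORLD expects" % (fs, MIN_FS)
        )
    # Imported after the check so the guard works without pyworld installed.
    import pyworld as pw

    x = np.ascontiguousarray(x, dtype=np.float64)  # pyworld expects doubles
    if noisy:
        return pw.harvest(x, fs)  # slower than dio, but better at low SNR
    _f0, t = pw.dio(x, fs)
    return pw.stonemask(x, _f0, t, fs), t
```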

Troubleshooting

  1. Upgrade your Cython version to 0.24.
    (I failed to build it on Cython 0.20.1post0.)
    This requires downloading Cython from http://cython.org/,
    unzipping it, and running python setup.py install.
    (I tried pip install Cython, but the upgrade didn't seem to work correctly.)
    (Add --user if you don't have root access.)
  2. Upon executing demo/demo.py, the following code might be needed in some environments (e.g. when you're working on a remote Linux server):
import matplotlib
matplotlib.use('Agg')
  3. If you encounter a "library not found: sndfile" error upon executing demo.py,
    you might have to install the library with apt-get install libsndfile1.
    You can also replace pysoundfile with scipy or librosa, but some modification is needed:

    • librosa:
      • load(filename, dtype=np.float64)
      • output.write_wav(filename, wav, fs) (only available in librosa versions before 0.8)
      • Remember to pass the dtype argument to ensure that the method gives you a double.
    • scipy:
      • You'll have to write a customized utility function based on the following methods:
      • scipy.io.wavfile.read (but this gives you short)
      • scipy.io.wavfile.write
  4. If you have installation issues on Windows, I probably cannot provide much help because my development environment is Ubuntu and Windows Subsystem for Linux (read this if you are interested in installing it).
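
For the scipy route mentioned in the list above, a customized reader might look like the sketch below. The names pcm16_to_float64 and read_wav_as_double are hypothetical. It assumes 16-bit PCM input and divides by 32768 so that samples land in [-1, 1).

```python
import numpy as np


def pcm16_to_float64(samples):
    """Convert int16 PCM samples to float64 in [-1.0, 1.0)."""
    samples = np.asarray(samples)
    if samples.dtype != np.int16:
        raise TypeError("expected int16 samples, got %s" % samples.dtype)
    return samples.astype(np.float64) / 32768.0


def read_wav_as_double(filename):
    """Read a 16-bit WAV file and return (x, fs) with x as float64."""
    # Imported here so the conversion helper works without SciPy installed.
    from scipy.io import wavfile

    fs, samples = wavfile.read(filename)  # note the order: rate first
    return pcm16_to_float64(samples), fs
```

The returned x can then be passed to the pyworld functions; a stereo file would still need to be reduced to one channel first.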

Other Installation Suggestions

  1. Using pip install . is safer, and you can easily uninstall pyworld with pip uninstall pyworld.
    • For Mac users: you might need to run MACOSX_DEPLOYMENT_TARGET=10.9 pip install . instead. See issue.
  2. Another way to install pyworld is via
    python setup.py install
    • Add --user if you don't have root access.
    • Add --record install.txt to track the installation directory.
  3. If you just want to try out some experiments, execute
    python setup.py build_ext --inplace
    Then you can use PyWorld from this directory.
    You can also copy the resulting pyworld.so (pyworld.{arch}.pyd on Windows) file to ~/.local/lib/python2.7/site-packages (or the corresponding Windows directory) so that you can use it everywhere like an installed package.
    Alternatively, you can copy/symlink the compiled files using pip, e.g. pip install -e .

Acknowledgement

Thanks to all contributors (tats-u, wuaalb, r9y9, rikrd, kudan2510) for making this repo better, and to sotelo, whose world.py inspired this repo.

Owner
Jeremy Hsu
A PhD student drowning in the ocean of generative models.