pyAcoustics

License: MIT

A collection of Python scripts for extracting and analyzing acoustic measures from audio files.

1   Common Use Cases

What can you do with this library?

  • Extract pitch and intensity:

    pyacoustics.intensity_and_pitch.praat_pi.getPraatPitchAndIntensity()
    
  • Extract segments of a wav file:

    pyacoustics.signals.audio_scripts.getSubwav()
    
  • Perform simple manipulations on wav files:

    pyacoustics.signals.resampleAudio()
    
    pyacoustics.signals.splitStereoAudio()
    
  • Split audio files on segments of silence or on pure tones:

    pyacoustics.speech_detection.split_on_tone.splitFileOnTone()
    
  • Programmatically manipulate pitch or duration of a file:

    pyacoustics.morph.morph_utils.praat_pitch()
    
  • Mask speech with speech shaped noise:

    pyacoustics.speech_filters.speech_shaped_noise.batchMaskSpeakerData()
    
  • And more!
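
As a rough illustration of what extracting a segment of a wav file involves, here is a minimal sketch that uses only Python's standard wave module. It is not the code behind getSubwav(), whose signature is not shown in this document; the function name extract_segment and its parameters are hypothetical.

```python
import wave


def extract_segment(src_path, dst_path, start_sec, end_sec):
    """Copy the audio between start_sec and end_sec into a new wav file.

    Hypothetical helper for illustration only; not part of pyacoustics.
    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()

        # Convert times to frame indices and clamp them to the file length.
        start_frame = min(max(int(round(start_sec * rate)), 0), total)
        end_frame = min(max(int(round(end_sec * rate)), start_frame), total)

        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)

    with wave.open(dst_path, "wb") as dst:
        # Reuse the source format; the frame count in the header is
        # corrected automatically once the data has been written.
        dst.setparams(params)
        dst.writeframes(frames)

    return len(frames) // (params.sampwidth * params.nchannels)
```

Calling extract_segment("input.wav", "clip.wav", 1.0, 2.5) would write the span from 1.0 s to 2.5 s of input.wav to clip.wav, assuming both paths are valid on your system.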

2   Major revisions

Ver 1.0 (June 7, 2015)

  • first public release.

3   Features as they are added

Mask speech with speech shaped noise (March 21, 2016)

Find syllable nuclei/estimate speech rate using Uwe Reichel's MATLAB code (July 29, 2015)

Find the valley bottom between peaks (July 7, 2015)

4   Requirements

Many of the individual features require different packages. If you aren't using a given feature, you don't need to install its dependencies.

pyacoustics.intensity_and_pitch.praat_pi requires Praat

pyacoustics.intensity_and_pitch.get_f0 requires the ESPS getF0 function as implemented by Snack, although I recall having difficulty installing it.

pyacoustics.speech_rate/dictionary_estimate.py requires my library pysle

pyacoustics.signals.data_fitting.py requires SciPy, NumPy, and scikit-learn

My praatIO library is used extensively and can be downloaded here
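
Because the dependencies are optional, it can help to check which ones are present before using a given feature. The following is a minimal sketch using only the standard library. The import names (scipy, numpy, sklearn, praatio, pysle) are assumptions about how the packages named above are imported, and the dictionary keys are just labels for this example.

```python
import importlib.util

# Labels mapped to the import names they are assumed to need.
# Praat and Snack are external programs/toolkits and are not checked here.
OPTIONAL_DEPENDENCIES = {
    "signals.data_fitting": ["scipy", "numpy", "sklearn"],
    "speech_rate dictionary estimate": ["pysle"],
    "general (praatIO)": ["praatio"],
}


def missing_modules(module_names):
    """Return the names in module_names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


def dependency_report():
    """Map each feature label to the list of its missing modules."""
    return {
        feature: missing_modules(names)
        for feature, names in OPTIONAL_DEPENDENCIES.items()
    }


if __name__ == "__main__":
    for feature, missing in dependency_report().items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{feature}: {status}")
```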

5   Installation

If you are on Windows, you can use the Windows installer (check that it is up to date, though).

PyAcoustics is on PyPI and can be installed or upgraded from the command-line shell with pip like so:

python -m pip install pyacoustics --upgrade

Otherwise, to install manually, download the source from GitHub, open a command-line shell, navigate to the directory containing setup.py, and type:

python setup.py install

If python is not in your path, you'll need to enter the full path, e.g.:

C:\Python36\python.exe setup.py install

6   Example usage

See the examples folder for a few real-world examples using this library.

  • examples/split_audio_on_silence.py

    Detects the presence of speech in a recording based on acoustic intensity. Everything louder than some threshold specified by the user is considered speech.

  • examples/split_audio_on_tone.py

    Detects the presence of pure tones in a recording. One can use this to automatically segment stimuli: play beeps while the speech is being recorded, and this tool can later segment the speech based on the presence of those tones.

    Also detects speech using a pitch analysis. Most syllables contain some voicing, so a stream of modulating pitch values suggests that someone is speaking. This aspect is not extensively tested but it works well for the example files.

  • examples/estimate_speech_rate.py

    Calculates the speech rate through a MATLAB script written by Uwe Reichel that estimates the location of syllable boundaries.
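
The intensity-based detection described for split_audio_on_silence.py can be sketched as follows. This is a minimal illustration of the general idea (everything louder than a user-chosen threshold counts as speech), not the library's own code. The function names, the use of non-overlapping frames, and RMS as the intensity measure are all assumptions made for the example.

```python
import math


def frame_rms(samples, frame_size):
    """Return the RMS amplitude of each consecutive, non-overlapping frame.

    Trailing samples that do not fill a whole frame are ignored.
    """
    rms_values = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms_values.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return rms_values


def find_speech_regions(samples, sample_rate, frame_size, threshold):
    """Return (start_sec, end_sec) spans whose frame RMS exceeds threshold."""
    frame_dur = frame_size / sample_rate
    rms_values = frame_rms(samples, frame_size)

    regions = []
    region_start = None  # index of the first frame in the current region
    for i, rms in enumerate(rms_values):
        is_speech = rms > threshold
        if is_speech and region_start is None:
            region_start = i
        elif not is_speech and region_start is not None:
            regions.append((region_start * frame_dur, i * frame_dur))
            region_start = None

    # Close a region that runs to the end of the recording.
    if region_start is not None:
        regions.append((region_start * frame_dur, len(rms_values) * frame_dur))
    return regions
```

In practice the threshold depends on the recording level and background noise, so it usually has to be tuned per recording or per session.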

7   Citing PyAcoustics

PyAcoustics is general-purpose code and doesn't need to be cited, but if you would like to cite it, you can do so like this:

Tim Mahrt. PyAcoustics. https://github.com/timmahrt/pyAcoustics, 2016.

8   Acknowledgements

PyAcoustics is an ongoing collection of code with contributions from a number of projects worked on over several years. Development of various aspects of PyAcoustics was possible thanks to NSF grant IIS 07-03624 to Jennifer Cole and Mark Hasegawa-Johnson, NSF grant BCS 12-51343 to Jennifer Cole, José Hualde, and Caroline Smith, and NSF grant IBSS SMA 14-16791 to Jennifer Cole, Nancy McElwain, and Daniel Berry.
