praudio provides an audio preprocessing framework for Deep Learning audio applications

praudio provides objects and a script for performing complex preprocessing operations on entire audio datasets with one command.

praudio is designed with Deep Learning audio/music applications in mind.

Operations are carried out on the CPU. Preprocessing can also be run on the fly, for example while training a model.

The library uses librosa as an audio processing backend.

How do I install the library?

You can install praudio either with pip from PyPI or by cloning the praudio repo from GitHub.

For both approaches, it's advisable to use a dedicated Python virtual environment.

Installing from PyPI

Installing from PyPI is the easiest option. In the terminal, type:

$ pip install praudio

Installing from GitHub

First, you should clone the repository from GitHub:

$ git clone git@github.com:musikalkemist/praudio.git

Then move to the project root and install the package by typing:

$ pip install .

You can also use a rule in the available Makefile (see below):

$ make install 

To install the package in development mode use:

$ pip install -e .[testing]

You can also use a rule in Makefile:

$ make install_dev 

This installs all the packages needed to run the tests, the linter, and the type checker. It also installs the package in 'editable' mode, which is ideal for development.

Python version

praudio works with Python 3.6, 3.7, and 3.8.

How do I preprocess an audio dataset?

The core of the library is the preprocess entry point. This script is driven by a config file: you specify the preprocessing you want to apply in a YAML file and then run the script. The whole dataset is preprocessed, and the results are stored recursively in a directory of your choice, which can be created from scratch.
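The recursive storing described above can be pictured as mirroring the dataset's directory tree under the save directory. The sketch below only illustrates that idea with the standard library: mirrored_save_paths is a hypothetical helper, and the .wav input filter and .npy output suffix are assumptions rather than praudio's documented behaviour.

```python
from pathlib import Path
import tempfile


def mirrored_save_paths(dataset_dir, save_dir, in_suffix=".wav", out_suffix=".npy"):
    """Map each audio file under dataset_dir to a path with the same
    relative location under save_dir (hypothetical helper, not praudio's API)."""
    dataset_dir, save_dir = Path(dataset_dir), Path(save_dir)
    mapping = {}
    for audio_path in sorted(dataset_dir.rglob(f"*{in_suffix}")):
        relative = audio_path.relative_to(dataset_dir)
        mapping[audio_path] = (save_dir / relative).with_suffix(out_suffix)
    return mapping


# Build a tiny fake dataset in a temporary directory to exercise the helper.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "dataset"
    features = Path(tmp) / "features"
    (root / "drums").mkdir(parents=True)
    (root / "kick.wav").touch()
    (root / "drums" / "snare.wav").touch()

    paths = mirrored_save_paths(root, features)
    # Express the targets relative to the save directory for readability.
    relative_targets = sorted(
        target.relative_to(features).as_posix() for target in paths.values()
    )
    print(relative_targets)
```

Nested folders in the dataset keep their relative position in the output, so results from files with the same name in different subfolders do not collide.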

To run the entry point, ensure the library is installed and then type:

$ preprocess /path/to/config.yml

In the config.yml, you should provide the following parameters:

  • dataset_dir: Path to the directory where your audio dataset is stored.
  • save_dir: Path to the directory where the preprocessed audio is saved.
  • file_preprocessor: Under this key, provide settings for loader and transforms_chain.
  • loader: Settings for the loader.
  • transforms_chain: Parameters for each transform in the sequence of transforms applied to your data (i.e., TransformChain).

These config parameters are used to dynamically initialise the corresponding objects in the library. To learn which parameters are available at each level of the config file, please refer to the docstrings of the corresponding objects.

Check out test/config.sampleconfig.yml to see an example of a valid config file.
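As a rough illustration of the layout described above, the sketch below writes the config as a Python dict (rather than YAML, so it runs with the standard library alone) and checks that the required keys are present. Only dataset_dir, save_dir, file_preprocessor, loader and transforms_chain come from this README. The nested values (sample_rate, mono, the "stft" entry, frame_length), the list shape of transforms_chain, and the check_config helper are hypothetical placeholders, so consult the docstrings and the sample config for the real parameter names.

```python
# Hypothetical config mirroring the structure described in this README.
# Every nested value below is a placeholder, not a documented praudio parameter.
config = {
    "dataset_dir": "/path/to/dataset",
    "save_dir": "/path/to/save",
    "file_preprocessor": {
        "loader": {"sample_rate": 22050, "mono": True},  # assumed names
        "transforms_chain": [
            {"name": "stft", "frame_length": 2048},  # assumed names
        ],
    },
}


def check_config(cfg):
    """Return the list of missing required keys.

    Illustrative helper only; it is not praudio's own validator.
    """
    missing = [
        key
        for key in ("dataset_dir", "save_dir", "file_preprocessor")
        if key not in cfg
    ]
    file_preprocessor = cfg.get("file_preprocessor", {})
    missing += [
        f"file_preprocessor.{key}"
        for key in ("loader", "transforms_chain")
        if key not in file_preprocessor
    ]
    return missing


print(check_config(config))
```

An empty list means every required key is present; otherwise the list names the keys that still need to be added.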

Package structure

The package is divided into a number of subpackages:

  • config
  • creation
  • io
  • preprocessors
  • transforms

config has facilities to load, save, and validate configuration files, which are used to specify the types of preprocessing pipelines to use.

creation has classes that are responsible for instantiating key objects in the library.

io contains facilities to load / save audio signals from / to files.

preprocessors features objects responsible for preprocessing single audio files, from loading to storing, as well as batches of files.

transforms contains a series of objects which manipulate audio signals, such as the short-time Fourier transform, log, and scaling.
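To make the interplay between config, creation and transforms more concrete, here is a minimal sketch of building a chain of transforms from config entries and applying it to a signal. It is not praudio's implementation: Log, MinMaxScale, TRANSFORMS, build_chain and run_chain are hypothetical names, the signal is a plain list of floats instead of an audio array, and the real library relies on librosa for its transforms.

```python
import math


class Log:
    """Apply log(1 + x) element-wise (stand-in for a log transform)."""

    def process(self, signal):
        return [math.log1p(value) for value in signal]


class MinMaxScale:
    """Linearly rescale values into [min_val, max_val]."""

    def __init__(self, min_val=0.0, max_val=1.0):
        self.min_val = min_val
        self.max_val = max_val

    def process(self, signal):
        low, high = min(signal), max(signal)
        span = (high - low) or 1.0  # avoid dividing by zero on constant input
        return [
            self.min_val + (value - low) / span * (self.max_val - self.min_val)
            for value in signal
        ]


# Registry used to look up a transform class by the name given in the config.
TRANSFORMS = {"log": Log, "minmaxscale": MinMaxScale}


def build_chain(entries):
    """Instantiate one transform per entry; keys other than 'name' become keyword arguments."""
    chain = []
    for entry in entries:
        params = dict(entry)  # copy so the caller's config is not mutated
        name = params.pop("name")
        chain.append(TRANSFORMS[name](**params))
    return chain


def run_chain(chain, signal):
    """Feed the signal through each transform in order."""
    for transform in chain:
        signal = transform.process(signal)
    return signal


chain = build_chain(
    [{"name": "log"}, {"name": "minmaxscale", "min_val": -1.0, "max_val": 1.0}]
)
result = run_chain(chain, [0.0, 1.0, 3.0])
print(result)
```

Looking classes up by name in a registry is one simple way to initialise objects dynamically from a config file; adding a new transform then only requires registering it.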

What's the Makefile for?

The Makefile has a series of rules that can be used to ensure code quality and automate repetitive tasks.

Linter

The project uses pylint. The linter helps enforce a coding standard, sniffs for code smells, and offers simple refactoring suggestions.

To run the linter, type:

$ make lint

Typehint

The project uses mypy. mypy is an optional static type checker for Python. You can add type hints (PEP 484) to your Python programs, and use mypy to type check them statically.

To run the type checker, type:

$ make typehint

Testing

The project uses pytest for unit tests. Tests can be run in one go using coverage, a package that reports the percentage of code covered by the unit tests.

To run all the unit tests, type:

$ make test

Checklist

Checklist is a utility rule that runs the linter, type checker, and the test suite in one go:

$ make checklist

Clean

Use the clean rule to get rid of pyc files and __pycache__ directories:

$ make clean

Dependencies

praudio has the following dependencies:

  • librosa==0.8.1
  • pyyaml==5.4.1
  • types-PyYAML==5.4.6

librosa is extensively used to extract audio features in transform objects.

Current limitations

The praudio preprocessors can only operate on mono signals. This is a significant limitation if you are working on generative music. If you are using the library for audio / music analysis, this shouldn't be a problem.
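For analysis tasks, a common way to live with this limitation is to downmix multichannel audio to mono before preprocessing, typically by averaging the channels (librosa offers librosa.to_mono for this, and librosa.load downmixes by default). The sketch below shows the averaging itself on plain Python lists. downmix_to_mono is a hypothetical helper, not part of praudio, and whether a downmix is acceptable depends on your application.

```python
def downmix_to_mono(channels):
    """Average equal-length channels sample by sample (illustrative helper)."""
    if not channels:
        raise ValueError("expected at least one channel")
    length = len(channels[0])
    if any(len(channel) != length for channel in channels):
        raise ValueError("all channels must have the same length")
    # zip(*channels) yields one tuple per time step with a sample from each channel.
    return [sum(samples) / len(channels) for samples in zip(*channels)]


left = [0.5, -0.5, 1.0]
right = [0.5, 0.5, 0.0]
mono = downmix_to_mono([left, right])
print(mono)
```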

Future improvements

  • Add audio augmentation / padding / cropping transforms.
  • Enable preprocessing of signals with multiple channels.
  • Turn transform parameters into full-fledged objects (e.g., STFTParams).
  • Instead of using a dictionary for configurations, instantiate parameter objects with validation.
  • Implement different types of Savers / Loaders with factories to produce them.
Owner
Valerio Velardo
AI audio/music researcher. Love Python.