TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music

Overview

TONet

Introduction

The official implementation of "TONet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music", in ICASSP 2022

We propose TONet, a plug-and-play model that improves both tone and octave perceptions by leveraging a novel input representation and a novel network architecture. Any CFP-input-based Model can be settled in TONet and lead to possible better performance.

TONet Architecture

Main Results on Extraction Performance

Experiments are done to verify the capability of TONet with various baseline backbone models. Our results show that tone-octave fusion with Tone-CFP can significantly improve the singing voice extraction performance across various datasets -- with substantial gains in octave and tone accuracy.

Results

Getting Started

Download Datasets

After downloading the data, use the txt files in the data folder, and process the CFP feature by feature_extraction.py.

Overwrite the Configuration

The config.py contains all configurations you need to change and set.

Train and Evaluation

python main.py train

python main.py test

Produce the Estimation Digram

Uncomment the write prediction in tonet.py

Estimation

Model Checkpoints

We provide the best TO-FTANet checkpoints in this link. More checkpoints will be uploaded.

Citing

@inproceedings{tonet-ke2022,
  author = {Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov},
  title = {TONet: Tone-Octave Network for Singing Melody Extraction  from Polyphonic Music},
  booktitle = {{ICASSP} 2022}
}
Owner
Knut(Ke) Chen
ORZ: { godfather: sweetdum, ufo: zgg, dragon sister: lzl, morning king: corner cafƩ }
Knut(Ke) Chen
spafe: Simplified Python Audio-Features Extraction

spafe aims to simplify features extractions from mono audio files. The library can extract of the following features: BFCC, LFCC, LPC, LPCC, MFCC, IMFCC, MSRCC, NGCC, PNCC, PSRCC, PLP, RPLP, Frequenc

Ayoub Malek 310 Jan 01, 2023
Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums)

LAKH MuseNet MIDI Dataset Full LAKH MIDI dataset converted to MuseNet MIDI output format (9 instruments + drums) Bonus: Choir on Channel 10 Please CC

Alex 6 Nov 20, 2022
A fast MDCT implementation using SciPy and FFTs

MDCT A fast MDCT implementation using SciPy and FFTs Installation As usual pip install mdct Dependencies NumPy SciPy STFT Usage import mdct spectrum

Nils Werner 43 Sep 02, 2022
Convert complex chord names to midi notes

ezchord Simple python script that can convert complex chord names to midi notes Prerequisites pip install midiutil Usage ./ezchord.py Dmin7 G7 C timi

Alex Zhang 2 Dec 20, 2022
Marsyas - Music Analysis, Retrieval and Synthesis for Audio Signals

Welcome to MARSYAS. MARSYAS is a software framework for rapid prototyping of audio applications, with flexibility and extensibility as primary concer

Marsyas Developers Group 364 Oct 31, 2022
Python CD-DA ripper preferring accuracy over speed

Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star

671 Jan 04, 2023
GNU Radio – the Free and Open Software Radio Ecosystem

GNU Radio is a free & open-source software development toolkit that provides signal processing blocks to implement software radios. It can be used wit

GNU Radio 4.1k Jan 06, 2023
Pyrogram bot to automate streaming music in voice chats

Pyrogram bot to automate streaming music in voice chats Help If you face an error, want to discuss this project or get support for it, join it's group

Roj 124 Oct 21, 2022
Official implementation of A cappella: Audio-visual Singing VoiceSeparation, from BMVC21

Y-Net Official implementation of A cappella: Audio-visual Singing VoiceSeparation, British Machine Vision Conference 2021 Project page: ipcv.github.io

Juan F. Montesinos 12 Oct 22, 2022
Bot Music Pintar. Created by Rio

šŸŽ¶ Rio Music šŸŽ¶ Kalo Fork Star Ya Bang Hehehe Requirements šŸ“ FFmpeg NodeJS nodesource.com Python 3.8+ or 3.7 PyTgCalls Generate String Using Replit ⤵

RioProjectX 7 Jun 15, 2022
Audio fingerprinting and recognition in Python

dejavu Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here: How it works Dejavu can memorize audio by liste

Will Drevo 6k Jan 06, 2023
Implementation of "Slow-Fast Auditory Streams for Audio Recognition, ICASSP, 2021" in PyTorch

Auditory Slow-Fast This repository implements the model proposed in the paper: Evangelos Kazakos, Arsha Nagrani, Andrew Zisserman, Dima Damen, Slow-Fa

Evangelos Kazakos 57 Dec 07, 2022
digital audio workstation, instrument and effect plugins, wave editor

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 05, 2023
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Basic Pitch is a Python library for Automatic Music Transcription (AMT), using lightweight neural network developed by Spotify's Audio Intelligence La

Spotify 1.4k Jan 01, 2023
MelGAN test on audio decoding

Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com

Jurio 1 Apr 29, 2022
IDing the songs played on the do you radio show

IDing the songs played on the do you radio show

Rasmus Jones 36 Nov 15, 2022
Gateware for the Terasic/Arrow DECA board, to become a USB2 high speed audio interface

DECA USB Audio Interface DECA based USB 2.0 High Speed audio interface Status / current limitations enumerates as class compliant audio device on Linu

Hans Baier 16 Mar 21, 2022
A2DP agent for promiscuous/permissive audio sinc.

Promiscuous Bluetooth audio sinc A2DP agent for promiscuous/permissive audio sinc for Linux. Once installed, a Bluetooth client, such as a smart phone

Jasper Aorangi 4 May 27, 2022
šŸŽµ A music bot for discord servers!

music bot A music bot for Discord Servers Features Play songs in your discord server Get the lyrics without going on a web explorer Commands Command P

1 Jul 25, 2022
Gammatone-based spectrograms, using gammatone filterbanks or Fourier transform weightings.

Gammatone Filterbank Toolkit Utilities for analysing sound using perceptual models of human hearing. Jason Heeris, 2013 Summary This is a port of Malc

Jason Heeris 188 Dec 14, 2022