A design of MIDI language for music generation task, specifically for Natural Language Processing (NLP) models.

Overview

MIDI Language

Introduction

Reference

Paper: Pop Music Transformer: Beat-based Modeling and Generation of Expressive Pop Piano Compositions: code

This is a modified version with an extension of multi-instrumental support.

Function

Convert Midi into event sequence, and represented by mapped integer array.

This could send to NLP models for AI auto music composition.

Due to this project considers more about music structures as well as its chord and melody on higher level, including note, drum, tempo, musical instrument (program in midi) and its expressions (tempo and velocity), rather than digging into too much details like sound source & direction, instrumental performing techniques (such as, bend sound, piano sustain pedal, violin overtones), the language of MIDI is design this way (see chapter Details below).

Usage

See language.py, it contains procedures:

  • load w2i (word to integer) and i2w (integer to word), for not calculating it every time;
  • encode midi to iteger array, each object handle one mid file;
  • decode integer array to midi, each object handle many results and export to mid files;

The code language.py has arguments:

  • input: input file of audio file to encode/decode;
  • output: output file of audio file to encode;
  • train: if have, it will switch to training mode with variations (data augmentation);

MidiEncoder data augmentation:

  • pitch_variation_range: a random pitch shift within a range for whole midi;
  • velocity_scale_variation_range: a random note/drum velocity scale for whole midi;
  • velocity_noise_scale_variation_range: a random note/drum velocity scale for each element within midi;
  • tempo_scale_variation_range: a random tempo change for whole midi;

MidiDecoder needs numerator and denominator time signatures for reconstructing midi files.

Details

Event Structure

Required:

  • Bar
  • Position (0~split-1)

Optional:

  • note:
    • Note
    • Program (0~127)
    • Pitch (0~127)
    • Velocity (0~127)
    • Duration (0~split*bar_scale-1)
  • drum:
    • Drum
    • Program (0~127)
    • Pitch (0~127)
    • Velocity (0~127)
    • Duration (0~split*bar_scale-1)
  • chord:
    • Chord (chroma_name:chord_name)
  • tempo:
    • Tempo_Class (T0~Ti)
    • Tempo_Value (0~59)
Owner
Robert Bogan Kang
hello rbk!
Robert Bogan Kang
Materials (slides, code, assignments) for the NYU class I teach on NLP and ML Systems (Master of Engineering).

FREE_7773 Repo containing material for the NYU class (Master of Engineering) I teach on NLP, ML Sys etc. For context on what the class is trying to ac

Jacopo Tagliabue 90 Dec 19, 2022
CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation

CPT This repository contains code and checkpoints for CPT. CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Gener

fastNLP 342 Jan 05, 2023
NLTK Source

Natural Language Toolkit (NLTK) NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting

Natural Language Toolkit 11.4k Jan 04, 2023
Reading Wikipedia to Answer Open-Domain Questions

DrQA This is a PyTorch implementation of the DrQA system described in the ACL 2017 paper Reading Wikipedia to Answer Open-Domain Questions. Quick Link

Facebook Research 4.3k Jan 01, 2023
VD-BERT: A Unified Vision and Dialog Transformer with BERT

VD-BERT: A Unified Vision and Dialog Transformer with BERT PyTorch Code for the following paper at EMNLP2020: Title: VD-BERT: A Unified Vision and Dia

Salesforce 44 Nov 01, 2022
txtai: Build AI-powered semantic search applications in Go

txtai: Build AI-powered semantic search applications in Go txtai executes machine-learning workflows to transform data and build AI-powered semantic s

NeuML 49 Dec 06, 2022
Interactive Jupyter Notebook Environment for using the GPT-3 Instruct API

gpt3-instruct-sandbox Interactive Jupyter Notebook Environment for using the GPT-3 Instruct API Description This project updates an existing GPT-3 san

312 Jan 03, 2023
A collection of GNN-based fake news detection models.

This repo includes the Pytorch-Geometric implementation of a series of Graph Neural Network (GNN) based fake news detection models. All GNN models are implemented and evaluated under the User Prefere

SafeGraph 251 Jan 01, 2023
The ibet-Prime security token management system for ibet network.

ibet-Prime The ibet-Prime security token management system for ibet network. Features ibet-Prime is an API service that enables the issuance and manag

BOOSTRY 8 Dec 22, 2022
A Transformer Implementation that is easy to understand and customizable.

Simple Transformer I've written a series of articles on the transformer architecture and language models on Medium. This repository contains an implem

Naoki Shibuya 4 Jan 20, 2022
Open-World Entity Segmentation

Open-World Entity Segmentation Project Website Lu Qi*, Jason Kuen*, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia This projec

DV Lab 408 Dec 29, 2022
NLP command-line assistant powered by OpenAI

NLP command-line assistant powered by OpenAI

Axel 16 Dec 09, 2022
This repository contains data used in the NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

Proteno This is the data release associated with the corresponding NAACL 2021 Paper - Proteno: Text Normalization with Limited Data for Fast Deploymen

37 Dec 04, 2022
Pytorch-version BERT-flow: One can apply BERT-flow to any PLM within Pytorch framework.

Pytorch-version BERT-flow: One can apply BERT-flow to any PLM within Pytorch framework.

Ubiquitous Knowledge Processing Lab 59 Dec 01, 2022
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

StyleSpeech - PyTorch Implementation PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation. Status (2021.06.09

Keon Lee 142 Jan 06, 2023
Minimal GUI for accessing the Watson Text to Speech service.

Description Minimal graphical application for accessing the Watson Text to Speech service. Requirements Python 3 plus all dependencies listed in requi

Moritz Maxeiner 1 Oct 22, 2021
Chinese real time voice cloning (VC) and Chinese text to speech (TTS).

Chinese real time voice cloning (VC) and Chinese text to speech (TTS). 好用的中文语音克隆兼中文语音合成系统,包含语音编码器、语音合成器、声码器和可视化模块。

Kuang Dada 6 Nov 08, 2022
Just a basic Telegram AI chat bot written in Python using Pyrogram.

Nikko ChatBot Just a basic Telegram AI chat bot written in Python using Pyrogram. Requirements Python 3.7 or higher. A bot token. Installation $ https

ʀᴇxɪɴᴀᴢᴏʀ 2 Oct 21, 2022
A multi-voice TTS system trained with an emphasis on quality

TorToiSe Tortoise is a text-to-speech program built with the following priorities: Strong multi-voice capabilities. Highly realistic prosody and inton

James Betker 2.1k Jan 01, 2023