MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Demos | Blog Post | Colab Notebook | Paper | Hugging Face Spaces

MIDI-DDSP is a hierarchical audio generation model for synthesizing MIDI, extended from DDSP.

Install MIDI-DDSP

You can install MIDI-DDSP via pip, which also gives you the command-line MIDI synthesis tool for synthesizing your own MIDI files.

To install MIDI-DDSP via pip, simply run:

pip install midi-ddsp
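
To confirm that the install succeeded, one option is to query the installed version from Python. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of MIDI-DDSP, and "midi-ddsp" is simply the distribution name from the pip command above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "midi-ddsp" is the pip distribution name used in the install command.
    version = installed_version("midi-ddsp")
    print(version if version else "midi-ddsp is not installed in this environment")
```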

Train MIDI-DDSP

To train MIDI-DDSP, please first install midi-ddsp and clone the MIDI-DDSP repository:

git clone https://github.com/magenta/midi-ddsp.git

For the dataset, please download the tfrecord files for the URMP dataset from here into the data folder of your cloned repository using the following commands:

cd midi-ddsp # enter the project directory
mkdir ./data # create a data folder
gsutil cp gs://magentadata/datasets/urmp/urmp_20210324/* ./data # download tfrecords to directory

Please check here for how to install and use gsutil.

Finally, you can run the script train_midi_ddsp.sh to train the exact same model we used in the paper:

sh ./train_midi_ddsp.sh

The current codebase does not support training with an arbitrary dataset, but we hope to add that in the near future.

Side note:

If you download the dataset to a different location, please change the data_dir parameter in train_midi_ddsp.sh.

Training MIDI-DDSP takes approximately 18 hours on a single RTX 8000. The training code does not currently support multi-GPU training. We recommend a GPU with more than 24 GB of memory when training the Synthesis Generator with a batch size of 16. For a GPU with less memory, please use a smaller batch size by changing the batch size in train_midi_ddsp.sh.

Try to play with MIDI-DDSP yourself!

Please try out MIDI-DDSP in Colab notebooks!

In this notebook, you will use MIDI-DDSP to synthesize a monophonic MIDI file, adjust note expressions, create pitch bends by adjusting synthesis parameters, and synthesize a quartet from Bach chorales.

We have trained MIDI-DDSP on the URMP dataset, which supports synthesizing 13 instruments: violin, viola, cello, double bass, flute, oboe, clarinet, saxophone, bassoon, trumpet, horn, trombone, tuba. You can find how to download and use our pre-trained model below:

Command-line MIDI synthesis

One can use MIDI-DDSP as a command-line MIDI synthesizer, just like FluidSynth.

To synthesize a MIDI file from the command line, please first download the model weights by running:

midi_ddsp_download_model_weights

To synthesize a MIDI file, simply run the following command:

midi_ddsp_synthesize --midi_path <path-to-midi>

As a starting point, you can try synthesizing the example MIDI file in this repository:

midi_ddsp_synthesize --midi_path ./midi_example/ode_to_joy.mid

The command line can also synthesize a folder of MIDI files. For more advanced use (synthesizing a folder, using FluidSynth for unsupported instruments, etc.), please see synthesize_midi.py --help.
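
As an alternative to the built-in folder option, whose flag is documented in the --help output rather than here, a small wrapper can call the documented single-file command once per MIDI file. The following is a sketch under stated assumptions: the helper names build_synthesis_commands and synthesize_folder are hypothetical, only the --midi_path flag shown above is used, and the midi_ddsp_synthesize command is assumed to be on the PATH after installation.

```python
import subprocess
from pathlib import Path


def build_synthesis_commands(midi_dir):
    """Return one midi_ddsp_synthesize command per .mid/.midi file in midi_dir."""
    midi_dir = Path(midi_dir)
    midi_files = sorted(
        p for p in midi_dir.iterdir()
        if p.is_file() and p.suffix.lower() in {".mid", ".midi"}
    )
    # Only the --midi_path flag from the README is used here.
    return [["midi_ddsp_synthesize", "--midi_path", str(p)] for p in midi_files]


def synthesize_folder(midi_dir):
    """Run the commands one at a time; raises CalledProcessError on the first failure."""
    for command in build_synthesis_commands(midi_dir):
        subprocess.run(command, check=True)
```

Calling synthesize_folder('./midi_example') would then synthesize each MIDI file in that folder in alphabetical order. Each call starts a new process, so the model is reloaded for every file; for many files the built-in folder option is likely faster.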

If you have trouble downloading the model weights, please download them manually from here, and specify synthesis_generator_weight_path and expression_generator_weight_path yourself when using the command line. You can also point these options at other model weights if you want to use your own trained model.

Python Usage

After installing midi-ddsp, you can import midi_ddsp in Python and synthesize MIDI in your own code.

Minimal Example

Here is a simple example that uses MIDI-DDSP to synthesize a MIDI file:

from midi_ddsp import synthesize_midi, load_pretrained_model

midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize MIDI
output = synthesize_midi(synthesis_generator, expression_generator, midi_file)
# The synthesized audio
synthesized_audio = output['mix_audio']
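
To listen to the result outside of a notebook, the audio can be written to a WAV file. The sketch below uses only the standard library and is not part of MIDI-DDSP: the helper name save_wav is hypothetical, the 16 kHz default sample rate is an assumption about the pre-trained model, and the input is assumed to be a flat sequence of floats in [-1, 1] (a NumPy array or tensor would first need to be flattened to one dimension).

```python
import struct
import wave


def save_wav(samples, path, sample_rate=16000):
    """Write a flat sequence of floats in [-1, 1] as 16-bit mono PCM.

    sample_rate defaults to 16000 Hz, which is an assumption, not a value
    taken from the README.
    """
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range before converting to 16-bit integers.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
```

With the minimal example above, a call might look like save_wav(np.asarray(synthesized_audio).reshape(-1), 'ode_to_joy.wav'), assuming numpy is imported as np and the output holds a single clip.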

Advanced Usage

Here is an advanced example that synthesizes ode_to_joy.mid, changes the note expression controls, and adjusts the synthesis parameters:

import numpy as np
import tensorflow as tf
from midi_ddsp.utils.midi_synthesis_utils import synthesize_mono_midi, conditioning_df_to_audio
from midi_ddsp.utils.inference_utils import get_process_group
from midi_ddsp.midi_ddsp_synthesize import load_pretrained_model
from midi_ddsp.data_handling.instrument_name_utils import INST_NAME_TO_ID_DICT

# -----MIDI Synthesis-----
midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize with violin:
instrument_name = 'violin'
instrument_id = INST_NAME_TO_ID_DICT[instrument_name]
# Run model prediction
midi_audio, midi_control_params, midi_synth_params, conditioning_df = synthesize_mono_midi(synthesis_generator,
                                                                                           expression_generator,
                                                                                           midi_file, instrument_id,
                                                                                           output_dir=None)

synthesized_audio = midi_audio  # The synthesized audio

# -----Adjust note expression controls and re-synthesize-----

# Give every note a weak vibrato:
conditioning_df_changed = conditioning_df.copy()
# Original per-note vibrato values, kept for reference
note_vibrato = conditioning_df_changed['vibrato_extend'].values
conditioning_df_changed['vibrato_extend'] = np.ones_like(conditioning_df['vibrato_extend'].values) * 0.1
# Re-synthesize
midi_audio_changed, midi_control_params_changed, midi_synth_params_changed = conditioning_df_to_audio(
  synthesis_generator, conditioning_df_changed, tf.constant([instrument_id]))

synthesized_audio_changed = midi_audio_changed  # The synthesized audio

# There are 6 note expression controls in conditioning_df that you could change:
# 'amplitude_mean', 'amplitude_std', 'vibrato_extend', 'brightness', 'attack_level', 'amplitudes_max_pos'.
# Please refer to https://colab.research.google.com/github/magenta/midi-ddsp/blob/main/midi_ddsp/colab/MIDI_DDSP_Demo.ipynb#scrollTo=XfPPrdPu5sSy for the effect of each control. 

# -----Adjust synthesis parameters and re-synthesize-----

# The original synthesis parameters:
f0_ori = midi_synth_params['f0_hz']
amps_ori = midi_synth_params['amplitudes']
noise_ori = midi_synth_params['noise_magnitudes']
hd_ori = midi_synth_params['harmonic_distribution']

# TODO: make your change of the synthesis parameters here:
f0_changed = f0_ori
amps_changed = amps_ori
noise_changed = noise_ori
hd_changed = hd_ori

# Resynthesize the audio using DDSP
processor_group = get_process_group(midi_synth_params['amplitudes'].shape[1], use_angular_cumsum=True)
midi_audio_changed = processor_group({'amplitudes': amps_changed,
                                      'harmonic_distribution': hd_changed,
                                      'noise_magnitudes': noise_changed,
                                      'f0_hz': f0_changed, },
                                     verbose=False)
midi_audio_changed = synthesis_generator.reverb_module(midi_audio_changed, reverb_number=instrument_id, training=False)

synthesized_audio_changed = midi_audio_changed  # The synthesized audio
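
The TODO step in the example above leaves the synthesis parameters unchanged. As one illustration of a pitch bend, the fundamental frequency can be multiplied by a per-frame ratio, since shifting a pitch by s semitones multiplies its frequency by 2^(s/12). The sketch below is not from the MIDI-DDSP codebase: the helper names semitones_to_ratio and linear_bend_ratios are hypothetical, and the shape of f0_hz mentioned in the usage note is an assumption.

```python
def semitones_to_ratio(semitones):
    """Frequency ratio for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def linear_bend_ratios(n_frames, max_semitones):
    """Per-frame frequency ratios for a bend rising linearly from 0 to max_semitones."""
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    if n_frames == 1:
        # A single frame cannot ramp, so apply the full bend.
        return [semitones_to_ratio(max_semitones)]
    step = max_semitones / (n_frames - 1)
    return [semitones_to_ratio(i * step) for i in range(n_frames)]
```

Assuming f0_ori has shape [1, n_frames, 1], a half-semitone upward bend over the whole clip could then replace the placeholder line with f0_changed = f0_ori * np.asarray(linear_bend_ratios(f0_ori.shape[1], 0.5), dtype=np.float32).reshape(1, -1, 1) before the resynthesis step.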
Comments
  • ImportError and AttributeError

    Hi! Very interesting work! I’m trying to run MIDI_DDSP_Demo.ipynb, and I encountered some errors.

    1. ImportError: cannot import name 'LD_RANGE' occurred in from ddsp.spectral_ops import F0_RANGE, LD_RANGE (see here). -> According to DDSP, I think 'DB_RANGE' is correct, not 'LD_RANGE'.

    2. AttributeError: module 'ddsp.spectral_ops' has no attribute 'amplitude_to_db' occurred where ddsp.spectral_ops.amplitude_to_db is used (see here). -> According to DDSP, I suppose it is 'ddsp.core', not 'ddsp.spectral_ops'.

    opened by MasayaKawamura 3
  • Question on the installation

    Hi,

    When I opened a new Colab notebook and tried to pip install the midi-ddsp package, it took over 40 minutes and the installation could not be completed. I didn't experience this until this week.

    I've tried pip install midi-ddsp and pip install git+https://github.com/magenta/midi-ddsp and both gave me the same results.

    It seemed like pip would spend a lot of time trying to find which version of etils was compatible.

    opened by tiianhk 2
  • Error when using "pip install midi-ddsp"

    Hello everyone,

    I'm trying to use midi-ddsp to synthesize a few .midi files. In order to achieve it, I create a virtual environment with:

    python3 -m venv .venv
    source .venv/bin/activate

    note: python version == 3.10.6

    After creating it I run: "pip install midi-ddsp" . Getting this error message:

    Collecting ddsp
      Using cached ddsp-1.9.0-py2.py3-none-any.whl (200 kB)
      Using cached ddsp-1.7.1-py2.py3-none-any.whl (199 kB)
      Using cached ddsp-1.7.0-py2.py3-none-any.whl (197 kB)
      Using cached ddsp-1.6.5-py2.py3-none-any.whl (194 kB)
      Using cached ddsp-1.6.3-py2.py3-none-any.whl (194 kB)
      Using cached ddsp-1.6.2-py2.py3-none-any.whl (194 kB)
      Using cached ddsp-1.6.0-py2.py3-none-any.whl (194 kB)
      Using cached ddsp-1.4.0-py2.py3-none-any.whl (192 kB)
      Using cached ddsp-1.3.1-py2.py3-none-any.whl (192 kB)
      Using cached ddsp-1.3.0-py2.py3-none-any.whl (183 kB)
      Using cached ddsp-1.2.0-py2.py3-none-any.whl (179 kB)
      Using cached ddsp-1.1.0-py2.py3-none-any.whl (175 kB)
      Using cached ddsp-1.0.1-py2.py3-none-any.whl (170 kB)
      Using cached ddsp-1.0.0-py2.py3-none-any.whl (168 kB)
      Using cached ddsp-0.14.0-py2.py3-none-any.whl (143 kB)
      Using cached ddsp-0.13.1-py2.py3-none-any.whl (129 kB)
      Using cached ddsp-0.13.0-py2.py3-none-any.whl (129 kB)
      Using cached ddsp-0.12.0-py2.py3-none-any.whl (127 kB)
      Using cached ddsp-0.10.0-py2.py3-none-any.whl (109 kB)
      Using cached ddsp-0.9.0-py2.py3-none-any.whl (109 kB)
      Using cached ddsp-0.8.0-py2.py3-none-any.whl (108 kB)
      Using cached ddsp-0.7.0-py2.py3-none-any.whl (107 kB)
      Using cached ddsp-0.5.1-py2.py3-none-any.whl (101 kB)
      Using cached ddsp-0.5.0-py2.py3-none-any.whl (101 kB)
      Using cached ddsp-0.4.0-py2.py3-none-any.whl (97 kB)
      Using cached ddsp-0.2.4-py2.py3-none-any.whl (89 kB)
      Using cached ddsp-0.2.3-py2.py3-none-any.whl (89 kB)
      Using cached ddsp-0.2.2-py2.py3-none-any.whl (89 kB)
      Using cached ddsp-0.2.0-py2.py3-none-any.whl (88 kB)
      Using cached ddsp-0.1.0-py3-none-any.whl (88 kB)
      Using cached ddsp-0.0.10-py3-none-any.whl (88 kB)
      Using cached ddsp-0.0.9-py3-none-any.whl (86 kB)
      Using cached ddsp-0.0.8-py3-none-any.whl (86 kB)
      Using cached ddsp-0.0.7-py3-none-any.whl (85 kB)
      Using cached ddsp-0.0.6-py2.py3-none-any.whl (91 kB)
      Using cached ddsp-0.0.5-py2.py3-none-any.whl (91 kB)
      Using cached ddsp-0.0.4-py2.py3-none-any.whl (83 kB)
      Using cached ddsp-0.0.3-py2.py3-none-any.whl (81 kB)
      Using cached ddsp-0.0.1-py2.py3-none-any.whl (75 kB)
      Using cached ddsp-0.0.0-py2.py3-none-any.whl (75 kB)
    INFO: pip is looking at multiple versions of midi-ddsp to determine which version is compatible with other requirements. This could take a while.
    Collecting midi-ddsp
      Using cached midi_ddsp-0.1.3-py3-none-any.whl (56 kB)
      Using cached midi_ddsp-0.1.1-py3-none-any.whl (56 kB)
      Using cached midi_ddsp-0.1.0-py3-none-any.whl (53 kB)
    ERROR: Cannot install midi-ddsp because these package versions have conflicting dependencies.

    The conflict is caused by:
        ddsp 3.4.4 depends on tensorflow
        ddsp 3.4.3 depends on tensorflow
        ddsp 3.4.1 depends on tensorflow
        ddsp 3.4.0 depends on tensorflow
        ddsp 3.3.6 depends on tensorflow
        ddsp 3.3.4 depends on tensorflow
        ddsp 3.3.2 depends on tensorflow
        ddsp 3.3.0 depends on tensorflow
        ddsp 3.2.1 depends on tensorflow
        ddsp 3.2.0 depends on tensorflow
        ddsp 3.1.0 depends on tensorflow
        ddsp 1.9.0 depends on tensorflow
        ddsp 1.7.1 depends on tensorflow
        ddsp 1.7.0 depends on tensorflow
        ddsp 1.6.5 depends on tensorflow
        ddsp 1.6.3 depends on tensorflow
        ddsp 1.6.2 depends on tensorflow
        ddsp 1.6.0 depends on tensorflow
        ddsp 1.4.0 depends on tensorflow
        ddsp 1.3.1 depends on tensorflow
        ddsp 1.3.0 depends on tensorflow
        ddsp 1.2.0 depends on tensorflow
        ddsp 1.1.0 depends on tensorflow
        ddsp 1.0.1 depends on tensorflow
        ddsp 1.0.0 depends on tensorflow
        ddsp 0.14.0 depends on tensorflow
        ddsp 0.13.1 depends on tensorflow
        ddsp 0.13.0 depends on tensorflow
        ddsp 0.12.0 depends on tensorflow
        ddsp 0.10.0 depends on tensorflow
        ddsp 0.9.0 depends on tensorflow
        ddsp 0.8.0 depends on tensorflow
        ddsp 0.7.0 depends on tensorflow
        ddsp 0.5.1 depends on tensorflow
        ddsp 0.5.0 depends on tensorflow
        ddsp 0.4.0 depends on tensorflow
        ddsp 0.2.4 depends on tensorflow
        ddsp 0.2.3 depends on tensorflow
        ddsp 0.2.2 depends on tensorflow
        ddsp 0.2.0 depends on tensorflow
        ddsp 0.1.0 depends on tensorflow
        ddsp 0.0.10 depends on tensorflow
        ddsp 0.0.9 depends on tensorflow
        ddsp 0.0.8 depends on tensorflow
        ddsp 0.0.7 depends on tensorflow
        ddsp 0.0.6 depends on tensorflow
        ddsp 0.0.5 depends on tensorflow
        ddsp 0.0.4 depends on tensorflow
        ddsp 0.0.3 depends on tensorflow
        ddsp 0.0.1 depends on tensorflow
        ddsp 0.0.0 depends on tensorflow>=2.1.0

    To fix this you could try to:

    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict

    ERROR: Resolution Impossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

    I am currently working on an M1 Mac, so I would be grateful for any help reaching a solution to this problem.

    Thank you in advance, Juan Carlos

    opened by JuanCarlosMartinezSevilla 2
  • Why is vibrato_rate not used?

    In my opinion, vibrato_rate (peak frequency) is more plausible than vibrato_extend (peak amplitude) to represent pitch pulsating. Why is vibrato_rate not used?

    opened by bfs18 2
  • Docker image with all the necessary packages

    Hi again!

    I'm still trying to execute your code with the "pip install midi-ddsp".

    I'm using docker from a tensorflow/tensorflow:latest image and running the pip command. I get this error:

    "E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered"

    I'm asking to find out whether you work with Docker and, if so, whether I could get access to the image you use, so that all versions are the same and this type of error is avoided.

    Thank you, Best regards, Juan Carlos

    opened by JuanCarlosMartinezSevilla 0
  • How to distinguish whether it is cresc or decresc in fluctuation?

    Hi! I am interested in your project. Currently, I am doing similar work, but I have some questions. (1) I know that Fluctuation stands for how much a note crescendos or decrescendos, and Peak means the position with the maximum energy. However, I am wondering how to know whether a note is a crescendo or a decrescendo. Let's assume our peak position is 0.4 and the fluctuation is 0.3. Is that a decrease of 0.3 or an increase of 0.3? How can I distinguish them? (2) I know that all the expressive control values are normalized between 0 and 1, but what is the unit of measure for these expressive controls?

    Thanks for answering

    opened by pillow8781 2
  • What is ``input`` in the def call()?

    Hi, I am looking inside the code. I've seen a lot of methods about def call(self, inputs) in your code, especially looking at this one.

      def call(self, inputs):
        synth_params = self.get_synth_params(inputs)
    

    However, I couldn't find out how inputs is computed, although I have found some clues. In that code, inputs corresponds to the data in get_fake_data_synthesis_generator. What are the data and units that you pass to get_fake_data_synthesis_generator? Frames? Amplitude? Or something else?

    Thanks!

    opened by Megan8821 4
  • How to solve the error when installing midi-ddsp?

    Hi, I am new to training models, and I ran into some problems here. Does anyone know how to solve the errors "error: subprocess-exited-with-error" and "error: metadata-generation-failed" in the terminal?

    opened by Megan8821 4
  • tfrecord features

    Hello, I'm very interested in your amazing work. Took a deep look at the urmp tfrecord datasets, I found something a bit confusing for me. Could you be so kind to help?

    1. I checked some urmp_tfrecords data, I found that some of them contain the following features: {"audio", "f0_confidence", "f0_hz", "f0_time", "id", "instrument_id", "loudness_db", "note_active_frame_indices", "note_active_velocities", "note_offsets", "note_onsets", "orig_f0_hz", "orig_f0_time", "power_db", "recording_id", "sequence"}. However, some of them don't include {"orig_f0_hz", "orig_f0_time"} in their tfrecord data. Why is this so and does such an inconsistency influence the model training?
    2. I want to include piano music when I train my own model. To this end, I think I need to generate tfrecords that have the same content as the urmp ones you used in your model. I plan to use maestro dataset. Could you be so kind to indicate if there's a tfrecord data generation code that we can take as a reference? Like the one you used to generate tfrecords for the midi-ddsp model?
    3. What is the difference between "batched" and "unbatched" dataset?

    Thank you very much for your help in advance.

    opened by gladys0313 1
  • Can i test with custom data?

    Hi, I am interested in this exciting project and I am trying to test it with our custom dataset by reproducing the format of the original data. But I have run into some difficulties and have the questions below.

    1. Is there no way to use custom datasets at all?
    2. Is there any code to calculate the dataset elements below?
    • I want to know how to get "note_active_velocities", "note_active_frame_indices", "power_db", "note_onsets", "note_offsets", but there is no code for this in the repository.

    Thank you for reading!

    opened by 589hero 6
  • Question on Figure

    image

    quick question on this figure in the blog post: i know coconet is its own model that will generate subsequent melodies given the input midi file. however, should i decide to train midi ddsp, will the training of coconet also be a part of this? or should i expect a monophonic midi melody as input and the generated audio as output.

    thanks for all the help and this awesome project

    opened by theadamsabra 7
Releases(v0.2.5)
Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.