A deep learning model for style-specific music generation.

Last update: Nov 23, 2022

Overview

DeepJ: A model for style-specific music generation

Abstract

Recent advances in deep neural networks have enabled algorithms to compose music that is comparable to music composed by humans. However, few algorithms allow the user to generate music with tunable parameters. The ability to tune properties of generated music will yield more practical benefits for aiding artists, filmmakers, and composers in their creative tasks. In this paper, we introduce DeepJ - an end-to-end generative model that is capable of composing music conditioned on a specific mixture of composer styles. Our innovations include methods to learn musical style and music dynamics. We use our model to demonstrate a simple technique for controlling the style of generated music as a proof of concept. Evaluation of our model using human raters shows that we have improved over the Biaxial LSTM approach.

Requirements

Python 3.5

Clone Python MIDI (https://github.com/vishnubob/python-midi), change into the cloned directory, and install it:

git clone https://github.com/vishnubob/python-midi
cd python-midi
python3 setup.py install

Then install this project's remaining dependencies from the repository root:

pip install -r requirements.txt

The dataset is not provided in this repository. To train a custom model, you will need to include a MIDI dataset in the data/ folder.
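
Before starting a training run, it can help to confirm that the data/ folder actually contains MIDI files. The following is a minimal sketch, not part of the repository: the helper name find_midi_files, the accepted extensions (.mid and .midi), and the recursive search through subfolders are all assumptions, since the directory layout that train.py expects is not described here.

```python
from pathlib import Path

# Assumed file extensions for MIDI files; adjust if your dataset differs.
MIDI_SUFFIXES = {".mid", ".midi"}


def find_midi_files(data_dir="data"):
    """Return a sorted list of MIDI file paths found anywhere under data_dir.

    Hypothetical helper for sanity-checking a dataset; it does not parse
    the files, it only matches on the file extension (case-insensitive).
    """
    root = Path(data_dir)
    if not root.is_dir():
        # A missing folder is reported as "no files" rather than an error.
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in MIDI_SUFFIXES
    )


if __name__ == "__main__":
    files = find_midi_files()
    print("Found {} MIDI file(s) under data/".format(len(files)))
```

If the script reports zero files, check that the dataset was copied into data/ and that the files use one of the extensions listed above.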

Usage

To train a new model, run the following command:

python train.py

To generate music, run the following command:

python generate.py

Use the --help flag to list the available CLI arguments:

python generate.py --help

Owner

Henry Mao