https://github.com/salu133445/mmt
Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)
- Host: GitHub
- URL: https://github.com/salu133445/mmt
- Owner: salu133445
- License: MIT
- Created: 2022-04-16T05:24:58.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-03-14T23:51:59.000Z (over 1 year ago)
- Last Synced: 2024-10-03T19:25:29.020Z (about 1 year ago)
- Topics: machine-learning, music, music-generation, music-information-retrieval, python
- Language: Python
- Homepage: https://salu133445.github.io/mmt/
- Size: 410 MB
- Stars: 135
- Watchers: 5
- Forks: 23
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
README
# Multitrack Music Transformer
This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).
__Multitrack Music Transformer__
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
_IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)_, 2023
[[homepage](https://salu133445.github.io/mmt/)]
[[paper](https://arxiv.org/pdf/2207.06983.pdf)]
[[code](https://github.com/salu133445/mmt)]
[[reviews](https://salu133445.github.io/pdf/mmt-icassp2023-reviews.pdf)]

## Content
- [Prerequisites](#prerequisites)
- [Preprocessing](#preprocessing)
  - [Preprocessed Datasets](#preprocessed-datasets)
  - [Preprocessing Scripts](#preprocessing-scripts)
- [Training](#training)
  - [Pretrained Models](#pretrained-models)
  - [Training Scripts](#training-scripts)
- [Generation (Inference)](#generation-inference)
- [Evaluation](#evaluation)
- [Citation](#citation)

## Prerequisites
We recommend using Conda. You can create the environment with the following command.
```sh
conda env create -f environment.yml
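# Then activate the environment (a sketch: this assumes the environment
# defined in environment.yml is named "mmt"; adjust the name if it differs).
conda activate mmt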
```

## Preprocessing
### Preprocessed Datasets
The preprocessed datasets can be found [here](https://ucsdcloud-my.sharepoint.com/:f:/g/personal/h3dong_ucsd_edu/Er7nrsVc7NhNtYVSdWpHMQwBS5U1dXo0q0eQEi2LW-DVGw).
Extract the files to `data/{DATASET_KEY}/processed/json` and `data/{DATASET_KEY}/processed/notes`, where `DATASET_KEY` is `sod`, `lmd`, `lmd_full` or `snd`.
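For example, extracting the SOD files could look like the following sketch; the archive names are hypothetical placeholders for whatever the download provides, and only the target paths follow the layout described above.
```sh
# Hypothetical archive names; substitute the actual downloaded files.
mkdir -p data/sod/processed
unzip sod-json.zip -d data/sod/processed/json
unzip sod-notes.zip -d data/sod/processed/notes
```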
### Preprocessing Scripts
__You can skip this section if you download the preprocessed datasets.__
#### Step 1 -- Download the datasets
Please download the [Symbolic orchestral database (SOD)](https://qsdfo.github.io/LOP/database.html). You can do so from the command line as follows.
```sh
wget https://qsdfo.github.io/LOP/database/SOD.zip
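# Then extract the archive so the MIDI/MusicXML files sit under data/sod/SOD,
# the path used in Step 2 below (a sketch; adjust the target directory if the
# archive's internal layout differs).
unzip SOD.zip -d data/sod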
```

We also support the following two datasets:
- [Lakh MIDI Dataset (LMD)](https://colinraffel.com/projects/lmd/):
```sh
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
```
- [SymphonyNet Dataset](https://symphonynet.github.io/):
```sh
# Quote the URL so the shell does not treat `&` as a control operator
# (gdown is available via `pip install gdown`).
gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"
```

#### Step 2 -- Prepare the name list
Get a list of filenames for each dataset.
```sh
find data/sod/SOD -type f \( -name '*.mid' -o -name '*.xml' \) | cut -c 14- > data/sod/original-names.txt
```

> Note: Change the character offset in the `cut` command for other datasets so that it strips the dataset's root directory prefix.
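As a rough guide, the offset equals the length of the dataset's root directory prefix (including the trailing slash) plus one; the sketch below uses a hypothetical LMD path to illustrate.
```sh
# Hypothetical example: compute the `cut` offset for a dataset rooted at
# data/lmd/lmd_full/ (prefix length + 1).
root="data/lmd/lmd_full/"
echo $(( ${#root} + 1 ))  # pass this value to the cut command as `-c <offset>-`
```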
#### Step 3 -- Convert the data
Convert the MIDI and MusicXML files into MusPy files for processing.
```sh
python convert_sod.py
```

> Note: You may enable multiprocessing with the `-j` option, for example, `python convert_sod.py -j 10` for 10 parallel jobs.
#### Step 4 -- Extract the note list
Extract a list of notes from the MusPy JSON files.
```sh
python extract.py -d sod
```

#### Step 5 -- Split training/validation/test sets
Split the processed data into training, validation and test sets.
```sh
python split.py -d sod
```

## Training
### Pretrained Models
The pretrained models can be found [here](https://ucsdcloud-my.sharepoint.com/:f:/g/personal/h3dong_ucsd_edu/EqYq6KHrcltHvgJTmw7Nl6MBtv4szg4RUZUPXc4i_RgEkw).
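To generate with a pretrained model instead of training one yourself, one option is to unpack the download into an experiment directory and point the scripts below at it via the `-o` option (a sketch; the archive name is a hypothetical placeholder, and the files should keep whatever layout the download uses):
```sh
# Hypothetical archive name; place the pretrained files under the experiment
# directory that will later be passed to generate.py/evaluate.py via `-o`.
mkdir -p exp/sod/ape
unzip pretrained-sod-ape.zip -d exp/sod/ape
```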
### Training Scripts
Train a Multitrack Music Transformer model.
- Absolute positional embedding (APE):
`python mmt/train.py -d sod -o exp/sod/ape -g 0`
- Relative positional embedding (RPE):
`python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0`
- No positional embedding (NPE):
`python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0`
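Putting the pieces together, an end-to-end run for the SOD dataset with the APE model looks like the following sketch, which simply chains the training, generation, and evaluation commands shown in this README:
```sh
# Train, generate from, and evaluate an APE model on SOD using GPU 0.
python mmt/train.py -d sod -o exp/sod/ape -g 0
python mmt/generate.py -d sod -o exp/sod/ape -g 0
python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
```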
## Generation (Inference)
Generate new samples using a trained model.
```sh
python mmt/generate.py -d sod -o exp/sod/ape -g 0
```

## Evaluation
Evaluate the trained model using objective evaluation metrics.
```sh
python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
```

## Acknowledgment
The code is based largely on the [x-transformers](https://github.com/lucidrains/x-transformers) library developed by [lucidrains](https://github.com/lucidrains).
## Citation
Please cite the following paper if you use the code provided in this repository.
> Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," _IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)_, 2023.
```bibtex
@inproceedings{dong2023mmt,
author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
title = {Multitrack Music Transformer},
booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year = 2023,
}
```