https://github.com/salu133445/mmt
Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)
- Host: GitHub
- URL: https://github.com/salu133445/mmt
- Owner: salu133445
- License: MIT
- Created: 2022-04-16T05:24:58.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-03-14T23:51:59.000Z (over 1 year ago)
- Last Synced: 2024-10-03T19:25:29.020Z (about 1 year ago)
- Topics: machine-learning, music, music-generation, music-information-retrieval, python
- Language: Python
- Homepage: https://salu133445.github.io/mmt/
- Size: 410 MB
- Stars: 135
- Watchers: 5
- Forks: 23
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
README
# Multitrack Music Transformer
This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).
__Multitrack Music Transformer__
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
_IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)_, 2023
[[homepage](https://salu133445.github.io/mmt/)]
[[paper](https://arxiv.org/pdf/2207.06983.pdf)]
[[code](https://github.com/salu133445/mmt)]
[[reviews](https://salu133445.github.io/pdf/mmt-icassp2023-reviews.pdf)]

## Content
- [Prerequisites](#prerequisites)
- [Preprocessing](#preprocessing)
  - [Preprocessed Datasets](#preprocessed-datasets)
  - [Preprocessing Scripts](#preprocessing-scripts)
- [Training](#training)
  - [Pretrained Models](#pretrained-models)
  - [Training Scripts](#training-scripts)
- [Generation (Inference)](#generation-inference)
- [Evaluation](#evaluation)
- [Citation](#citation)

## Prerequisites
We recommend using Conda. You can create the environment with the following command.
```sh
conda env create -f environment.yml
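# Then activate the environment (a sketch: this assumes the environment
# defined in environment.yml is named "mmt"; adjust the name if it differs).
conda activate mmt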
```

## Preprocessing
### Preprocessed Datasets
The preprocessed datasets can be found [here](https://ucsdcloud-my.sharepoint.com/:f:/g/personal/h3dong_ucsd_edu/Er7nrsVc7NhNtYVSdWpHMQwBS5U1dXo0q0eQEi2LW-DVGw).
Extract the files to `data/{DATASET_KEY}/processed/json` and `data/{DATASET_KEY}/processed/notes`, where `DATASET_KEY` is `sod`, `lmd`, `lmd_full` or `snd`.
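For example, extracting the SOD files could look like the following sketch; the archive names are hypothetical placeholders for whatever the download provides, and only the target paths follow the layout described above.
```sh
# Hypothetical archive names; substitute the actual downloaded files.
mkdir -p data/sod/processed
unzip sod-json.zip -d data/sod/processed/json
unzip sod-notes.zip -d data/sod/processed/notes
```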
### Preprocessing Scripts
__You can skip this section if you download the preprocessed datasets.__
#### Step 1 -- Download the datasets
Please download the [Symbolic orchestral database (SOD)](https://qsdfo.github.io/LOP/database.html). You can do so from the command line as follows.
```sh
wget https://qsdfo.github.io/LOP/database/SOD.zip
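# Then extract the archive so the MIDI/MusicXML files sit under data/sod/SOD,
# the path used in Step 2 below (a sketch; adjust the target directory if the
# archive's internal layout differs).
unzip SOD.zip -d data/sod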
```

We also support the following two datasets:
- [Lakh MIDI Dataset (LMD)](https://colinraffel.com/projects/lmd/):
```sh
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
```
- [SymphonyNet Dataset](https://symphonynet.github.io/):
```sh
# Quote the URL so the shell does not treat `&` as a control operator
# (gdown is available via `pip install gdown`).
gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"
```

#### Step 2 -- Prepare the name list
Get a list of filenames for each dataset.
```sh
find data/sod/SOD -type f \( -name '*.mid' -o -name '*.xml' \) | cut -c 14- > data/sod/original-names.txt
```

> Note: Change the character offset in the `cut` command for other datasets so that it strips the dataset's root directory prefix.
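As a rough guide, the offset equals the length of the dataset's root directory prefix (including the trailing slash) plus one; the sketch below uses a hypothetical LMD path to illustrate.
```sh
# Hypothetical example: compute the `cut` offset for a dataset rooted at
# data/lmd/lmd_full/ (prefix length + 1).
root="data/lmd/lmd_full/"
echo $(( ${#root} + 1 ))  # pass this value to the cut command as `-c <offset>-`
```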
#### Step 3 -- Convert the data
Convert the MIDI and MusicXML files into MusPy files for processing.
```sh
python convert_sod.py
```

> Note: You may enable multiprocessing with the `-j` option, for example, `python convert_sod.py -j 10` for 10 parallel jobs.
#### Step 4 -- Extract the note list
Extract a list of notes from the MusPy JSON files.
```sh
python extract.py -d sod
```

#### Step 5 -- Split training/validation/test sets
Split the processed data into training, validation and test sets.
```sh
python split.py -d sod
```

## Training
### Pretrained Models
The pretrained models can be found [here](https://ucsdcloud-my.sharepoint.com/:f:/g/personal/h3dong_ucsd_edu/EqYq6KHrcltHvgJTmw7Nl6MBtv4szg4RUZUPXc4i_RgEkw).
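To generate with a pretrained model instead of training one yourself, one option is to unpack the download into an experiment directory and point the scripts below at it via the `-o` option (a sketch; the archive name is a hypothetical placeholder, and the files should keep whatever layout the download uses):
```sh
# Hypothetical archive name; place the pretrained files under the experiment
# directory that will later be passed to generate.py/evaluate.py via `-o`.
mkdir -p exp/sod/ape
unzip pretrained-sod-ape.zip -d exp/sod/ape
```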
### Training Scripts
Train a Multitrack Music Transformer model.
- Absolute positional embedding (APE):
`python mmt/train.py -d sod -o exp/sod/ape -g 0`
- Relative positional embedding (RPE):
`python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0`
- No positional embedding (NPE):
`python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0`
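Putting the pieces together, an end-to-end run for the SOD dataset with the APE model looks like the following sketch, which simply chains the training, generation, and evaluation commands shown in this README:
```sh
# Train, generate from, and evaluate an APE model on SOD using GPU 0.
python mmt/train.py -d sod -o exp/sod/ape -g 0
python mmt/generate.py -d sod -o exp/sod/ape -g 0
python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
```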
## Generation (Inference)
Generate new samples using a trained model.
```sh
python mmt/generate.py -d sod -o exp/sod/ape -g 0
```

## Evaluation
Evaluate the trained model using objective evaluation metrics.
```sh
python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
```

## Acknowledgment
The code is based largely on the [x-transformers](https://github.com/lucidrains/x-transformers) library developed by [lucidrains](https://github.com/lucidrains).
## Citation
Please cite the following paper if you use the code provided in this repository.
> Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," _IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)_, 2023.
```bibtex
@inproceedings{dong2023mmt,
author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
title = {Multitrack Music Transformer},
booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year = 2023,
}
```