{"id":2040471,"url":"https://github.com/gwinndr/MusicTransformer-Pytorch","last_synced_at":"2025-04-04T10:31:14.415Z","repository":{"id":39765020,"uuid":"224257710","full_name":"gwinndr/MusicTransformer-Pytorch","owner":"gwinndr","description":"MusicTransformer written for MaestroV2 using the Pytorch framework for music generation","archived":false,"fork":false,"pushed_at":"2022-05-26T00:01:12.000Z","size":113,"stargazers_count":236,"open_issues_count":6,"forks_count":48,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-11-05T06:34:33.420Z","etag":null,"topics":["maestro","mit","music-generation","music-transformer","python","pytorch","transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gwinndr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-26T18:17:20.000Z","updated_at":"2024-11-02T18:16:34.000Z","dependencies_parsed_at":"2022-08-29T14:11:56.972Z","dependency_job_id":null,"html_url":"https://github.com/gwinndr/MusicTransformer-Pytorch","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gwinndr%2FMusicTransformer-Pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gwinndr%2FMusicTransformer-Pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gwinndr%2FMusicTransformer-Pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gwinndr%2FMusicTransformer-Pytorch/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/gwinndr","download_url":"https://codeload.github.com/gwinndr/MusicTransformer-Pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247160421,"owners_count":20893831,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["maestro","mit","music-generation","music-transformer","python","pytorch","transformer"],"created_at":"2024-01-21T03:59:54.932Z","updated_at":"2025-04-04T10:31:14.070Z","avatar_url":"https://github.com/gwinndr.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Music Transformer\n[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/asigalov61/SuperPiano/blob/master/Super_Piano_3.ipynb)\n\nCurrently supports Pytorch \u003e= 1.2.0 with Python \u003e= 3.6  \n\nThere is now a much friendlier [Google Colab version](https://github.com/asigalov61/SuperPiano/blob/master/Super_Piano_3.ipynb) of this project courtesy of [Alex](https://github.com/asigalov61)! \n\n## About\nThis is a reproduction of the MusicTransformer (Huang et al., 2018) for Pytorch. 
This implementation utilizes the generic Transformer implementation introduced in Pytorch 1.2.0 (https://pytorch.org/docs/stable/nn.html#torch.nn.Transformer).\n\n## Generated Music:\nA selection of generated results (MIDI and MP3) is in the following Google Drive folder:  \nhttps://drive.google.com/drive/folders/1qS4z_7WV4LLgXZeVZU9IIjatK7dllKrc?usp=sharing\n\nSee the results section for the model hyperparameters used for generation.\n\nMP3 results were played through a [Kawai MP11SE](https://kawaius.com/product/mp11se/). \nTo play .mid files, we used [Midi Editor](https://www.midieditor.org/), which is free to use and open source.\n\n## TODO\n* Write own midi pre-processor (sustain pedal errors with jason's)\n   * Support any midi file beyond Maestro\n* Fixed length song generation\n* Midi augmentations from paper\n* Multi-GPU support\n\n## How to run\n1. Download the Maestro dataset (we used v2 but v1 should work as well). You can download the dataset [here](https://magenta.tensorflow.org/datasets/maestro). You only need the MIDI version if you're tight on space. \n\n2. Run `git submodule update --init --recursive` to get the MIDI pre-processor provided by jason9693 et al. (https://github.com/jason9693/midi-neural-processor), which is used to convert the MIDI file into discrete ordered message types for training and evaluating. \n\n3. Run `preprocess_midi.py -output_dir \u003cpath_to_save_output\u003e \u003cpath_to_maestro_data\u003e`, or run with `--help` for details. This writes the pre-processed data into a folder split into `train`, `val`, and `test` as per Maestro's recommendation.\n\n4. To train a model, run `train.py`. Use `--help` to see the tweakable parameters. See the results section for details on model performance. \n\n5. After training models, you can evaluate them with `evaluate.py` and generate a MIDI piece with `generate.py`. 
To graph and compare results visually, use `graph_results.py`.\n\nYou can leave most arguments at their default values. If you are using a different dataset location or a similar non-default setup, you will need to specify that in the arguments. Beyond that, the average user does not have to worry about most of the arguments.\n\n### Training\nFor example, to train a model using the parameters specified in the results section:\n\n```\npython train.py -output_dir rpr --rpr \n```\nYou can additionally specify a weight modulus and a print modulus, which determine the epochs at which weights are saved and the batches at which progress is printed. The weights that achieved the best loss and the best accuracy (stored separately) are always saved in results, regardless of the weight modulus input.\n\n### Evaluation\nYou can evaluate a model using:\n```\npython evaluate.py -model_weights rpr/results/best_acc_weights.pickle --rpr\n```\n\nYour model's results may vary because a random sequence start position is chosen for each evaluation piece. This may be changed in the future.\n\n### Generation\nYou can generate a piece with a trained model by using:\n```\npython generate.py -output_dir output -model_weights rpr/results/best_acc_weights.pickle --rpr\n```\n\nThe default generation method samples each token from the probability distribution given by the softmaxed output. You can also use beam search, but it does not work well and is not recommended.\n\n## Pytorch Transformer\nWe used the Transformer class provided since Pytorch 1.2.0 (https://pytorch.org/docs/stable/nn.html#torch.nn.Transformer). The provided Transformer assumes an encoder-decoder architecture. To make it decoder-only like the Music Transformer, we use stacked encoders with a custom dummy decoder. This decoder-only model can be found in model/music_transformer.py.\n\nAt the time this reproduction was produced, there was no Relative Position Representation (RPR) (Shaw et al., 2018) support in the Pytorch Transformer code. 
To account for the lack of RPR support, we modified Pytorch 1.2.0 Transformer code to support it. This is based on the Skew method proposed by Huang et al. which is more memory efficient. You can find the modified code in model/rpr.py. This modified Pytorch code will not be kept up to date and will be removed when Pytorch provides RPR support.\n\n## Results\nWe trained a base and RPR model with the following parameters (taken from the paper) for 300 epochs:\n* **learn_rate**: None\n* **ce_smoothing**: None\n* **batch_size**: 2\n* **max_sequence**: 2048\n* **n_layers**: 6\n* **num_heads**: 8\n* **d_model**: 512\n* **dim_feedforward**: 1024\n* **dropout**: 0.1\n\nThe following graphs were generated with the command: \n```\npython graph_results.py -input_dirs base_model/results?rpr_model/results -model_names base?rpr\n```\n\nNote, multiple input models are separated with a '?'\n\n![Loss Results Graph](https://lh3.googleusercontent.com/u6AL9vIXG7gBeKuLlVJGFeex7-q2NYLbMqYVZGFI3qxWlpa6hAXdVlOsD52i4jKjrVcf4YZCGBaMIVIagcu_z-7Sg5YhDcgsqcs-p4aR48C287c1QraG0tRnHnmimLd8jizk9afW8g=w2400 \"Loss Results\")\n\n![Accuracy Results Graph](https://lh3.googleusercontent.com/ajbanROlOAM9YrNDaHrv1tWM8tZ4nrcrTehwoHsaftnPPZ4xEBLG0RmBa4awYXntBQF0RR_Uh3bsLZv4mdzmZM_TNisMnreKsB2jZIY7iSZjQiL4kRumypymuxIiHu-VdPB0kUkILQ=w2400 \"Accuracy Results\")\n\n![Learn Rate Results Graph](https://lh3.googleusercontent.com/Gz8N8tgHN2qstvdq77GqQQiukWjwBUettMK8IYV0228il5NvRdrnoISS5HTrxd7xVOrRpSzTtLlRppT-UwWJ2ke1XnAsRMbJ0bCElSvCQAA_z08HSZjbJ4wQXBbg4lVzuGdikEN5Ug=w2400 \"Learn Rate Results\")\n\nBest loss for *base* model: 1.99 on epoch 250  \nBest loss for *rpr* model: 1.92 on epoch 216\n\n## Discussion\nThe results were overall close to the results from the paper. Huang et al. reported a loss of around 1.8 for the base and rpr models on Maestro V1. We use Maestro V2 and perform no midi augmentations as they had discussed in their paper. 
Furthermore, [there are issues with how sustain is handled](https://github.com/jason9693/midi-neural-processor/pull/2) which can be observed by listening to some pre-processed midi files. More refinement with the addition of those augmentations and fixes may yield the loss results in line with the paper.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgwinndr%2FMusicTransformer-Pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgwinndr%2FMusicTransformer-Pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgwinndr%2FMusicTransformer-Pytorch/lists"}