{"id":13434457,"url":"https://github.com/carlosholivan/DeepLearningMusicGeneration","last_synced_at":"2025-03-17T14:31:33.246Z","repository":{"id":41339024,"uuid":"255102019","full_name":"carlosholivan/DeepLearningMusicGeneration","owner":"carlosholivan","description":"State of the Art of Music Generation with Deep Learning and AI","archived":false,"fork":false,"pushed_at":"2023-03-16T19:28:34.000Z","size":17630,"stargazers_count":274,"open_issues_count":0,"forks_count":25,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-10-27T14:45:39.081Z","etag":null,"topics":["deep-learning","mir","music-generation","music-generation-deep-learning","neural-network-architectures"],"latest_commit_sha":null,"homepage":"https://carlosholivan.github.io/DeepLearningMusicGeneration/","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/carlosholivan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-04-12T14:38:16.000Z","updated_at":"2024-10-26T15:20:20.000Z","dependencies_parsed_at":"2024-01-21T03:38:57.483Z","dependency_job_id":null,"html_url":"https://github.com/carlosholivan/DeepLearningMusicGeneration","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosholivan%2FDeepLearningMusicGeneration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosholivan%2FDeepLearningMusicGeneration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosholivan%2FDeepLearningMusicGener
ation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/carlosholivan%2FDeepLearningMusicGeneration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/carlosholivan","download_url":"https://codeload.github.com/carlosholivan/DeepLearningMusicGeneration/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244050248,"owners_count":20389663,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","mir","music-generation","music-generation-deep-learning","neural-network-architectures"],"created_at":"2024-07-31T02:01:57.738Z","updated_at":"2025-03-17T14:31:33.240Z","avatar_url":"https://github.com/carlosholivan.png","language":null,"funding_links":[],"categories":["Others"],"sub_categories":[],"readme":"# \u003cspan style=\"color:#9EB1FF; font-size:30.0pt\"\u003eDEEP LEARNING FOR MUSIC GENERATION\u003c/span\u003e\n\n\nThis repository is maintained by [**Carlos Hernández-Oliván**](https://carlosholivan.github.io/index.html) (carloshero@unizar.es) and presents the state of the art of music generation. Most of these references (those published before 2022) are included in the review paper [\"Music Composition with Deep Learning: A Review\"](https://arxiv.org/abs/2108.12290). 
The authors of the paper want to thank Jürgen Schmidhuber for his suggestions.\n\n[![License](https://img.shields.io/badge/license-Apache2.0-green)](./LICENSE)\n\nMake a pull request if you want to contribute to this reference list.\n\nYou can download a PDF version of this repo here: [README.pdf](AIMusicGeneration.pdf)\n\nAll the images belong to their corresponding authors.\n\n## Table of Contents\n\n1. [Algorithmic Composition](#algorithmic-composition)\n\n    - [1992](#1992alg)\n\n    - [Books](#books-alg)\n\n\n2. [Neural Network Architectures](#neural-network-architectures)\n\n3. [Deep Learning Models for Symbolic Music Generation](#deep-learning-music-generation)\n\n    - [2023](#2023deep)\n    - [2022](#2022deep)\n    - [2021](#2021deep)\n    - [2020](#2020deep)\n    - [2019](#2019deep)\n    - [2018](#2018deep)\n    - [2017](#2017deep)\n    - [2016](#2016deep)\n    - [2002](#2002deep)\n    - [1990s](#1990deep)\n\n    - [Books and Reviews](#books-reviews-deep)\n      - [Books](#books-deep)\n      - [Reviews](#reviews-deep)\n\n4. [Deep Learning Models for Audio Music Generation](#deep-learning-audio-generation)\n\n    - [2023](#2023audiodeep)\n    - [2022](#2022audiodeep)\n    - [2021](#2021audiodeep)\n    - [2020](#2020audiodeep)\n    - [2017](#2017audiodeep)\n\n5. [Datasets](#datasets)\n\n6. [Journals and Conferences](#journals)\n\n7. [Authors](#authors)\n\n8. [Research Groups and Labs](#labs)\n\n10. [Apps for Music Generation with AI](#apps)\n\n11. [Other Resources](#other-resources)\n\n\n\n## \u003cspan id=\"algorithmic-composition\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e1. Algorithmic Composition\u003c/span\u003e\n\n### \u003cspan id=\"1992alg\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e1992\u003c/span\u003e\n\n#### \u003cspan id=\"harmonet\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eHARMONET\u003c/span\u003e\n\n\nHild, H., Feulner, J., \u0026 Menzel, W. (1992). 
HARMONET: A neural net for harmonizing chorales in the style of JS Bach. In Advances in neural information processing systems (pp. 267-274). [Paper](https://proceedings.neurips.cc/paper/1991/file/a7aeed74714116f3b292a982238f83d2-Paper.pdf)\n\n\n### \u003cspan id=\"books-alg\" style=\"color:#A8FF9E; font-size:25.0pt\"\u003eBooks\u003c/span\u003e\n\n* Westergaard, P. (1959). Experimental Music. Composition with an Electronic Computer.\n\n* Todd, P. M. (1989). A connectionist approach to algorithmic composition. Computer Music Journal, 13(4), 27-43.\n\n* Cope, D. (2000). The algorithmic composer (Vol. 16). AR Editions, Inc.\n\n* Nierhaus, G. (2009). Algorithmic composition: paradigms of automated music generation. Springer Science \u0026 Business Media.\n\n* Müller, M. (2015). Fundamentals of music processing: Audio, analysis, algorithms, applications. Springer.\n\n* McLean, A., \u0026 Dean, R. T. (Eds.). (2018). The Oxford handbook of algorithmic music. Oxford University Press.\n\n\n## \u003cspan id=\"neural-network-architectures\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e2. Neural Network Architectures\u003c/span\u003e\n\n| NN Architecture | Year | Authors | Link to original paper | Slides |\n| ------------- | ------------- | ------------- | ------------- | ------------- |\n| Long Short-Term Memory (LSTM) | 1997 | Sepp Hochreiter, Jürgen Schmidhuber | http://www.bioinf.jku.at/publications/older/2604.pdf | [LSTM.pdf](Slides/LSTM_v1.pdf) |\n| Convolutional Neural Network (CNN) | 1998 | Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner | http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf |  |\n| Variational Autoencoder (VAE) | 2013 | Diederik P. Kingma, Max Welling | https://arxiv.org/pdf/1312.6114.pdf |  |\n| Generative Adversarial Networks (GAN) | 2014 | Ian J. 
Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio | https://arxiv.org/pdf/1406.2661.pdf |  | \n| Transformer | 2017 | Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin | https://arxiv.org/pdf/1706.03762.pdf | |\n| Diffusion Models | 2015 | Jascha Sohl-Dickstein, Eric A. Weiss, Niru Maheswaranathan, Surya Ganguli | https://arxiv.org/abs/1503.03585 | |\n\n\n## \u003cspan id=\"deep-learning-music-generation\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e3. Deep Learning Models for Symbolic Music Generation\u003c/span\u003e\n\n### \u003cspan id=\"2023deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2023\u003c/span\u003e\n\n#### \u003cspan id=\"rl-chord\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eRL-Chord\u003c/span\u003e\n\nJi, S., Yang, X., Luo, J., \u0026 Li, J. (2023). RL-Chord: CLSTM-Based Melody Harmonization Using Deep Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems.\n\n[Paper](https://ieeexplore.ieee.org/abstract/document/10063204)\n\n#### \u003cspan id=\"figaro\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eFIGARO: Generating Symbolic Music with Fine-Grained Artistic Control\u003c/span\u003e\n\nvon Rütte, D., Biggio, L., Kilcher, Y., \u0026 Hofmann, T. (2022). FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control. Accepted at ICLR 2023.\n\n\u003cimg src=\"images/Figaro.png\" width=\"100\" height=\"150\"\u003e\n\n[Paper](https://arxiv.org/abs/2201.10936)\n\n### \u003cspan id=\"2022deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2022\u003c/span\u003e\n\n#### \u003cspan id=\"museformer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMuseformer\u003c/span\u003e\n\nYu, B., Lu, P., Wang, R., Hu, W., Tan, X., Ye, W., ... \u0026 Liu, T. Y. (2022). Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation. 
NeurIPS 2022.\n\n[Paper](https://openreview.net/forum?id=GFiqdZOm-Ei) [NeurIPS Presentation](https://nips.cc/virtual/2022/poster/54604)\n\n#### \u003cspan id=\"bar-transformer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eBar Transformer\u003c/span\u003e\n\nQin, Y., Xie, H., Ding, S., Tan, B., Li, Y., Zhao, B., \u0026 Ye, M. (2022). Bar transformer: a hierarchical model for learning long-term structure and generating impressive pop music. Applied Intelligence, 1-19.\n\n[Paper](https://link.springer.com/article/10.1007/s10489-022-04049-3)\n\n#### \u003cspan id=\"sympony-generation\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eSymphony Generation with Permutation Invariant Language Model\u003c/span\u003e\n\nLiu, J., Dong, Y., Cheng, Z., Zhang, X., Li, X., Yu, F., \u0026 Sun, M. (2022). Symphony Generation with Permutation Invariant Language Model. arXiv preprint arXiv:2205.05448.\n\n\u003cimg src=\"images/Symphony Generation.png\" width=\"300\" height=\"120\"\u003e\n\n[Paper](https://arxiv.org/abs/2205.05448) [Code](https://github.com/symphonynet/SymphonyNet) [Samples](https://symphonynet.github.io/)\n\n\n#### \u003cspan id=\"theme-transformer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eTheme Transformer\u003c/span\u003e\n\nShih, Y. J., Wu, S. L., Zalkow, F., Müller, M., \u0026 Yang, Y. H. (2022). Theme Transformer: Symbolic Music Generation with Theme-Conditioned Transformer. IEEE Transactions on Multimedia.\n\n\u003cimg src=\"images/Theme Transformer.png\" width=\"300\" height=\"100\"\u003e\n\n[Paper](https://arxiv.org/abs/2111.04093) [GitHub](https://github.com/atosystem/ThemeTransformer)\n\n\n\n### \u003cspan id=\"2021deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2021\u003c/span\u003e\n\n\n#### \u003cspan id=\"compound-word\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eCompound Word Transformer\u003c/span\u003e\n\nHsiao, W. Y., Liu, J. Y., Yeh, Y. C., \u0026 Yang, Y. H. (2021, May). 
Compound word transformer: Learning to compose full-song music over dynamic directed hypergraphs. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 1, pp. 178-186).\n\n[Paper](https://ojs.aaai.org/index.php/AAAI/article/view/16091) [GitHub](https://github.com/YatingMusic/compound-word-transformer)\n\n#### \u003cspan id=\"melody-lyrics-models\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMelody Generation from Lyrics\u003c/span\u003e\n\nYu, Y., Srivastava, A., \u0026 Canales, S. (2021). Conditional LSTM-GAN for melody generation from lyrics. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 17(1), 1-20.\n\n\u003cimg src=\"images/Melody Generation from Lyrics.jpg\" width=\"300\" height=\"200\"\u003e\n\n[Paper](https://dl.acm.org/doi/abs/10.1145/3424116)\n\n\n#### \u003cspan id=\"diffusion-models\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMusic Generation with Diffusion Models\u003c/span\u003e\n\nMittal, G., Engel, J., Hawthorne, C., \u0026 Simon, I. (2021). Symbolic music generation with diffusion models. arXiv preprint arXiv:2103.16091.\n\n\u003cimg src=\"images/Music Generation with Diffusion Models.png\" width=\"400\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/2103.16091) [GitHub](https://github.com/magenta/symbolic-music-diffusion)\n\n### \u003cspan id=\"2020deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2020\u003c/span\u003e\n\n#### \u003cspan id=\"pop-musc-transfomer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003ePop Music Transformer\u003c/span\u003e\n\nHuang, Y. S., \u0026 Yang, Y. H. (2020, October). Pop music transformer: Beat-based modeling and generation of expressive pop piano compositions. In Proceedings of the 28th ACM International Conference on Multimedia (pp. 
1180-1188).\n\n[Paper](https://dl.acm.org/doi/abs/10.1145/3394171.3413671) [GitHub](https://github.com/YatingMusic/remi)\n\n\n#### \u003cspan id=\"controllable-polyphonic\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eControllable Polyphonic Music Generation\u003c/span\u003e\n\nWang, Z., Wang, D., Zhang, Y., \u0026 Xia, G. (2020). Learning interpretable representation for controllable polyphonic music generation. arXiv preprint arXiv:2008.07122.\n\n\u003cimg src=\"images/Controllable Polyphonic Music Generation.png\" width=\"200\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/2008.07122) [Web](https://program.ismir2020.net/poster_5-05.html) [Video](https://www.youtube.com/watch?v=Sb6jXP_7dtE\u0026t=28s\u0026ab_channel=ISMIR2020)\n\n#### \u003cspan id=\"mmm\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMMM: Multitrack Music Generation\u003c/span\u003e\n\nEns, J., \u0026 Pasquier, P. (2020). Mmm: Exploring conditional multi-track music generation with the transformer. arXiv preprint arXiv:2008.06048.\n\n\u003cimg src=\"images/MMM Multitrack Music Generation.png\" width=\"300\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/2008.06048) [Web](https://jeffreyjohnens.github.io/MMM/) [Colab](https://colab.research.google.com/drive/1xGZW3GP24HUsxnbebqfy1iCyYySQ64Vs?usp=sharing) [Github (AI Guru)](https://github.com/AI-Guru/MMM-JSB)\n\n#### \u003cspan id=\"xl\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eTransformer-XL\u003c/span\u003e\n\nWu, X., Wang, C., \u0026 Lei, Q. (2020). Transformer-XL Based Music Generation with Multiple Sequences of Time-valued Notes. arXiv preprint arXiv:2007.07244.\n\n\u003cimg src=\"images/Transformer-XL.png\" width=\"400\" height=\"300\"\u003e\n\n[Paper](https://arxiv.org/abs/2007.07244)\n\n\n#### \u003cspan id=\"transformer-vae\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eTransformer VAE\u003c/span\u003e\n\nJiang, J., Xia, G. G., Carlton, D. B., Anderson, C. N., \u0026 Miyakawa, R. H. (2020, May). 
Transformer vae: A hierarchical model for structure-aware and interpretable music representation learning. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 516-520). IEEE.\n\n\u003cimg src=\"images/Transformer VAE.png\" width=\"200\" height=\"200\"\u003e\n\n[Paper](https://ieeexplore.ieee.org/document/9054554)\n\n\n### \u003cspan id=\"2019deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2019\u003c/span\u003e\n\n#### \u003cspan id=\"tonicnet\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eTonicNet\u003c/span\u003e\n\nPeracha, O. (2019). Improving polyphonic music models with feature-rich encoding. arXiv preprint arXiv:1911.11775.\n\n\u003cimg src=\"images/TonicNet.jpg\" width=\"200\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/1911.11775)\n\n\n#### \u003cspan id=\"lakhnes\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eLakhNES\u003c/span\u003e\n\nDonahue, C., Mao, H. H., Li, Y. E., Cottrell, G. W., \u0026 McAuley, J. (2019). LakhNES: Improving multi-instrumental music generation with cross-domain pre-training. arXiv preprint arXiv:1907.04868.\n\n\u003cimg src=\"images/LakhNES.png\" width=\"200\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/1907.04868)\n\n\n#### \u003cspan id=\"r-transformer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eR-Transformer\u003c/span\u003e\n\nWang, Z., Ma, Y., Liu, Z., \u0026 Tang, J. (2019). R-transformer: Recurrent neural network enhanced transformer. 
arXiv preprint arXiv:1907.05572.\n\n\u003cimg src=\"images/R-Transformer.png\" width=\"400\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/1907.05572)\n\n\n#### \u003cspan id=\"maia\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMaia Music Generator\u003c/span\u003e\n\n\u003cimg src=\"images/Maia Music Generator.png\" width=\"400\" height=\"200\"\u003e\n\n[Web](https://maia.music.blog/2019/05/13/maia-a-new-music-generator/)\n\n\n#### \u003cspan id=\"counterpoint-convolution\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eCoconet: Counterpoint by Convolution\u003c/span\u003e\n\nHuang, C. Z. A., Cooijmans, T., Roberts, A., Courville, A., \u0026 Eck, D. (2019). Counterpoint by convolution. arXiv preprint arXiv:1903.07227.\n\n\u003cimg src=\"images/Coconet Counterpoint by Convolution.png\" width=\"150\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/1903.07227) [Web](https://coconets.github.io/)\n\n\n### \u003cspan id=\"2018deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2018\u003c/span\u003e\n\n#### \u003cspan id=\"music-transformer\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMusic Transformer - Google Magenta\u003c/span\u003e\n\nHuang, C. Z. A., Vaswani, A., Uszkoreit, J., Shazeer, N., Simon, I., Hawthorne, et al. (2018). Music transformer. arXiv preprint arXiv:1809.04281.\n\n\u003cimg src=\"images/Music Transformer.png\" width=\"400\" height=\"100\"\u003e\n\n[Web](https://magenta.tensorflow.org/music-transformer) [Poster](Images/transformer_poster.jpg) [Paper](https://arxiv.org/pdf/1809.04281.pdf)\n\n\n#### \u003cspan id=\"imposing-structure\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eImposing Higher-level Structure in Polyphonic Music\u003c/span\u003e\n\nLattner, S., Grachten, M., \u0026 Widmer, G. (2018). Imposing higher-level structure in polyphonic music generation using convolutional restricted boltzmann machines and constraints. 
Journal of Creative Music Systems, 2, 1-31.\n\n\u003cimg src=\"images/Imposing Higher-level Structure in Polyphonic Music.png\" width=\"400\" height=\"200\"\u003e\n\n\n[Paper](https://arxiv.org/pdf/1612.04742.pdf)\n\n#### \u003cspan id=\"music-vae\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMusicVAE - Google Magenta\u003c/span\u003e\n\nRoberts, A., Engel, J., Raffel, C., Hawthorne, C., \u0026 Eck, D. (2018, July). A hierarchical latent vector model for learning long-term structure in music. In International Conference on Machine Learning (pp. 4364-4373). PMLR.\n\n\u003cimg src=\"images/MusicVAE.png\" width=\"400\" height=\"200\"\u003e\n\n[Web](https://magenta.tensorflow.org/music-vae) [Paper](https://arxiv.org/pdf/1803.05428.pdf) [Code](https://github.com/tensorflow/magenta/tree/master/magenta/models/music_vae) [Google Colab](https://colab.research.google.com/notebooks/magenta/music_vae/music_vae.ipynb) [Explanation](https://medium.com/@musicvaeubcse/musicvae-understanding-of-the-googles-work-for-interpolating-two-music-sequences-621dcbfa307c)\n\n\n### \u003cspan id=\"2017deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2017\u003c/span\u003e\n\n#### \u003cspan id=\"morpheus\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMorpheuS\u003c/span\u003e\n\nHerremans, D., \u0026 Chew, E. (2017). MorpheuS: generating structured music with constrained patterns and tension. IEEE Transactions on Affective Computing, 10(4), 510-523.\n\n\u003cimg src=\"images/MorpheuS.png\" width=\"200\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/pdf/1812.04832.pdf)\n\n#### \u003cspan id=\"music-gan\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003ePolyphonic GAN\u003c/span\u003e\n\nLee, S. G., Hwang, U., Min, S., \u0026 Yoon, S. (2017). Polyphonic music generation with sequence generative adversarial networks. 
arXiv preprint arXiv:1710.11418.\n\n\u003cimg src=\"images/Polyphonic GAN 1.png\" width=\"350\" height=\"150\"\u003e\n\n\u003cimg src=\"images/Polyphonic GAN 2.png\" width=\"350\" height=\"100\"\u003e\n\n[Paper](https://arxiv.org/abs/1710.11418)\n\n\n#### \u003cspan id=\"bach-chorales-lstm\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eBachBot - Microsoft\u003c/span\u003e\n\nLiang, F. T., Gotham, M., Johnson, M., \u0026 Shotton, J. (2017, October). Automatic Stylistic Composition of Bach Chorales with Deep LSTM. In ISMIR (pp. 449-456).\n\n\u003cimg src=\"images/BachBot 1.png\" width=\"350\" height=\"100\"\u003e\n\n\u003cimg src=\"images/BachBot 2.png\" width=\"350\" height=\"100\"\u003e\n\n[Paper](https://www.microsoft.com/en-us/research/publication/automatic-stylistic-composition-of-bach-chorales-with-deep-lstm/) [Liang Master Thesis 2016](https://www.mlmi.eng.cam.ac.uk/files/feynman_liang_8224771_assignsubmission_file_liangfeynmanthesis.pdf)\n\n\n#### \u003cspan id=\"musegan\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMuseGAN\u003c/span\u003e\n\nDong, H. W., Hsiao, W. Y., Yang, L. C., \u0026 Yang, Y. H. (2018, April). Musegan: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).\n\n\u003cimg src=\"images/MuseGAN.png\" width=\"400\" height=\"150\"\u003e\n\n[Web](https://salu133445.github.io/musegan/) [Paper](https://arxiv.org/pdf/1709.06298.pdf) [Poster](Images/musegan_ismir2017.jpg) [GitHub](https://github.com/salu133445/musegan)\n\n\n#### \u003cspan id=\"lstm-composing\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eComposing Music with LSTM\u003c/span\u003e\n\nJohnson, D. D. (2017, April). Generating polyphonic music using tied parallel networks. In International conference on evolutionary and biologically inspired music and art (pp. 128-143). 
Springer, Cham.\n\n\u003cimg src=\"images/Composing Music with LSTM.png\" width=\"250\" height=\"150\"\u003e\n\n[Paper](https://link.springer.com/chapter/10.1007/978-3-319-55750-2_9) [Web](https://www.danieldjohnson.com/2015/08/03/composing-music-with-recurrent-neural-networks/) [GitHub](https://github.com/danieldjohnson/biaxial-rnn-music-composition) [Blog](https://www.danieldjohnson.com/2015/08/03/composing-music-with-recurrent-neural-networks/)\n\n\n#### \u003cspan id=\"organ\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eORGAN\u003c/span\u003e\n\nGuimaraes, G. L., Sanchez-Lengeling, B., Outeiral, C., Farias, P. L. C., \u0026 Aspuru-Guzik, A. (2017). Objective-reinforced generative adversarial networks (ORGAN) for sequence generation models. arXiv preprint arXiv:1705.10843.\n\n\u003cimg src=\"images/ORGAN.png\" width=\"400\" height=\"100\"\u003e\n\n[Paper](https://arxiv.org/abs/1705.10843)\n\n\n#### \u003cspan id=\"midinet\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMidiNet\u003c/span\u003e\n\nYang, L. C., Chou, S. Y., \u0026 Yang, Y. H. (2017). MidiNet: A convolutional generative adversarial network for symbolic-domain music generation. arXiv preprint arXiv:1703.10847.\n\n\u003cimg src=\"images/MidiNet.png\" width=\"400\" height=\"150\"\u003e\n\n[Paper](https://arxiv.org/abs/1703.10847)\n\n\n### \u003cspan id=\"2016deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2016\u003c/span\u003e\n\n#### \u003cspan id=\"deepbach\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eDeepBach\u003c/span\u003e\n\nHadjeres, G., Pachet, F., \u0026 Nielsen, F. (2017, July). Deepbach: a steerable model for bach chorales generation. In International Conference on Machine Learning (pp. 1362-1371). 
PMLR.\n\n\u003cimg src=\"images/DeepBach.png\" width=\"200\" height=\"250\"\u003e\n\n[Web](http://www.flow-machines.com/history/projects/deepbach-polyphonic-music-generation-bach-chorales/) [Paper](https://arxiv.org/pdf/1612.01010.pdf) [Code](https://github.com/Ghadjeres/DeepBach)\n\n\n#### \u003cspan id=\"fine-tuning-rl\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eFine-Tuning with RL\u003c/span\u003e\n\nJaques, N., Gu, S., Turner, R. E., \u0026 Eck, D. (2016). Generating music by fine-tuning recurrent neural networks with reinforcement learning.\n\n\u003cimg src=\"images/Fine-Tuning with RL.png\" width=\"400\" height=\"200\"\u003e\n\n[Paper](https://research.google/pubs/pub45871/)\n\n\n#### \u003cspan id=\"c-rnn-gan\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eC-RNN-GAN\u003c/span\u003e\n\nMogren, O. (2016). C-RNN-GAN: Continuous recurrent neural networks with adversarial training. arXiv preprint arXiv:1611.09904.\n\n\u003cimg src=\"images/C-RNN-GAN.png\" width=\"300\" height=\"200\"\u003e\n\n[Paper](https://arxiv.org/abs/1611.09904)\n\n\n#### \u003cspan id=\"seqgan\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eSeqGAN\u003c/span\u003e\n\nYu, L., Zhang, W., Wang, J., \u0026 Yu, Y. (2017, February). Seqgan: Sequence generative adversarial nets with policy gradient. In Proceedings of the AAAI conference on artificial intelligence (Vol. 31, No. 1).\n\n\u003cimg src=\"images/SeqGAN.png\" width=\"400\" height=\"150\"\u003e\n\n[Paper](https://arxiv.org/abs/1609.05473)\n\n\n### \u003cspan id=\"2002deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2002\u003c/span\u003e\n\n#### \u003cspan id=\"seqgan\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eTemporal Structure in Music\u003c/span\u003e\n\nEck, D., \u0026 Schmidhuber, J. (2002, September). Finding temporal structure in music: Blues improvisation with LSTM recurrent networks. In Proceedings of the 12th IEEE workshop on neural networks for signal processing (pp. 747-756). 
IEEE.\n\n[Paper](https://ieeexplore.ieee.org/document/1030094)\n\n\n### \u003cspan id=\"1990deep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e1980s - 1990s\u003c/span\u003e\n\nMozer, M. C. (1994). Neural network music composition by prediction: Exploring the benefits of psychoacoustic constraints and multi-scale processing. Connection Science, 6(2-3), 247-280.\n\n[Paper](https://www.tandfonline.com/doi/abs/10.1080/09540099408915726)\n\n### \u003cspan id=\"books-reviews-deep\" style=\"color:#A8FF9E; font-size:25.0pt\"\u003eBooks and Reviews\u003c/span\u003e\n\n### \u003cspan id=\"books-deep\" style=\"color:#3C8CE8; font-size:20.0pt\"\u003eBooks\u003c/span\u003e\n\n* Briot, J. P., Hadjeres, G., \u0026 Pachet, F. (2020). Deep learning techniques for music generation (pp. 1-249). Springer.\n\n### \u003cspan id=\"reviews-deep\" style=\"color:#3C8CE8; font-size:20.0pt\"\u003eReviews\u003c/span\u003e\n\n* Hernandez-Olivan, C., \u0026 Beltran, J. R. (2021). Music composition with deep learning: A review. arXiv preprint arXiv:2108.12290.\n[Paper](https://arxiv.org/abs/2108.12290)\n\n* Ji, S., Luo, J., \u0026 Yang, X. (2020). A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions. arXiv preprint arXiv:2011.06801.\n[Paper](https://arxiv.org/abs/2011.06801)\n\n* Briot, J. P., Hadjeres, G., \u0026 Pachet, F. D. (2017). Deep learning techniques for music generation--a survey. arXiv preprint arXiv:1709.01620.\n[Paper](https://arxiv.org/abs/1709.01620)\n\n\n## \u003cspan id=\"deep-learning-audio-generation\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e4. Audio Generation\u003c/span\u003e\n\n### \u003cspan id=\"2023audiodeep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2023\u003c/span\u003e\n\n#### \u003cspan id=\"vall-e-x-music\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eVall-E X\u003c/span\u003e\n\nZhang, Z., Zhou, L., Wang, C., Chen, S., Wu, Y., Liu, S., ... \u0026 Wei, F. (2023). 
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling. arXiv preprint arXiv:2303.03926.\n\n[Paper](https://arxiv.org/abs/2303.03926)\n\n#### \u003cspan id=\"ernie-music\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eERNIE Music\u003c/span\u003e\n\nZhu, P., Pang, C., Wang, S., Chai, Y., Sun, Y., Tian, H., \u0026 Wu, H. (2023). ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models. arXiv preprint arXiv:2302.04456.\n\n[Paper](https://arxiv.org/abs/2302.04456)\n\n#### \u003cspan id=\"multi-source-diffusion-models\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMulti-Source Diffusion Models\u003c/span\u003e\n\nMariani, G., Tallini, I., Postolache, E., Mancusi, M., Cosmo, L., \u0026 Rodolà, E. (2023). Multi-Source Diffusion Models for Simultaneous Music Generation and Separation. arXiv preprint arXiv:2302.02257.\n\n[Paper](https://arxiv.org/abs/2302.02257) [Samples](https://gladia-research-group.github.io/multi-source-diffusion-models/)\n\n#### \u003cspan id=\"singsong\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eSingSong\u003c/span\u003e\n\nDonahue, C., Caillon, A., Roberts, A., Manilow, E., Esling, P., Agostinelli, A., ... \u0026 Engel, J. (2023). SingSong: Generating musical accompaniments from singing. arXiv preprint arXiv:2301.12662.\n\n[Paper](https://arxiv.org/abs/2301.12662) [Samples](https://storage.googleapis.com/sing-song/index.html)\n\n#### \u003cspan id=\"audioldm\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eAudioLDM\u003c/span\u003e\n\nLiu, H., Chen, Z., Yuan, Y., Mei, X., Liu, X., Mandic, D., ... \u0026 Plumbley, M. D. (2023). AudioLDM: Text-to-Audio Generation with Latent Diffusion Models. 
arXiv preprint arXiv:2301.12503.\n\n[Paper](https://arxiv.org/abs/2301.12503) [Samples](https://audioldm.github.io/) [GitHub](https://github.com/haoheliu/AudioLDM)\n\n#### \u003cspan id=\"mousai\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMousai\u003c/span\u003e\n\nSchneider, F., Jin, Z., \u0026 Schölkopf, B. (2023). Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion. arXiv preprint arXiv:2301.11757.\n\n[Paper](https://arxiv.org/abs/2301.11757)\n\n#### \u003cspan id=\"make-an-audio\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMake-An-Audio\u003c/span\u003e\n\nHuang, R., Huang, J., Yang, D., Ren, Y., Liu, L., Li, M., ... \u0026 Zhao, Z. (2023). Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models. arXiv preprint arXiv:2301.12661.\n\n[Paper](https://arxiv.org/abs/2301.12661) [Samples](https://text-to-audio.github.io/)\n\n#### \u003cspan id=\"noise2music\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eNoise2Music\u003c/span\u003e\n\nHuang, Q., Park, D. S., Wang, T., Denk, T. I., Ly, A., Chen, N., ... \u0026 Han, W. (2023). Noise2Music: Text-conditioned Music Generation with Diffusion Models. arXiv preprint arXiv:2302.03917.\n\n[Paper](https://arxiv.org/abs/2302.03917) [Samples](https://google-research.github.io/noise2music/)\n\n#### \u003cspan id=\"msanii\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMsanii\u003c/span\u003e\n\nMaina, K. (2023). Msanii: High Fidelity Music Synthesis on a Shoestring Budget. arXiv preprint arXiv:2301.06468.\n\n[Paper](https://arxiv.org/abs/2301.06468)\n\n#### \u003cspan id=\"musiclm\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMusicLM\u003c/span\u003e\n\nAgostinelli, A., Denk, T. I., Borsos, Z., Engel, J., Verzetti, M., Caillon, A., ... \u0026 Frank, C. (2023). MusicLM: Generating music from text. 
arXiv preprint arXiv:2301.11325.\n\n[Paper](https://arxiv.org/abs/2301.11325) [Samples](https://google-research.github.io/seanet/musiclm/examples/) [Dataset](https://www.kaggle.com/datasets/googleai/musiccaps)\n\n\n### \u003cspan id=\"2022audiodeep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2022\u003c/span\u003e\n\n#### \u003cspan id=\"musika\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMusika\u003c/span\u003e\n\nPasini, M., \u0026 Schlüter, J. (2022). Musika! Fast Infinite Waveform Music Generation. arXiv preprint arXiv:2208.08706.\n\n[Paper](https://arxiv.org/abs/2208.08706)\n\n#### \u003cspan id=\"audiolm\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eAudioLM\u003c/span\u003e\n\nBorsos, Z., Marinier, R., Vincent, D., Kharitonov, E., Pietquin, O., Sharifi, M., ... \u0026 Zeghidour, N. (2022). Audiolm: a language modeling approach to audio generation. arXiv preprint arXiv:2209.03143.\n\n[Paper](https://arxiv.org/abs/2209.03143) [Samples](https://google-research.github.io/seanet/audiolm/examples/)\n\n### \u003cspan id=\"2021audiodeep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2021\u003c/span\u003e\n\n#### \u003cspan id=\"rave\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eRAVE\u003c/span\u003e\n\nCaillon, A., \u0026 Esling, P. (2021). RAVE: A variational autoencoder for fast and high-quality neural audio synthesis. 
arXiv preprint arXiv:2111.05011.\n\n[Paper](https://arxiv.org/abs/2111.05011) [GitHub](https://github.com/acids-ircam/RAVE)\n\n\n### \u003cspan id=\"2020audiodeep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2020\u003c/span\u003e\n\n#### \u003cspan id=\"musenet\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eJukebox - OpenAI\u003c/span\u003e\n\n\u003cimg src=\"images/Jukebox.png\" width=\"400\" height=\"150\"\u003e\n\n[Web](https://openai.com/blog/jukebox/) [Paper](https://arxiv.org/abs/2005.00341) [GitHub](https://github.com/openai/jukebox/)\n\n### \u003cspan id=\"2017audiodeep\" style=\"color:#A8FF9E; font-size:20.0pt\"\u003e2017\u003c/span\u003e\n\n#### \u003cspan id=\"musenet\" style=\"color:#FF9EC3; font-size:15.0pt\"\u003eMuseNet - OpenAI\u003c/span\u003e\n\n[Web](https://openai.com/blog/musenet/)\n\n## \u003cspan id=\"datasets\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e5. Datasets\u003c/span\u003e\n\n* JSB Chorales Dataset [Web](http://www-ens.iro.umontreal.ca/~boulanni/icml2012)\n\n* Maestro Dataset [Web](https://magenta.tensorflow.org/datasets/maestro)\n\n* The Lakh MIDI Dataset v0.1 [Web](https://colinraffel.com/projects/lmd/) [Tutorial IPython](https://nbviewer.jupyter.org/github/craffel/midi-dataset/blob/master/Tutorial.ipynb)\n\n* MetaMIDI Dataset [Web](https://metacreation.net/metamidi-dataset/) [Zenodo](https://zenodo.org/record/5142664)\n\n\n## \u003cspan id=\"journals\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e6. 
Journals and Conferences\u003c/span\u003e\n\n* International Society for Music Information Retrieval (ISMIR) [Web](https://www.ismir.net/)\n\n* IEEE Signal Processing (ICASSP) [Web](https://signalprocessingsociety.org/publications-resources)\n\n* ELSEVIER Signal Processing Journal [Web](https://www.journals.elsevier.com/signal-processing)\n\n* Association for the Advancement of Artificial Intelligence (AAAI) [Web](https://www.aaai.org/)\n\n* Journal of Artificial Intelligence Research (JAIR) [Web](https://www.jair.org/index.php/jair)\n\n* International Joint Conferences on Artificial Intelligence (IJCAI) [Web](https://www.ijcai.org/)\n\n* International Conference on Learning Representations (ICLR) [Web](https://iclr.cc)\n\n* IET Signal Processing Journal [Web](https://digital-library.theiet.org/content/journals/iet-spr)\n\n* Journal of New Music Research (JNMR) [Web](https://www.tandfonline.com/loi/nnmr20)\n\n* Audio Engineering Society - Conference on Semantic Audio (AES) [Web](http://www.aes.org/)\n\n* International Conference on Digital Audio Effects (DAFx) [Web](http://dafx.de/)\n\n\n## \u003cspan id=\"authors\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e7. Authors\u003c/span\u003e\n\n* David Cope [Web](http://artsites.ucsc.edu/faculty/cope/)\n\n* Colin Raffel [Web](https://colinraffel.com/)\n\n* Jesse Engel [Web](https://jesseengel.github.io/)\n\n* Douglas Eck [Web](http://www.iro.umontreal.ca/~eckdoug/)\n\n* Anna Huang [Web](https://mila.quebec/en/person/anna-huang/)\n\n* François Pachet [Web](https://www.francoispachet.fr/)\n\n* Jeff Ens [Web](https://jeffens.com/)\n\n* Philippe Pasquier [Web](https://www.sfu.ca/siat/people/research-faculty/philippe-pasquier.html)\n\n\n## \u003cspan id=\"labs\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e8. 
Research Groups and Labs\u003c/span\u003e\n\n* Google Magenta [Web](https://magenta.tensorflow.org/)\n\n* Audiolabs Erlangen [Web](https://www.audiolabs-erlangen.de/)\n\n* Music Informatics Group [Web](https://musicinformatics.gatech.edu/)\n\n* Music and Artificial Intelligence Lab [Web](https://musicai.citi.sinica.edu.tw/)\n\n* Metacreation Lab [Web](https://metacreation.net/)\n\n\n## \u003cspan id=\"apps\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e9. Apps for Music Generation with AI\u003c/span\u003e\n\n* AIVA (paid) [Web](https://www.aiva.ai/)\n\n* Amper Music (paid) [Web](https://www.ampermusic.com/)\n\n* Ecrett Music (paid) [Web](https://ecrettmusic.com/)\n\n* Humtap (free, iOS) [Web](https://www.humtap.com/)\n\n* Amadeus Code (free/paid, iOS) [Web](https://amadeuscode.com/top)\n\n* Computoser (free) [Web](http://computoser.com)\n\n* Brain.fm (paid) [Web](https://www.brain.fm/login?next=/app/player)\n\n\n## \u003cspan id=\"other-resources\" style=\"color:#9EB1FF; font-size:25.0pt\"\u003e10. Other Resources\u003c/span\u003e\n\n* Bustena (website in Spanish for learning harmony theory) [Web](http://www.bustena.com/curso-de-armonia-i/)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarlosholivan%2FDeepLearningMusicGeneration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcarlosholivan%2FDeepLearningMusicGeneration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarlosholivan%2FDeepLearningMusicGeneration/lists"}