{"id":13862149,"url":"https://github.com/EleutherAI/aria","last_synced_at":"2025-07-14T11:32:30.563Z","repository":{"id":169009069,"uuid":"642937730","full_name":"EleutherAI/aria","owner":"EleutherAI","description":null,"archived":false,"fork":false,"pushed_at":"2025-06-30T14:45:43.000Z","size":477,"stargazers_count":48,"open_issues_count":0,"forks_count":11,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-30T15:24:32.054Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EleutherAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-05-19T17:34:59.000Z","updated_at":"2025-06-30T14:45:46.000Z","dependencies_parsed_at":"2024-01-24T18:49:05.084Z","dependency_job_id":"1176a3d7-e865-4b51-834b-1873866f321c","html_url":"https://github.com/EleutherAI/aria","commit_stats":null,"previous_names":["eleutherai/music-transformer","eleutherai/aria"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/EleutherAI/aria","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EleutherAI%2Faria","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EleutherAI%2Faria/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EleutherAI%2Faria/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EleutherAI%2Faria/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EleutherAI","download
_url":"https://codeload.github.com/EleutherAI/aria/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EleutherAI%2Faria/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265285687,"owners_count":23740581,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-05T06:01:38.255Z","updated_at":"2025-07-14T11:32:30.549Z","avatar_url":"https://github.com/EleutherAI.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Aria\n\nThis repository contains training, inference, and evaluation code for the paper [*Scaling Self-Supervised Representation Learning for Symbolic Piano Performance (ISMIR 2025)*](https://arxiv.org/abs/2506.23869), as well as implementations of our real-time piano continuation demo. *Aria* is a pretrained autoregressive generative model for symbolic music, based on the LLaMA 3.2 (1B) architecture, which was trained on ~60k hours of MIDI transcriptions of expressive solo-piano recordings. 
Alongside the base model, we are releasing a checkpoint finetuned to improve generative quality, as well as a checkpoint finetuned to produce general-purpose piano MIDI embeddings using a SimCSE-style contrastive training objective.\n\n📖 Read our [paper](https://arxiv.org/abs/2506.23869)  \n🤗 Access our models via the [HuggingFace page](https://huggingface.co/loubb/aria-medium-base)  \n📊 Get access to our training dataset [Aria-MIDI](https://huggingface.co/datasets/loubb/aria-midi) and train your own models\n\n## Installation\n\nInstallation requires Python 3.11+. To install the package and all dependencies with pip:\n\n```bash\ngit clone https://github.com/EleutherAI/aria\ncd aria\npip install -e \".[all]\"\n```\n\n## Quickstart\n\nDownload model weights from the official HuggingFace page for our pretrained model, as well as checkpoints finetuned for piano continuation and for generating MIDI embeddings:\n\n- `aria-medium-base` ([huggingface](https://huggingface.co/loubb/aria-medium-base), [direct-download](https://huggingface.co/loubb/aria-medium-base/resolve/main/model.safetensors?download=true))\n- `aria-medium-gen` ([huggingface](https://huggingface.co/loubb/aria-medium-gen), [direct-download](https://huggingface.co/loubb/aria-medium-gen/resolve/main/model.safetensors?download=true))\n- `aria-medium-embedding` ([huggingface](https://huggingface.co/loubb/aria-medium-embedding), [direct-download](https://huggingface.co/loubb/aria-medium-embedding/resolve/main/model.safetensors?download=true))\n\n### Inference (Prompt Continuation)\n\nWe provide optimized model implementations for PyTorch (CUDA) and MLX (Apple Silicon). 
You can generate continuations of a MIDI file using the CLI, e.g., using CUDA (Linux):\n\n```bash\naria generate \\\n    --backend torch_cuda \\\n    --checkpoint_path \u003cpath-to-model-weights\u003e \\\n    --prompt_midi_path \u003cpath-to-midi-file-to-continue\u003e \\\n    --prompt_duration \u003clength-in-seconds-for-prompt\u003e \\\n    --variations \u003cnumber-of-variations-to-generate\u003e \\\n    --temp 0.98 \\\n    --min_p 0.035 \\\n    --length 2048 \\\n    --save_dir \u003cdir-to-save-results\u003e\n```\n\nSince the model has not been post-trained with instruction tuning or RLHF (similar to pre-instruct GPT models), it is very sensitive to input quality and performs best when prompted with well-played music. To get prompt MIDI files, see the `example-prompts/` directory, explore the [Aria-MIDI](https://huggingface.co/datasets/loubb/aria-midi) dataset, or transcribe your own files using our [piano-transcription model](https://github.com/EleutherAI/aria-amt). For a full list of sampling options: `aria generate -h`. If you wish to do inference on the CPU, please see the platform-agnostic implementation on our HuggingFace page [link].\n\n### Intended Use and Limitations\n\nAria performs best when **continuing existing piano MIDI files** rather than generating music from scratch. While multi-track tokenization and generation are supported, the model was trained primarily on **single-track expressive piano performances**, and we recommend using single-track inputs for optimal results.\n\nDue to the high representation of popular classical works (e.g., Chopin) in the training data and the difficulty of complete deduplication, the model may **memorize or closely reproduce** such pieces. For more original outputs, we suggest prompting Aria with **lesser-known works or your own compositions**.\n\n### Inference (MIDI embeddings)\n\nYou can generate embeddings from MIDI files using the `aria.embeddings` module. 
This is primarily exposed through the `get_global_embedding_from_midi` function, for example:\n\n```python\nfrom aria.embeddings import get_global_embedding_from_midi\nfrom aria.model import TransformerEMB, ModelConfig\nfrom aria.config import load_model_config\nfrom ariautils.tokenizer import AbsTokenizer\nfrom safetensors.torch import load_file\n\n# Placeholder paths (not part of the original example): point these at your own files\nCHECKPOINT_PATH = \"\u003cpath-to-aria-medium-embedding-weights\u003e\"\nMIDI_PATH = \"\u003cpath-to-midi-file\u003e\"\n\n# Build the embedding model from the \"medium-emb\" config and load the checkpoint weights\nmodel_config = ModelConfig(**load_model_config(name=\"medium-emb\"))\nmodel_config.set_vocab_size(AbsTokenizer().vocab_size)\nmodel = TransformerEMB(model_config)\nstate_dict = load_file(filename=CHECKPOINT_PATH)\nmodel.load_state_dict(state_dict=state_dict, strict=True)\n\n# Compute a single global embedding for the MIDI file\nembedding = get_global_embedding_from_midi(\n    model=model,\n    midi_path=MIDI_PATH,\n    device=\"cpu\",\n)\n```\n\nOur embedding model was trained to capture composition-level and performance-level attributes, and therefore might not be appropriate for every use case.\n\n## Real-time demo\n\nIn `demo/` we provide CUDA (Linux/PyTorch) and MLX (Apple Silicon) implementations of the real-time interactive piano-continuation demo showcased in our release blog post. For the demo we used an acoustic Yamaha Disklavier piano with simultaneous MIDI input and output ports connected via a standard MIDI interface.\n\n❗**NOTE**: Responsiveness of the real-time demo depends on your system configuration, e.g., GPU FLOPS and memory bandwidth.\n\nA MIDI input device is not strictly required to try the demo: by using the `--midi_path` and `--midi_through` arguments you can mock real-time input by playing from a MIDI file. All that is required are MIDI drivers (e.g., CoreMIDI, ALSA) and a virtual software instrument (e.g., Fluidsynth, Pianoteq) to render the output.
\n\nExample usage (MLX):\n\n```bash\nMIDI_PATH=\"example-prompts/pokey_jazz.mid\"\n\npython demo/demo_mlx.py \\\n    --checkpoint \u003ccheckpoint-path\u003e \\\n    --midi_path \"${MIDI_PATH}\" \\\n    --midi_through \u003cport-to-stream-midi-file-through\u003e \\\n    --midi_out \u003cport-to-stream-generation-over\u003e \\\n    --save_path \u003cpath-to-save-result\u003e \\\n    --temp 0.98 \\\n    --min_p 0.035\n```\n\n## Evaluation\n\nWe provide the specific files/splits we used for Aria-MIDI-derived linear-probe and classification evaluations. These can be downloaded from HuggingFace ([direct-download](https://huggingface.co/loubb/aria-medium-base/resolve/main/eval-splits.tar.gz?download=true)). Class labels are provided in `metadata.json` with the schema:\n\n```json\n{\n  \"\u003ccategory\u003e\": {\n    \"\u003csplit-name\u003e\": {\n      \"\u003crelative/path/to/file.mid\u003e\": \"\u003cmetadata_value_for_that_category\u003e\",\n      ...\n    },\n    ...\n  },\n  ...\n}\n```\n\n## License and Attribution\n\nThe Aria project has been kindly supported by EleutherAI and Stability AI, as well as by a compute grant from the Ministry of Science and ICT of Korea. Our models and MIDI tooling are released under the Apache-2.0 license. If you use the models or tooling for follow-up work, please cite the paper in which they were introduced:\n\n```bibtex\n@inproceedings{bradshawscaling,\n  title={Scaling Self-Supervised Representation Learning for Symbolic Piano Performance},\n  author={Bradshaw, Louis and Fan, Honglu and Spangher, Alexander and Biderman, Stella and Colton, Simon},\n  booktitle={arXiv preprint},\n  year={2025},\n  url={https://arxiv.org/abs/2506.23869}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FEleutherAI%2Faria","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FEleutherAI%2Faria","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FEleutherAI%2Faria/lists"}