https://github.com/eleutherai/aria-amt
Efficient and robust implementation of seq-to-seq automatic piano transcription.
- Host: GitHub
- URL: https://github.com/eleutherai/aria-amt
- Owner: EleutherAI
- License: apache-2.0
- Created: 2024-02-12T14:24:01.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-07-09T13:07:00.000Z (11 months ago)
- Last Synced: 2025-07-09T14:24:23.636Z (11 months ago)
- Language: Python
- Homepage:
- Size: 91.8 MB
- Stars: 50
- Watchers: 2
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# aria-amt
Efficient and robust implementation of seq-to-seq automatic piano transcription.
## Install
Requires Python 3.11
```
git clone https://github.com/EleutherAI/aria-amt.git
cd aria-amt
pip install -e .
```
Download the preliminary model weights:
Piano (v1)
```
wget https://storage.googleapis.com/aria-checkpoints/amt/piano-medium-double-1.0.safetensors
```
## Usage
You can download MP3s from YouTube using [yt-dlp](https://github.com/yt-dlp/yt-dlp):
```
yt-dlp --audio-format mp3 --extract-audio --no-playlist --audio-quality 0 -o
```
You can then transcribe using the CLI:
```
aria-amt transcribe \
medium-double \
\
-load_path \
-save_dir \
-bs 1 \
-compile
```
For batch transcription, use the `-load_dir` flag and adjust `-bs` accordingly. Compiling may take some time, but provides a significant speedup. Quantizing (`-q8` flag) further speeds up inference when the `-compile` flag is also used.
NOTE: Int8 quantization is only supported on GPUs that support BF16.
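The flag combinations described above can be sketched in a small Python script. This is only an illustration, not part of aria-amt: `gpu_supports_bf16` and `batch_flags` are hypothetical helper names, the `audio/` and `midi/` directories are placeholders, and the script only assembles the optional flags named in this README (`-load_dir`, `-save_dir`, `-bs`, `-compile`, `-q8`). It assumes PyTorch's `torch.cuda.is_bf16_supported()` is a reasonable proxy for the BF16 requirement; depending on the PyTorch version, that call may also report emulated support.

```python
import warnings


def gpu_supports_bf16() -> bool:
    """Return True if PyTorch reports a CUDA device with BF16 support."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    return bool(torch.cuda.is_bf16_supported())


def batch_flags(load_dir, save_dir, batch_size, compile_model=True, quantize=False):
    """Assemble the optional flags for a batch run (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    flags = [
        "-load_dir", str(load_dir),
        "-save_dir", str(save_dir),
        "-bs", str(batch_size),
    ]
    if compile_model:
        flags.append("-compile")
    if quantize:
        if not compile_model:
            # The README only describes a -q8 speedup together with -compile.
            warnings.warn("-q8 is described as helping when -compile is also used")
        flags.append("-q8")
    return flags


if __name__ == "__main__":
    # Only request int8 quantization when the GPU appears to support BF16.
    use_q8 = gpu_supports_bf16()
    print(" ".join(batch_flags("audio/", "midi/", 4, quantize=use_q8)))
```

The printed flags would be appended to the `aria-amt transcribe` command shown earlier, after its positional arguments. Whether `-q8` appears depends on the machine the script runs on.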