https://github.com/sinedied/am2mid
YouTube → isolated melody stem → MIDI pipeline for trance & co. (yt-dlp + Demucs + Basic Pitch)
audio-to-midi basic-pitch demucs midi music-information-retrieval python-cli stem-separation trance youtube yt-dlp
- Host: GitHub
- URL: https://github.com/sinedied/am2mid
- Owner: sinedied
- License: mit
- Created: 2026-05-30T19:33:03.000Z (24 days ago)
- Default Branch: main
- Last Pushed: 2026-05-30T20:48:58.000Z (23 days ago)
- Last Synced: 2026-06-09T11:31:29.218Z (14 days ago)
- Topics: audio-to-midi, basic-pitch, demucs, midi, music-information-retrieval, python-cli, stem-separation, trance, youtube, yt-dlp
- Language: Python
- Size: 44.9 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# am2mid
**am2mid** = *Audio Melody to MIDI*. Tiny CLI that turns a YouTube track into a melody MIDI file (and/or an isolated melody stem you can drag into [NeuralNote](https://github.com/DamRsn/NeuralNote) or any other audio→MIDI VST).
Pipeline:
```
YouTube URL ──► yt-dlp ──► (optional trim) ──► Demucs (htdemucs) ──► melody stem ──► Basic Pitch ──► .mid
```
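As a rough illustration of how these stages chain together, the sketch below only builds the command line for each tool; it is not am2mid's actual code. The helper name `build_pipeline_commands`, the `source.wav` / `trimmed.wav` file names and the working-directory layout are assumptions made for the example, and the trim step is shown as an `ffmpeg` call, which this README does not confirm.

```python
from pathlib import Path


def build_pipeline_commands(url, workdir, start=None, end=None,
                            model="htdemucs", stem="other"):
    """Return one argv list per pipeline stage (hypothetical helper)."""
    work = Path(workdir)
    source = work / "source.wav"
    commands = [
        # 1. Download the audio track and convert it to WAV.
        ["yt-dlp", "-x", "--audio-format", "wav",
         "-o", str(work / "source.%(ext)s"), url],
    ]
    audio = source
    if start is not None and end is not None:
        # 2. Optional trim to the melody window (values in seconds).
        audio = work / "trimmed.wav"
        commands.append(["ffmpeg", "-y", "-i", str(source),
                         "-ss", str(start), "-t", str(end - start), str(audio)])
    # 3. Stem separation; Demucs writes <out>/<model>/<track>/<stem>.wav.
    separated = work / "separated"
    commands.append(["demucs", "-n", model, "-o", str(separated), str(audio)])
    stem_wav = separated / model / audio.stem / f"{stem}.wav"
    # 4. Transcription; the basic-pitch CLI takes <output dir> <input audio>.
    commands.append(["basic-pitch", str(work / "midi"), str(stem_wav)])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` in order; keeping the commands as data makes the chain easy to inspect before anything is downloaded.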
## Requirements
System tools (install once):
```bash
brew install ffmpeg
```
Python deps are managed with [uv](https://docs.astral.sh/uv/):
```bash
uv sync
```
That installs `yt-dlp`, `demucs`, `basic-pitch`, etc. into a local `.venv`.
## Usage
Single track (default: the chosen melody stem and the full mix are saved as `.wav`, MIDI generated for `other`):
```bash
uv run am2mid "https://www.youtube.com/watch?v=zGDzdps75ns" \
--range "1:05-2:05" \
--out ./output
```
Skip transcription if you plan to use NeuralNote VST instead:
```bash
uv run am2mid "<url>" --range "1:05-2:05" --no-midi
```
Transcribe every stem (not just `other`):
```bash
uv run am2mid "<url>" --range "1:05-2:05" --midi-all
```
Quantize the transcription to a 1/16th grid (BPM auto-detected from the drums stem):
```bash
uv run am2mid "<url>" --range "1:05-2:05" --quantize 1/16
```
Force a specific BPM and also keep the stem .wav files:
```bash
uv run am2mid "<url>" --range "1:05-2:05" --quantize 1/16 --bpm 138 --with-stems
```
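To make the `--quantize` examples above more concrete, here is a minimal sketch of grid snapping. It assumes 4/4 time, treats the subdivision as a fraction of a whole note (so `1/16` is a quarter of a beat), and snaps a single timestamp to the nearest grid line. The function names are hypothetical and am2mid's own quantizer may work differently.

```python
def parse_subdivision(text):
    """Accept '1/16' or '16' and return the denominator as an int."""
    value = text.strip()
    if value.startswith("1/"):
        value = value[2:]
    n = int(value)
    # Allowed values are powers of two between 1 and 32.
    if not 1 <= n <= 32 or n & (n - 1):
        raise ValueError(f"unsupported subdivision: {text!r}")
    return n


def quantize_time(seconds, bpm, subdivision):
    """Snap a time in seconds to the nearest grid line."""
    beat = 60.0 / bpm                 # one quarter note, in seconds
    step = beat * 4.0 / subdivision   # grid spacing (assumes 4/4)
    return round(seconds / step) * step
```

At 120 BPM a 1/16 grid has a spacing of 0.125 s, so `quantize_time(0.30, 120, parse_subdivision("1/16"))` lands on 0.25 s.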
Use the 6-stem model when the lead is a recognizable guitar/piano:
```bash
uv run am2mid "<url>" --range "1:05-2:05" --model htdemucs_6s --stem piano
```
Batch mode from a YAML file (see [`example.yaml`](./example.yaml)):
```bash
uv run am2mid --batch example.yaml
```
### Options
| Flag | Default | Description |
|------|---------|-------------|
| `url` (positional) | — | YouTube URL (omit if using `--batch`). |
| `--range`, `-r` | full track | Melody window, format `MM:SS-MM:SS`. |
| `--batch`, `-b` | — | YAML file with a list of songs. |
| `--out`, `-o` | `./output` | Output directory. |
| `--name`, `-n` | from video title | Folder name (single-URL mode). |
| `--stem` | `other` | Which stem gets transcribed to MIDI. Use `--all-stems` to also save the other stems as `.wav`. |
| `--model` | `htdemucs` | Demucs model (use `htdemucs_6s` for guitar/piano stems). |
| `--midi-all` | off | Transcribe every stem, not just `--stem`. |
| `--no-midi` | off | Skip basic-pitch (use a VST instead). |
| `--quantize`, `-q` | off | Snap MIDI notes to a grid. Subdivision as `1/16`, `1/8`, `16`… (powers of 2, 1–32). Writes a sidecar `*.q<N>.mid` (for example `.q16.mid` for `1/16`). |
| `--bpm` | `auto` (when `--quantize`) | Tempo. Either a number (`138`) or `auto` to detect from the drums stem. |
| `--with-stems`, `-s` / `--no-stems` | on | Save the chosen melody stem `.wav` (only `--stem`). |
| `--all-stems`, `-a` | off | Save ALL stems as `.wav` (for A/B comparison). |
| `--with-full` / `--no-full` | on | Save the un-separated full audio as `midi/stems/<name>_full.wav`. |
| `--bpm-range` | `120-160` | Window used to fold auto-detected BPM (e.g. 70 → 140). Format `MIN-MAX`. Ignored when `--bpm` is explicit. |
| `--force`, `-f` | off | Overwrite existing midi/stem files without prompting. |
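The `--range` format from the table (`MM:SS-MM:SS`) can be converted to seconds with a few lines of Python. This is only a sketch: `parse_timestamp` and `parse_range` are hypothetical names, and the validation rules (seconds below 60, end after start) are assumptions rather than documented behavior.

```python
def parse_timestamp(text):
    """Convert 'MM:SS' to a number of seconds."""
    parts = text.strip().split(":")
    if len(parts) != 2:
        raise ValueError(f"expected MM:SS, got {text!r}")
    minutes, seconds = int(parts[0]), int(parts[1])
    if minutes < 0 or not 0 <= seconds < 60:
        raise ValueError(f"invalid timestamp: {text!r}")
    return minutes * 60 + seconds


def parse_range(text):
    """Convert 'MM:SS-MM:SS' to a (start, end) pair in seconds."""
    start_text, sep, end_text = text.partition("-")
    if not sep:
        raise ValueError(f"expected MM:SS-MM:SS, got {text!r}")
    start, end = parse_timestamp(start_text), parse_timestamp(end_text)
    if end <= start:
        raise ValueError("range end must be after range start")
    return start, end
```

For example, `parse_range("1:05-2:05")` gives `(65, 125)`, a 60-second window.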
### Output layout
```
output/
  midi/
    <name>.mid              # primary MIDI (the --stem one, default 'other')
    <name>.q16.mid          # quantized sidecar (if --quantize)
    <name>_drums.mid        # extra MIDIs when --midi-all
    stems/                  # created when --with-stems / --all-stems / --with-full
      <name>_full.wav       # un-separated audio (default on)
      <name>_other.wav      # chosen melody stem (default on)
      <name>_drums.wav      # other stems only with --all-stems
      ...
  .work/<name>/             # raw download + demucs intermediates
```
`<name>` is your `--name`, or the YouTube title sanitized into a readable, filesystem-safe form.
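The README does not spell out the sanitizing rules, so the following is just one plausible way to turn a video title into a filesystem-safe folder name. The function name `sanitize_title`, the allowed character set and the `untitled` fallback are all assumptions.

```python
import re
import unicodedata


def sanitize_title(title):
    """Reduce a title to ASCII letters, digits, spaces, '.', '_' and '-'."""
    # Split accented characters into base letter + mark, then drop the marks.
    text = unicodedata.normalize("NFKD", title)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Replace runs of anything outside the allowed set with a space.
    text = re.sub(r"[^A-Za-z0-9._ -]+", " ", text)
    # Collapse whitespace and trim leading/trailing spaces and dots.
    text = re.sub(r"\s+", " ", text).strip(" .")
    return text or "untitled"
```

Path separators, colons and question marks are removed this way, while the result stays readable.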
### YAML batch + global defaults + per-song overrides
`example.yaml`:
```yaml
defaults: # any CLI flag, applied to every song; CLI overrides win
quantize: 1/16
bpm: auto # auto-detected, folded into bpm_range
bpm_range: 120-160 # trance-friendly window
with_full: true
songs:
- url: https://www.youtube.com/watch?v=...
name: my-track
range: "1:05-2:05"
- url: https://www.youtube.com/watch?v=...
name: tricky-tempo
bpm: 140 # per-song override beats defaults + CLI
all_stems: true # also dump every stem just for this track
```
Run with:
```bash
uv run am2mid --batch example.yaml
# Override a default at runtime (per-song overrides still win):
uv run am2mid --batch example.yaml --no-full
```
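The precedence described above (`defaults`, then CLI flags, then per-song keys) amounts to a layered dictionary merge. The sketch below assumes the YAML has already been loaded into plain dicts and that CLI flags the user did not pass are represented as `None`; `resolve_song_options` is a hypothetical name, not part of am2mid.

```python
def resolve_song_options(defaults, cli_overrides, song):
    """Merge option layers; later layers win over earlier ones."""
    merged = dict(defaults)
    # CLI flags override defaults, but only the ones actually passed.
    merged.update({k: v for k, v in cli_overrides.items() if v is not None})
    # Per-song keys override both defaults and CLI flags.
    merged.update(song)
    return merged


defaults = {"quantize": "1/16", "bpm": "auto", "with_full": True}
cli = {"with_full": False, "bpm": None}           # e.g. --no-full
song = {"url": "https://example.com/track", "bpm": 140}
options = resolve_song_options(defaults, cli, song)
```

Here `options["bpm"]` is `140` (per-song value), `options["with_full"]` is `False` (CLI override) and `options["quantize"]` stays `"1/16"` (default).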
### Auto BPM folding
`auto` (the default when `--quantize` is set) runs librosa's beat tracker on the **drums** stem, which is cleaner than the full mix. The raw estimate is then folded into `--bpm-range` (default `120-160`) by repeated doubling or halving, so a half-time detection such as 70 BPM becomes 140 BPM. An explicit `--bpm 138` is never altered.
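The folding step can be sketched in a few lines. This is an illustration under assumptions, not am2mid's implementation: `fold_bpm` is a hypothetical name, and the behavior when no doubled or halved value fits the window (possible because the default window spans less than an octave) is a choice made for this example.

```python
def fold_bpm(raw_bpm, low=120.0, high=160.0):
    """Double or halve a tempo estimate until it falls inside [low, high]."""
    if raw_bpm <= 0 or low <= 0 or high < low:
        raise ValueError("tempo and window must be positive, with low <= high")
    bpm = float(raw_bpm)
    while bpm < low:      # half-time detection: 70 -> 140
        bpm *= 2.0
    while bpm > high:     # double-time detection: 280 -> 140
        bpm /= 2.0
    # If no power-of-two multiple fits (e.g. 100 with 120-160), the value
    # returned here lies below `low`; callers may want to warn in that case.
    return bpm
```

With the default window, `fold_bpm(70)` and `fold_bpm(280)` both give `140.0`, while `fold_bpm(138)` is returned unchanged.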
## Tips for trance leads
- The synth lead almost always lands in the `other` stem of `htdemucs` (the default 4-stem model). If the track has a clear guitar or piano lead, switch to the 6-stem model with `--model htdemucs_6s --stem guitar` (or `piano`).
- Pick a clean melodic section (no vocal chops, no big build-up FX) with `--range` — Basic Pitch is much happier with 30–60s of clear melody than the whole 7-minute track.
- For best quality, keep the stem (`--with-stems`, on by default) and run it through [NeuralNote](https://github.com/DamRsn/NeuralNote) in your DAW, then nudge octaves and quantize manually.