{"id":49250432,"url":"https://github.com/simonholliday/subsample","last_synced_at":"2026-04-25T00:02:44.085Z","repository":{"id":351285757,"uuid":"1184700479","full_name":"simonholliday/subsample","owner":"simonholliday","description":"Open-source Python live sampler, automatic drum-kit builder, and MIDI sample instrument. Records or imports any audio, fingerprints each sound across 58 acoustic dimensions, and maps them to MIDI notes by similarity - automatically, in real time.","archived":false,"fork":false,"pushed_at":"2026-04-14T10:28:06.000Z","size":2403,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-14T12:22:49.151Z","etag":null,"topics":["audio","audio-analysis","beat-detection","digital-signal-processing","drum-machine","dsp","field-recording","live-coding","midi","music","music-information-retrieval","music-production","osc","pitch-shifting","python","recorder","sample","sample-library","sampler","time-stretching"],"latest_commit_sha":null,"homepage":"https://simonholliday.com/projects/subsample","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonholliday.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-17T21:06:55.000Z","updated_at":"2026-04-14T10:28:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/simonholliday/subsample","commit_stats":null,"previous_names":["si
monholliday/subsample"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/simonholliday/subsample","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonholliday%2Fsubsample","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonholliday%2Fsubsample/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonholliday%2Fsubsample/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonholliday%2Fsubsample/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simonholliday","download_url":"https://codeload.github.com/simonholliday/subsample/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonholliday%2Fsubsample/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32245154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T13:21:15.438Z","status":"ssl_error","status_checked_at":"2026-04-24T13:21:15.005Z","response_time":64,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","audio-analysis","beat-detection","digital-signal-processing","drum-machine","dsp","field-recording","live-coding","midi","music","music-information-retrieval","music-production","osc","pitch-shifting","python","recorder","sample","sample-library","sampler","time-stretching"],"created_at":"2026-04-25T00:02:38.422Z","updated_at":"2026-04-25T00:02:44.042Z","avatar_url":"https://github.com/simonholliday.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Subsample\n\n**Cross-platform open-source Python live sampler, automatic drum-kit builder,\nand MIDI sample instrument.** Point a microphone at the world (or feed in\nfield recordings, sample packs, or radio captures) and Subsample captures,\ntrims, analyses, and routes every distinct sound into a playable, mix-ready\nMIDI instrument, automatically, in real time. Workflows that normally require\nexpensive hardware samplers, sample-pack organiser plugins, or hours of manual\nchopping and tagging happen continuously in the background.\n\nBuild a custom drum kit from your favourite vinyl. Turn a walk through the\nwoods into a playable instrument. Slice and re-tempo a breakbeat. Feed a pile\nof unsorted samples in and watch them organise themselves. All four are the\nsame workflow.\n\nTraditional samplers - hardware or software - require you to manually record,\nchop, name, categorise, and map every sample by hand. 
Subsample automates the\nentire pipeline: it detects individual sounds from a live audio stream or\npre-recorded files, builds a 58-element acoustic fingerprint for each one,\nassigns them to MIDI notes based on how they sound, and runs a per-sample DSP\nprocessing chain that adapts its parameters from the audio content itself. A\nchaotic environment becomes an organised, mix-ready sample instrument while\nyou focus on playing.\n\n\n## Contents\n\n- **1. Getting Started**\n  - [Why Subsample?](#why-subsample)\n  - [At a glance](#at-a-glance)\n  - [Quick start](#quick-start)\n- **2. Concepts**\n  - [How it works](#how-it-works)\n  - [MIDI map](#midi-map)\n  - [Similarity engine](#similarity-engine)\n  - [Transforms](#transforms)\n- **3. Configuration \u0026 Operation**\n  - [Configuration](#configuration)\n  - [Output](#output)\n  - [Instrument sample library](#instrument-sample-library)\n  - [Reference sample library](#reference-sample-library)\n  - [Live-coding the MIDI map](#live-coding-the-midi-map)\n- **4. Integration**\n  - [Virtual MIDI](#virtual-midi)\n  - [OSC integration](#osc-integration)\n  - [Works with Subsequence](#works-with-subsequence)\n- **5. Project Info**\n  - [Performance](#performance)\n  - [Scripts](#scripts)\n  - [Roadmap](#roadmap)\n  - [Architecture](#architecture)\n  - [Requirements](#requirements)\n  - [Tests](#tests)\n  - [Type Checking](#type-checking)\n  - [Dependencies and Credits](#dependencies-and-credits)\n  - [About the Author](#about-the-author)\n  - [License](#license)\n  - [Commercial licensing](#commercial-licensing)\n\n\n## Why Subsample?\n\n- **A studio sampler that builds itself.** Drop samples in (or record them\n  live) and Subsample maps them to MIDI notes, processes them through an\n  adaptive DSP chain, and presents a playable, mix-ready instrument with no\n  manual chopping, naming, or mapping. 
Free, open-source, and runs anywhere\n  CPython 3.12 does - from a Raspberry Pi in the rehearsal room to a studio\n  Mac or Linux rack server.\n- **Automatic similarity-based sample organisation.** A 58-dimensional\n  acoustic fingerprint matches kicks to kick pads, snares to snare pads,\n  hi-hats to hi-hat pads - no labels, no training data, no manual tagging.\n  The same engine handles tonal samples without special treatment, and works\n  equally well on a chaotic Splice library or a fresh field-recording session.\n- **Real-time live sampling.** Point a microphone at the world and Subsample\n  captures, trims, analyses, and adds every distinct sound event to your\n  instrument library as it happens. Adaptive noise floor tracking works in\n  noisy rehearsal rooms as well as quiet studios; back-to-back sounds are\n  captured reliably with zero-gap detection.\n- **Beat slicer and auto-quantize for loops.** Detected onsets in long samples\n  are individually placed on a beat grid using onset-aligned timemaps - loops\n  snap to your target BPM with musical precision. A pad-quantize mode\n  preserves natural timbre by inserting silence between hits instead of\n  time-stretching. Per-hit segment playback: cycle through hits with\n  `round_robin`, pick randomly with `random`, or map specific segments to\n  specific notes by index.\n- **Pitched and percussive in one engine.** Tonal samples are auto-detected by\n  a seven-criterion stability gate and pitch-shifted across the keyboard range\n  at the highest available quality (Rubber Band offline finer mode). Drums,\n  melodic, and effect samples share one library and one workflow.\n- **17-processor DSP chain with intelligent defaults.** Compression, gating,\n  transient shaping, filters, distortion, saturation, vocoder cross-synthesis,\n  beat-quantize, pitch-shift, time-stretch, reverse, envelope reshape, and\n  HPSS harmonic/percussive separation. 
Every parameter auto-adapts to each\n  sample's analysis data - write `compress: true` and the right threshold,\n  attack, and release are derived from the audio. Variants are pre-rendered\n  in a background worker pool and ready before you press a key.\n- **Sweep anything with a knob.** Bind any numeric parameter - filter cutoff,\n  beat-quantize amount, distortion drive, compression threshold - to a MIDI\n  CC controller. Variants are re-rendered in the background between knob\n  positions and bridged smoothly, so you can play with parameters that aren't\n  normally automatable on samplers at all.\n- **Multichannel in, multichannel out.** Records from any subset of physical\n  inputs on a multi-channel interface (e.g. inputs 3-4 of a Focusrite Scarlett\n  18i20). Routes individual instruments to specific outputs (kick to outputs\n  1-2, snare to outputs 3-4) for separate external processing. Standard\n  ITU-R BS.775 downmix and conservative upmix for stereo, quad, 5.1, and 7.1.\n  First-order ambisonic capture from tetrahedral mics (Rode NT-SF1,\n  generic A-format, or pre-encoded B-format FuMA/AmbiX) with decoder and\n  rotation at playback time - see [Ambisonic](#ambisonic-capture).\n- **WAV or lossless FLAC storage.** Opt into FLAC (`audio_format: flac`) to\n  shrink your sample library by ~40-60% with zero quality loss. Existing\n  WAV samples continue to load unchanged alongside any new FLAC captures -\n  see [Storage format](#storage-format).\n- **Visual sample previews.** Every capture gets a fixed 1024x256 `.preview.png`\n  thumbnail (waveform + 4-band frequency skyline + onset ticks + pitch/BPM\n  badge) for browsing in an OS file manager, plus a compact preview-data\n  block embedded in the analysis sidecar that the Supervisor dashboard\n  renders as scalable SVG on demand - see [Sample previews](#sample-previews).\n- **Headless and config-driven.** Everything is YAML - version-controllable,\n  reproducible, no GUI required. 
Runs equally well on a studio Mac, a\n  Raspberry Pi in the rehearsal room, or a rack server. Drive it from any\n  DAW, hardware controller, or sequencer over standard MIDI.\n- **Plays nicely with the rest of your studio.** Standard MIDI input from any\n  DAW or hardware controller, [virtual MIDI](#virtual-midi) ports for\n  software-only routing on the same machine, [OSC integration](#osc-integration)\n  for talking to sequencers and visualisers, and a ready-to-play GM drums map\n  that turns any sample collection into a coherent, pre-mixed drum kit on\n  first play.\n- **Pairs with Subsequence.** Subsample is one part of a fully open-source\n  generative sampler workstation - its sister project\n  [Subsequence](https://github.com/simonholliday/subsequence) is a Python\n  MIDI sequencer. Subsequence drives the patterns; Subsample provides the\n  sounds. Each works independently - see [Works with Subsequence](#works-with-subsequence).\n\n\n## At a glance\n\n| | |\n|---|---|\n| **Live capture** | Adaptive noise floor, zero-gap back-to-back detection, S-curve fades |\n| **Analysis** | 58 dimensions across 5 feature groups; cached `.analysis.json` sidecars |\n| **Matching** | Cosine similarity, classification-free, ranked fallback, dynamic re-assignment |\n| **DSP processors** | 17 (filter, comp, gate, distort, saturate, reshape, transient, HPSS, vocoder, repitch, beat-quantize, pad-quantize, ...) 
|\n| **Adaptive defaults** | Compressor, gate, transient shaper, distortion, envelope reshape - all auto-derive parameters from each sample |\n| **Pitch shifting** | Rubber Band offline finer (highest available quality), pre-rendered |\n| **Time stretch** | Beat-quantized with onset-aligned timemaps, partial-quantize amount, pad-quantize alternative for speech |\n| **Segment playback** | Per-hit round-robin, random, or indexed - for sliced loops |\n| **MIDI input** | Hardware port, named virtual port, or both |\n| **MIDI control** | Note on/off, Program Change for banks, CC binding for any numeric parameter |\n| **OSC** | Sender + receiver (optional dependency) |\n| **Audio formats in** | WAV, BWF, FLAC, AIFF, OGG, MP3/MPEG (libsndfile) |\n| **Channels** | Mono through 7.1, ITU-R BS.775 downmix, conservative upmix, per-instrument output routing |\n| **Audio precision** | End-to-end 32-bit float pipeline, 64-bit DSP for IIR filters and envelope followers |\n| **Latency** | Pre-rendered variants - playback is a memory copy into the mix buffer |\n| **Library mgmt** | Memory-bounded with FIFO eviction, persistent disk cache for variants, hot-loading from watched directories |\n| **Live-coding** | Edit the MIDI map YAML and assignments reload on save |\n| **Bank switching** | Multiple instrument directories swappable via MIDI Program Change |\n| **GM drums** | Ready-to-play map of 47 GM percussion instruments with researched mix chain |\n| **Configuration** | YAML, version-controllable, headless, no GUI |\n| **Platform** | Linux, macOS, Windows (via WSL), Raspberry Pi - anywhere CPython 3.12 runs |\n| **License** | AGPL-3.0 (commercial licensing on request) |\n\n\n## How it works\n\n### 1. Capture\n\nSubsample listens continuously to a live audio input and captures every distinct\nsound event. An adaptive noise floor (exponential moving average) tracks the\nambient level in real time, so it works equally well in a quiet studio and a\nnoisy rehearsal space. 
Each captured sound is trimmed with smooth S-curve fades\nto avoid clicks.\n\nAll channel formats are preserved end-to-end - a stereo microphone records and\nplays back in stereo, a quad recording keeps its four channels, and\nmultichannel samples are automatically mapped to the output layout using\nstandard ITU downmix coefficients. On multi-channel interfaces (e.g. Focusrite\nScarlett 18i20), `recorder.audio.input` selects which physical inputs to\nrecord from - for example `[3, 4]` records a stereo pair from inputs 3 and 4.\n\nYou can also feed it pre-recorded WAV files - they pass through the same\ndetection pipeline, making it easy to build sample libraries from existing\nrecordings. For pre-trimmed sources (commercial sample packs, field recordings,\nSDR radio captures), `import_samples.py` bypasses detection entirely and imports\nfiles directly with silence trimming, safety fades, re-encoding, and full\nanalysis.\n\n### 2. Analyse\n\nEach captured sound is fingerprinted across 58 acoustic dimensions spanning five\ngroups:\n\n| Group | Dimensions | What it captures |\n|-------|-----------|------------------|\n| Spectral shape | 14 | Brightness, noisiness, attack/release character |\n| Sustained timbre | 12 | Steady-state tonal colour |\n| Timbre dynamics | 12 | How the sound evolves over time |\n| Attack character | 12 | Transient signature |\n| Band energy | 8 | Per-band energy distribution and decay (drum-type signature) |\n\nTonal sounds are identified by a seven-criterion pitch stability gate - only\nsamples with a single, confident, stable pitch are flagged for chromatic mapping.\nPercussive sounds are handled naturally by the same feature space without special\ntreatment.\n\nAnalysis results are cached as `.analysis.json` sidecar files alongside each\nWAV. The cache is versioned and auto-invalidating - when the analysis algorithm\nimproves, stale sidecars are detected and re-analysed automatically on startup.\n\n### 3. 
Assign\n\nSounds are matched to your reference library using cosine similarity on the\n58-element feature vector. The best kick-like sound maps to your kick pad; the\nbest snare maps to your snare. When multiple notes share a reference, they\nreceive ranked matches: first note gets the best match, second note gets the\nsecond-best, and so on.\n\nAs new sounds arrive, assignments update dynamically. Evicted samples are\nreplaced by the next-best match. The instrument stays playable and fresh without\nany manual intervention.\n\n### 4. Process and mix\n\nEach assigned sample passes through a per-instrument DSP processing chain before\nplayback. The chain is declared in the MIDI map - a sequence of processors that\ncan include filtering, compression, limiting, gating, distortion, saturation,\nenvelope reshaping, transient shaping, time-stretching, pitch-shifting, reversal,\nharmonic/percussive separation, and beat quantization. Variants are computed\noffline in a background worker pool and cached to disk, so by the time you press\na key the processed audio is already waiting in memory.\n\nEvery processor is designed with **intelligent defaults that adapt per sample**.\nFilters default to classic console channel-strip values (80 Hz HPF, 16 kHz LPF).\nThe compressor analyses each sample's peak level, onset speed, and decay\ncharacter to set threshold, attack, and release automatically - a percussive kick\ngets a slow attack that preserves the beater transient, while a sustained pad\ngets a faster attack with longer release to avoid pumping. The gate reads the\nnoise floor to set its threshold. Transient shaping reads the crest factor to\ndecide how much punch to add or remove. Envelope reshape reads the decay\ncharacter to tighten the tail. 
Write `compress: true` or `transient: true` and\nthe right parameters are derived from the audio itself.\n\nBeat-quantized time-stretching locks samples to a target BPM using onset-aligned\ntimemaps - each onset is individually placed on the beat grid with minimal\nstretching between them. For speech and other material where time-stretch\nartifacts are unacceptable, pad-quantize snaps onsets to the grid by inserting\nsilence instead, preserving natural timbre completely.\n\nThe included `midi-map-gm-drums.yaml` applies all of this across the full GM\npercussion set: 47 instruments, each with researched filtering, compression\n(where appropriate), panning, and gain. The result is a coherent, pre-mixed drum\nkit from whatever samples you have - no manual tweaking required. Every setting\ncan be overridden by an experienced user who wants precise control.\n\n## MIDI map\n\nThe MIDI map is where Subsample becomes an instrument you can *play*. It is the\nmost expressive MIDI routing surface of any sampler we know of: you don't just\nassign samples to notes, you write *rules* that pick samples from your library\nat trigger time - by similarity to a reference, by analysis metadata, by age,\nby user-defined scoring functions, or by whatever combination you can write\ndown in a few lines of plain text. Samples can then be reshaped on the way\nout through an ordered effects chain with MIDI CC control over every\nparameter.\n\nThere is real complexity here - the price of a surface this expressive. The\nrest of this section leads you in gently. A five-step tutorial first, each\nstep adding one concept on top of the last. 
Then the complete reference, then\nthe advanced features (banks, ambisonic capture, MIDI CC mapping).\n\nMIDI routing is defined in a YAML file - by default `midi-map.yaml` in the\nproject directory, referenced from `config.yaml`:\n\n```yaml\nplayer:\n  midi_map: midi-map.yaml\n```\n\nTwo maps ship with the project:\n\n- **[midi-map.yaml.default](midi-map.yaml.default)** - a heavily-commented\n  template; open it, copy to `midi-map.yaml`, uncomment the example you want,\n  and go.\n- **[midi-map-gm-drums.yaml](midi-map-gm-drums.yaml)** - a complete General\n  MIDI percussion kit, ready to play against any sample library you point it\n  at. Instant kit, no tweaking needed.\n\n### Tutorial - five steps from simple to expressive\n\nThe examples below are working YAML. Each one is a self-contained\n`assignments:` entry. Copy any of them into `midi-map.yaml` under\n`assignments:` and reload to hear it.\n\n#### Step 1 - play one specific sample\n\nThe simplest possible assignment: MIDI note 36 (on channel 10, the GM drum\nchannel) always plays one named sample.\n\n```yaml\n- name: My favourite kick\n  channel: 10\n  notes: 36\n  select:\n    where:\n      name: 2026-03-24_14-37-14\n```\n\n`name` matches a sample's filename stem (no extension, no path). Strike note\n36 and Subsample plays that exact sample. Everything else in the library is\nignored for this assignment.\n\n#### Step 2 - \"find me the best kick\"\n\nNow the interesting bit. Instead of naming a specific sample, describe the\n*kind* of sample you want. Subsample's similarity engine will pick the closest\nmatch from your library - every time you load new samples, the best candidate\nmay change, but you never have to rewrite the YAML.\n\n```yaml\n- name: Any kick\n  channel: 10\n  notes: 36\n  select:\n    where:\n      reference: samples/reference/GM36_BassDrum1.wav\n```\n\n`reference` points at a reference WAV shipped in `samples/reference/`. 
The\nlibrary's samples are ranked against this reference by a 58-dimensional\nspectral/rhythmic fingerprint; the top-ranked match plays. (When `reference`\nis set and no `order` is given, `order: [{ by: similarity, dir: desc }]` is\nassumed - see [Implicit defaults](#implicit-defaults) further down.)\n\n#### Step 3 - rule-based selection\n\nFilter the library by analysis metadata, sort the qualifying samples, and\npick one. This example plays the **oldest pitched sample** across a whole\nkeyboard range, pitch-shifted to each MIDI note:\n\n```yaml\n- name: Pitched keyboard\n  channel: 1\n  notes: C2..C6\n  select:\n    where:\n      pitched: true              # only samples with a stable detected pitch\n    order:\n      - { by: age, dir: asc }    # oldest first\n    pick: 1                      # take the top result\n  process:\n    - repitch: true              # pitch-shift each note to its MIDI value\n  one_shot: false                # release on note-off (sustained playback)\n```\n\nThe `notes: C2..C6` range expands to every MIDI note between C2 and C6 - one\nassignment, 49 notes. `repitch: true` pitch-shifts the chosen sample per note.\n\n#### Step 4 - process the sample on the way out\n\nEverything in `process:` is an ordered audio-effects pipeline. Order matters -\nthe sample flows through top to bottom.\n\n```yaml\n- name: Warm keys\n  channel: 1\n  notes: C2..C6\n  select:\n    where: { pitched: true }\n    order: [{ by: age, dir: asc }]\n  process:\n    - filter_low: { freq: 2000, resonance: 6 }   # low-pass with resonant peak\n    - saturate: { drive: 4 }                      # analog-style soft-clip\n    - compress: true                              # adaptive dynamics\n    - repitch: true                               # then pitch-shift\n  one_shot: false\n```\n\nEvery processor accepts `true` for sensible defaults, or a dict for\nfine-grained control. 
All the parameters of every processor are documented in\nthe [Process](#process---how-to-present-the-sample) reference below.\n\n#### Step 5 - lock a loop to your session tempo\n\n`stretch_quantize` time-stretches a sample to a target BPM and snaps its onsets\nto a beat grid - turning any loosely-timed loop in your library into something\nlocked to the session. Combine it with filtering for a length+rhythm pick:\n\n```yaml\n- name: Tight loops\n  channel: 2\n  notes: C3..C4\n  select:\n    where:\n      duration: { gte: 1.0, lt: 8.0 }   # at least 1 second, under 8 seconds\n      onsets:   { gte: 4 }              # at least 4 transients\n    order:\n      - { by: duration, dir: desc }     # prefer longer loops\n  process:\n    - stretch_quantize: { strength: 0.7 }  # 70% snap - loose but locked\n  one_shot: false\n```\n\n`duration`, `onsets`, and other numeric predicates take per-field operator\ndicts (`gte`, `lte`, `gt`, `lt`, `eq`). `strength: 0.7` is a partial-quantize\namount - fully snapped at 1.0, unchanged at 0.0.\n\nThat's the ladder. The rest of this section is the full reference - every\nfield, every predicate, every processor, every option - then the advanced\nfeatures (banks, ambisonic capture, MIDI CC mapping).\n\n### The GM drums map - instant professional drum kit\n\nBefore the reference, a quick mention of the \"no-config\" path. If you want a\ncomplete drum kit in under a minute, use\n[midi-map-gm-drums.yaml](midi-map-gm-drums.yaml) directly. 
Point it at your\ninstrument directory (any sample collection will do) and every MIDI drum note\nautomatically finds the closest matching sample and plays it through a\nprofessional mix chain:\n\n- **Similarity matching** - each note finds the best sample via spectral\n  fingerprint comparison against GM reference sounds\n- **Console-style filtering** - per-instrument HPF/LPF to carve frequency space\n  (30 Hz HPF on kicks, 300 Hz on hi-hats, 1 kHz on triangles, etc.)\n- **Adaptive compression** on 28 transient instruments - threshold, attack, and\n  release auto-adapt to each sample's analysis data.  Foundation sounds get\n  tailored settings: kicks at 6:1 with 15 ms attack (beater punch + thick body),\n  snares at 5:1 with 8 ms attack (stick crack + ring), hi-hats at gentle 2:1\n  (consistency without flattening dynamics).  Cymbals, shakers, and expressive\n  instruments are left uncompressed.\n- **Audience-perspective panning** - hi-hats left, ride right, toms spread\n  across the stereo field, kick and snare near centre\n- **Gain balancing** - cymbals and small percussion pulled back so the kit sits\n  together without any one instrument dominating\n\nThe result: a new user with a collection of recorded samples hears a coherent,\npre-mixed drum kit on first play - no manual configuration needed.\n\n---\n\n**Reference - every option.** From here on, this section is reference material:\nevery field, every predicate, every processor option. 
Skim it once; come back\nwhen you want to try something the tutorial didn't show.\n\n---\n\n### Assignment fields\n\n| Field | Required | Description |\n|-------|----------|-------------|\n| `name` | yes | Label shown in logs |\n| `channel` | yes | MIDI channel 1-16 (standard numbering) |\n| `notes` | yes | Single note, list, or range (see Note syntax below) |\n| `select` | yes | Which sample to play (see Select below) |\n| `process` | no | How to present it (see Process below) |\n| `one_shot` | no | `true` = play to natural end regardless of note-off (default). `false` = fade out on note-off |\n| `gain` | no | Level offset in dB (default 0.0). Negative = quieter, positive = louder |\n| `pan` | no | Per-channel weights (constant-power normalised at mix time) e.g. `[50, 50]` = centre (default). Ratios matter, not absolute values: `[1, 1]` and `[100, 100]` are both centre. |\n| `output` | no | Physical output channels (1-indexed) e.g. `[3, 4]` routes to outputs 3-4 |\n\n### Note syntax\n\n```yaml\nnotes: 36          # single MIDI note number\nnotes: C4          # note name (C4 = MIDI 60, same as Ableton/Logic/FL Studio)\nnotes: [36, 35]    # list - each gets the next similarity rank (first = best match)\nnotes: C2..C4      # range - expands to every MIDI note from C2 (36) to C4 (60)\nnotes: 36..60      # range with note numbers\n```\n\nNote names use the convention C4 = 60 (C-1 = 0, G9 = 127). Sharps: `C#4`,\n`D#3`. 
Flats: `Db4`, `Eb3`.\n\n### Select - which sample to play\n\nThe `select` block defines how to choose a sample from the instrument library.\nIt has three parts: filter predicates (`where`), a sort order (`order`), and\na pick position (`pick`).\n\n```yaml\nselect:\n  where:\n    duration: { gte: 1.0 }                # at least 1 second long\n    onsets:   { gte: 4 }                  # at least 4 transient hits\n  order:\n    - { by: age, dir: desc }              # most recently captured first\n  pick: 1                                 # take the first match\n```\n\n`select` is usually a single block like this. It can also be a *list* of\nblocks for fallback chains - try the first, and if nothing matches try the\nnext. See [Fallback chains](#fallback-chains) below.\n\nAll `where` predicates must pass (AND logic).\n\nNumeric predicates (`duration`, `onsets`, `tempo`, `pitch`, `quantized_beats`)\nuse a per-field operator dict. Operators:\n\n| Operator | Meaning |\n|-----|---------|\n| `gte` | `\u003e=` inclusive lower bound |\n| `lte` | `\u003c=` inclusive upper bound |\n| `gt` | `\u003e` strict lower bound |\n| `lt` | `\u003c` strict upper bound |\n| `eq` | `==` exact equality |\n\nAny combination on one field AND-composes. A bare scalar under a numeric field\nis shorthand for `eq` — e.g. `quantized_beats: 4` is the same as\n`quantized_beats: { eq: 4 }`.\n\n| Predicate | Type | Description |\n|-----------|------|-------------|\n| `duration` | float (seconds) | Filter by sample length. Example: `{ gte: 1.0, lt: 5.0 }` |\n| `onsets` | int | Filter by detected transient count. Example: `{ gte: 4 }` |\n| `tempo` | float (BPM) | Filter by detected tempo. Example: `{ gte: 100, lte: 140 }` |\n| `pitch` | Hz or note name | Filter by detected frequency. Each operator value is either a Hz float (`{ gte: 130.8 }`) or a note name (`{ gte: C3, lt: C6 }`). The two forms are interchangeable - note names are converted to Hz at parse time. Sharps: `C#4`; flats: `Db4`. 
|\n| `quantized_beats` | float (beats) | Filter by the beat length of the assignment's `stretch_quantize`/`pad_quantize` output. Samples whose quantized variant has not yet been computed (or whose assignment has no quantize step with a valid BPM) are excluded when this predicate is active. Non-integer values accepted. |\n| `pitched` | bool | `true` = has stable pitch; `false` = not pitched |\n| `reference` | path | Similarity match against a reference sample (path to WAV) |\n| `name` | string | Exact filename stem match. Legacy: a path-like value (containing `/` or starting with `.`) is still auto-detected as a `path:` - see below |\n| `path` | path | Match a specific WAV file at this path (relative paths resolved against the MIDI map's directory). Preferred over `name:` for file references |\n| `directory` | path | Only match samples whose file path is inside this directory (auto-loads on startup; see [Banks vs directory predicate](#banks-vs-directory-predicate)) |\n\n`name:` and `path:` are mutually exclusive within a single `where` block - use\none, not both.\n\n**Legacy `min_X` / `max_X` syntax**: the pre-2026-04 form\n(`min_duration: 1.0`, `max_pitch: A4`, etc.) still works indefinitely —\nthe parser translates each legacy key into the equivalent operator\n(`gte` for `min_`, `lte` for `max_`). Mixing both forms on the same\nfield in one `where` block raises an error; use one form per field. New\nYAML should prefer the operator-dict form.\n\n`order` is a list of clauses. 
Each clause has a `by` (scorer name), a `dir`\n(`asc` or `desc`, default `asc`), and optional scorer-specific parameters.\nLater clauses break ties on earlier ones, so primary sort + secondary\ntie-breaker is natural:\n\n```yaml\norder:\n  - { by: duration, dir: desc }           # primary\n  - { by: onsets,   dir: asc }            # tiebreaker\n```\n\nBuilt-in scorers:\n\n| `by` | What it sorts by |\n|-----|------------------|\n| `age` | Arrival time (sample_id) — `desc` = newest first |\n| `duration` | Sample length in seconds |\n| `pitch` | Dominant frequency |\n| `onsets` | Detected onset count |\n| `tempo` | Detected BPM |\n| `level` | RMS loudness |\n| `quantized_beats` | Beat length of the assignment's `stretch_quantize`/`pad_quantize` output. Samples without a computed variant park at the end regardless of direction. |\n| `similarity` | Similarity rank against the reference in `where`. Only supported as the primary clause; requires `reference` in `where`. When `reference` is set and no `order` is given, `similarity` desc is assumed automatically. |\n| `beat_match` | Cosine similarity between a user-supplied `pattern:` (a list of numbers in `[0, 1]` per beat) and the sample's per-beat energy profile. Requires a `stretch_quantize` or `pad_quantize` step in the same assignment; samples without a quantized variant are excluded from the result. See [Beat-pattern matching](#beat-pattern-matching) below for the full semantics. 
|\n\n#### Implicit defaults\n\nThe parser fills in a few defaults that are easy to miss - they make the\ncommon case concise, but it helps to know which ones are on:\n\n| Omitted key | Default applied | When |\n|---|---|---|\n| `order` | `[{ by: age, dir: desc }]` (newest first) | No `where.reference` set |\n| `order` | `[{ by: similarity, dir: desc }]` | `where.reference` **is** set |\n| `pick` | `1` (best match) for the first note; incremented per note thereafter | Multi-note assignment without `repitch` |\n| `pick` | `1` for every note | Multi-note assignment with `repitch` in `process` |\n| `where` | Empty (all samples match) | `where` block omitted |\n| `process` | Empty (unprocessed playback) | `process` block omitted |\n| `grid` | `16` (sixteenth-note) | `stretch_quantize` / `pad_quantize` without explicit grid |\n| `tempo` | Session `target_bpm` from `config.yaml` | `stretch_quantize` / `pad_quantize` without explicit tempo |\n| `one_shot` | `true` | Omitted from assignment |\n| `gain` | `0.0` dB | Omitted from assignment |\n| `pan` | Identity routing | Omitted from assignment |\n| `output` | Outputs `1..N` | Omitted from assignment |\n\nThe `where.reference` → `similarity desc` coupling is worth calling out: if\nyou set `reference` and then later add another filter like\n`duration: { gte: 1.0 }`, the ordering is still similarity - there's nothing\nvisible in the YAML telling you so. Add an explicit `order:` clause if you\nwant a different sort.\n\n`pick` is 1-indexed. Default: 1 (first match). 
For multi-note assignments\nwithout explicit `pick`, each note gets the next position (rank distribution) -\nso `notes: [36, 35]` gives note 36 pick 1 (best match) and note 35 pick 2.\n\n#### Beat-pattern matching\n\n`beat_match` is the shape-based companion to `similarity`: where `similarity`\nranks by spectral/timbral closeness to a reference sample, `beat_match` ranks\nby *rhythmic* closeness to a user-defined pattern.\n\n**Applies only to quantized samples.** `beat_match` scores the per-beat energy\nprofile that `stretch_quantize` and `pad_quantize` produce as a by-product of\nsnapping onsets to a beat grid. Any assignment that uses `beat_match` in its\n`order:` must therefore include one of those processors in its `process:`\nblock - without a quantize step, no sample has an energy profile to compare\nagainst, and the result set is empty.\n\n```yaml\nselect:\n  where:\n    duration: { gte: 1.0 }\n    onsets:   { gte: 4 }\n  order:\n    - { by: beat_match, pattern: [1, 0, 1, 0, 1, 0, 1, 0] }\nprocess:\n  - stretch_quantize: { grid: 16 }\n```\n\n**The pattern.** A list of numbers in `[0, 1]`, one per beat. Values are\nrelative - only the shape matters, not the absolute magnitudes. Examples:\n\n| Pattern | Intent |\n|---|---|\n| `[1, 0, 1, 0]` | energy on beats 1 and 3 (every other beat, starting on beat 1) |\n| `[0, 1, 0, 1]` | energy on beats 2 and 4 (back-beat feel) |\n| `[1, 0.9, 0.8, 0.7, 0.6, 0.5]` | gentle decay from beat 1 to beat 6 |\n| `[0, 0, 1, 1, 0, 0, 1, 1]` | double-hits on beats 3-4 and 7-8 |\n\n**How a sample is scored.** Each quantized sample has a *grid energy profile* -\nper-slot RMS computed after the quantize step. `beat_match` mean-pools that\nprofile down to per-beat energy (so an 8th-note grid and a 16th-note grid\nboth reduce to the same per-beat values - cross-grid invariance), then\ncomputes cosine similarity between the pattern and the profile over\n`min(len(pattern), len(beats))` elements (left-aligned). 
Samples with no\nquantized variant score `None` and are excluded.\n\n**Behaviour summary:**\n\n- `dir: desc` (default) = best match first. `dir: asc` = worst match first.\n- Shape-sensitive, level-insensitive: `[1, 0, 1, 0]` perfectly matches a\n  sample with energy `[0.5, 0, 0.5, 0]` (score 1.0).\n- Length mismatches are truncated left-aligned — no resampling, no padding.\n- Ints and floats are both accepted in the pattern list; values outside\n  `[0, 1]` are rejected at parse time.\n\n#### Fallback chains\n\n`select` can be a list of specs tried in order. The first that returns a\nresult wins:\n\n```yaml\nselect:\n  - where: { name: my-favourite-kick }                               # try specific sample first\n  - where: { reference: samples/reference/GM36_BassDrum1.wav }       # fall back to similarity match\n```\n\n#### Legacy `order_by:` syntax\n\nThe pre-2026-04 `order_by:` key with a bare-string token is still accepted\nindefinitely — the parser translates it into the equivalent `order:` clause.\nThese two forms produce identical results:\n\n```yaml\n# Legacy (still accepted)\nselect:\n  where: { pitched: true }\n  order_by: pitch_desc\n\n# Preferred (new form)\nselect:\n  where: { pitched: true }\n  order:\n    - { by: pitch, dir: desc }\n```\n\nLegacy tokens map as follows: `newest` → `{by: age, dir: desc}`, `oldest` →\n`{by: age, dir: asc}`, `duration_desc` → `{by: duration, dir: desc}`,\n`loudest` → `{by: level, dir: desc}`, `quietest` → `{by: level, dir: asc}`,\n`quantized_beats_desc` → `{by: quantized_beats, dir: desc}`, `similarity` →\n`{by: similarity, dir: desc}`, and so on — field name without the `_asc`/\n`_desc` suffix goes into `by`, the suffix determines `dir`. 
Mixing both keys\non the same `select` entry is an error.\n\n#### Examples\n\n```yaml\n# GM kicks - ranked by similarity to a kick reference\nselect:\n  where:\n    reference: samples/reference/GM36_BassDrum1.wav\n  order:\n    - { by: similarity, dir: desc }\n\n# Pitched keyboard - oldest tonal sample, repitched per note\nselect:\n  where:\n    pitched: true\n  order:\n    - { by: age, dir: asc }\n  pick: 1\n\n# Rhythmic loops - recent, long, with enough onsets\nselect:\n  where:\n    duration: { gte: 1.0 }\n    onsets:   { gte: 4 }\n  order:\n    - { by: age, dir: desc }\n\n# Longest pitched sample, with onset count breaking ties\nselect:\n  where:\n    pitched: true\n  order:\n    - { by: duration, dir: desc }\n    - { by: onsets,   dir: desc }\n  pick: 1\n```\n\n### Process - how to present the sample\n\nThe optional `process` block declares an ordered list of audio processors\napplied after sample selection. Omit it entirely for unprocessed playback.\n\n```yaml\nprocess:\n  - filter_low: { freq: 800, resonance: 6 }   # low-pass, then\n  - repitch: true                               # pitch-shift, then\n  - saturate: { drive: 4 }                      # saturation\n```\n\nProcessors execute in the order you declare them - different orderings\nproduce different results. 
The full chain is pre-computed and cached.\n\nAvailable processors:\n\n| Processor | Parameters | Description |\n|-----------|-----------|-------------|\n| `repitch: true` | none | Pitch-shift to match the triggering MIDI note |\n| `repitch: { note: C4 }` | target note | Pitch-shift to a fixed note |\n| `stretch_quantize: true` | grid (default 16), tempo (config target_bpm), strength (default 1.0) | Time-stretch to session `target_bpm` with all defaults |\n| `stretch_quantize: { grid: 16 }` | as above, grid overridden | Time-stretch to session `target_bpm` |\n| `stretch_quantize: { tempo: 120, grid: 8 }` | explicit tempo + grid | Time-stretch to a specific tempo |\n| `stretch_quantize: { strength: 0.5 }` | 0.0-1.0 (default 1.0) | Partial quantize - onsets move partway to the grid for a looser feel |\n| `pad_quantize: true` | grid (default 16), tempo (config target_bpm), strength (default 1.0) | Silence-pad onsets with all defaults |\n| `pad_quantize: { grid: 16 }` | as above, grid overridden | Onset-aligned silence padding - snaps onsets to the beat grid by inserting silence between segments rather than time-stretching. No pitch/speed change. Ideal for speech. 
|\n| `pad_quantize: { strength: 0.75 }` | 0.0-1.0 (default 1.0) | Partial quantize - same as stretch_quantize strength but for silence-pad mode |\n| `filter_low: true` | freq (Hz, default 16000), resonance (dB, default 0) | Low-pass filter (console-style default) |\n| `filter_high: true` | freq (Hz, default 80), resonance (dB, default 0) | High-pass filter (console-style default) |\n| `filter_band: true` | freq (Hz, default 1000), q (default 0.7), resonance (dB, default 0) | Band-pass filter (Q sets width) |\n| `reverse: true` | none | Reverse the audio |\n| `saturate: true` | drive (default 6 dB) | Soft-clip saturation with level compensation (moderate default warmth) |\n| `saturate: { drive: 6 }` | drive (dB) | Soft-clip saturation with explicit drive |\n| `compress: true` | threshold (auto), ratio (4:1), attack (auto), release (auto), knee (6 dB), makeup (0 dB), lookahead (0 ms) | Dynamic range compressor (adapts to each sample) |\n| `limit: true` | threshold (-1 dB), release (50 ms), lookahead (5 ms) | Brickwall limiter (ratio 100:1, instant attack) |\n| `hpss: { keep: harmonic }` | keep (required: `harmonic` or `percussive`) | Keep only harmonic/tonal content (remove percussion) |\n| `hpss: { keep: percussive }` | as above | Keep only percussive/transient content (remove harmonics) |\n| `gate: true` | threshold (auto), attack (auto), release (auto), hold (auto), lookahead (auto) | Noise gate - silences audio below the noise floor. All parameters auto-adapt: threshold from noise floor, attack/release/hold from onset and decay character. |\n| `distort: true` | mode (hard_clip), drive (auto), mix (1.0), tone (auto), bit_depth (8), downsample_factor (4) | Waveshaping distortion with four modes: hard_clip, fold, bit_crush, downsample. Drive adapts to crest factor; tone adapts to spectral rolloff. |\n| `reshape: true` | attack (preserve), hold (0), decay (preserve), sustain (1.0), release (auto) | ADSR envelope reshaping. Default auto-tightens the tail. 
Set attack, decay, sustain, release to reshape specific phases. |\n| `transient: true` | gain (auto, dB signed) | Transient enhancement/taming via HPSS rebalancing. Auto-adapts from crest factor: peaky samples are tamed, dull samples enhanced. |\n| `transient: { gain: 6 }` | gain (dB, signed: +/- enhance/tame) | Explicit dB of transient enhancement or taming |\n| `vocoder: { carrier: reference }` | carrier (required), bands (24), depth (1.0), formant_shift (0) | Channel vocoder cross-synthesis. Imposes the sample's spectral envelope onto a carrier signal. `carrier: reference` uses this note's reference sample; or specify a file path. |\n\nAll three filters can be used without parameters - they default to classic\nconsole channel-strip values:\n\n```yaml\nprocess:\n  - filter_high: true    # 80 Hz high-pass  (rumble filter)\n  - filter_low: true     # 16 kHz low-pass  (analog warmth roll-off)\n  - filter_band: true    # 1 kHz band-pass, Q 0.7 (wide mid sweep)\n```\n\nOverride any parameter to taste. All filters are 2nd-order (12 dB/octave),\nflat Butterworth by default. Add resonance for a peak at the cutoff\n(Chebyshev Type I, max 24 dB). Band-pass Q controls width: lower = wider\n(0.7 = gentle sweep), higher = narrower (4.0 = surgical).\n\nThe compressor and limiter share the same DSP back-end (Giannoulis et al.\nfeed-forward design with soft knee and look-ahead). 
`compress: true` adapts to\neach sample automatically using the analysis data:\n\n- **threshold** - set 6 dB below the sample's peak level (always engages)\n- **attack** - slow for percussive samples (lets the transient punch through),\n  fast for gradual onsets (no transient to protect)\n- **release** - short for quick-decay samples (recovers before the next hit),\n  long for sustained sounds (avoids pumping)\n\n```yaml\nprocess:\n  - compress: true                                        # adapts to each sample\n  - compress: { threshold: -30, ratio: 10, attack: 0.5 } # explicit - squash + raise tail\n  - compress: { attack: 5 }                               # explicit attack, rest auto\n  - limit: true                                           # brickwall at -1 dBFS\n```\n\nSet any parameter explicitly to override its auto value. Fixed parameters\n(ratio, knee, makeup, lookahead) always use their defaults unless set.\n\nThe noise gate, distortion, and envelope reshaper follow the same pattern -\n`true` gives you intelligent auto defaults, explicit parameters override:\n\n```yaml\nprocess:\n  - gate: true                              # auto noise gate\n  - gate: { threshold: -40, hold: 20 }      # explicit threshold\n  - distort: true                            # hard-clip with auto drive\n  - distort: { mode: fold, drive: 12 }       # foldback distortion\n  - distort: { mode: bit_crush, bit_depth: 4, mix: 0.5 }\n  - reshape: true                            # auto tail-tightening\n  - reshape: { attack: 5, release: 100 }     # fast attack, controlled release\n  - reshape: { sustain: 0.5, release: 50 }   # half sustain, tight tail\n  - transient: true                          # auto: normalises punch from crest factor\n  - transient: { gain: 6 }                  # enhance transients by 6 dB\n  - transient: { gain: -3 }                 # tame transients by 3 dB\n  - pad_quantize: { tempo: 120, grid: 8 }   # silence-pad onsets to eighth-note grid\n  - vocoder: { 
carrier: reference }           # cross-synthesise with this note's reference\n  - vocoder: { carrier: samples/reference/GM36_BassDrum1.wav, bands: 16, depth: 0.8 }\n```\n\nFor the opposite of snappy drums (bring up room ambience and reverb tails), use\na fast attack (\u003c 1 ms), high ratio (10:1+), and low threshold (-30 dB) to\nsquash transients and raise the relative level of the sustain/decay.\n\nHPSS (Harmonic/Percussive Source Separation) decomposes audio into sustained\ntonal content and transient clicks/hits. Useful as a pre-filter before repitch\n(avoids pitch-shifting drum bleed) or stretch_quantize (cleaner grid alignment).\n\nWhen `repitch` is in the process list, all notes in a multi-note assignment\nshare pick 1 (same sample, pitched per note). Without `repitch`, each note gets\nthe next rank.\n\n#### Legacy `amount:` parameter (still accepted)\n\nFour processors previously shared an `amount:` parameter with wildly different\nunits (dB for `saturate` and `transient`, 0-1 fraction for the two quantizers).\nThe parameter has been renamed per processor so the unit is obvious at the\ncall site. The old `amount:` key still works indefinitely - the parser\ntranslates each one to the appropriate canonical name:\n\n| Processor | Legacy | New (preferred) | Unit |\n|---|---|---|---|\n| `saturate` | `amount` | `drive` | dB |\n| `transient` | `amount` | `gain` | dB (signed: +enhance, -tame) |\n| `stretch_quantize` | `amount` | `strength` | 0.0-1.0 fraction |\n| `pad_quantize` | `amount` | `strength` | 0.0-1.0 fraction |\n\nMixing both names on the same step is rejected at parse time.\n\nThe two separate HPSS processor names `hpss_harmonic: true` and\n`hpss_percussive: true` have been unified into `hpss: { keep: harmonic }` /\n`hpss: { keep: percussive }`. The legacy names still work and translate\ninternally.\n\n`stretch_quantize` and `pad_quantize` now accept `tempo:` instead of `bpm:` -\nmatching the `tempo:` where-predicate. 
Legacy `bpm:` is translated to\n`tempo:` at parse time.\n\nThe processor formerly named `beat_quantize` is now `stretch_quantize`: both\nthe new name and its companion `pad_quantize` describe *how* each quantizer\naligns onsets to a grid - one stretches audio in time, the other pads with\nsilence between segments. The legacy name `beat_quantize` still works -\nthe parser translates it to `stretch_quantize` at parse time, preserving any\nparams.\n\n### Pan\n\n`pan` is a list of per-channel weights. The values are **relative**, not\npercentages: only the ratio between channels matters. `[50, 50]`, `[1, 1]`,\nand `[100, 100]` all produce centre. The raw weights are normalised to\nconstant-power gains at mix time, so perceived loudness stays equal across\npan positions.\n\n```yaml\npan: [50, 50]    # centre (default)\npan: [100, 0]    # hard left\npan: [75, 25]    # left of centre\n```\n\nChannel order follows SMPTE: `[L, R]` for stereo; `[L, R, C, LFE, Ls, Rs]` for\n5.1; `[L, R, C, LFE, BL, BR, SL, SR]` for 7.1. Set `player.audio.channels` in\nconfig to match your output device (default: stereo). Samples of any channel\ncount are automatically mapped to the output layout using ITU-R BS.775 downmix\ncoefficients (surround to stereo, etc.) or conservative upmix (stereo to 5.1\nuses front pair only). Pan weights define a target layout - if the output has\nfewer channels, standard downmix is applied automatically.\n\n#### Output routing\n\nOn a multi-channel interface you can route each instrument to specific physical\noutputs. Numbers are 1-indexed, matching the labels on your hardware:\n\n```yaml\nkick:\n  pan: [50, 50]\n  output: [1, 2]       # main monitors (default when omitted)\n\nsnare:\n  pan: [50, 50]\n  output: [3, 4]       # separate outputs for external processing\n\npad:\n  output: [5, 6]       # stereo sample sent to outputs 5-6\n```\n\nSet `player.audio.channels` in config to match your device (e.g. 8 for a\nFocusrite Scarlett 18i20). 
When `output` is omitted, instruments route to the\nfirst N outputs as before - stereo users see no change.\n\n---\n\n**Going further.** The sections that follow cover the optional advanced\nfeatures: multichannel/ambisonic capture, bank switching for live kit swaps\nvia MIDI Program Change, and MIDI CC control for any numeric processor\nparameter. None of this is needed for a basic setup - skip ahead if you're\njust building a drum kit or pitched keyboard.\n\n---\n\n### Ambisonic capture\n\nFour-capsule tetrahedral mics (such as the Rode NT-SF1) and pre-encoded\nB-format files are supported as first-order ambisonic content. Samples are\nconverted to canonical AmbiX B-format (channel order W, Y, Z, X; SN3D) at\ncapture time and decoded at playback time through a virtual speaker array\nsized to match `player.audio.channels` (mono, stereo, quad, 5.1, or 7.1).\n\nEnable ambisonic capture in `config.yaml`:\n\n```yaml\nrecorder:\n  audio:\n    channels: 4\n    ambisonic_format: a_nt_sf1   # or a_generic, b_fuma, b_ambix\n\nambisonic:\n  decoder: basic                  # basic | max_re | inphase\n  yaw_degrees: 0.0                # rotate before decoding\n  pitch_degrees: 0.0\n  roll_degrees: 0.0\n```\n\nFormat options:\n\n- `a_nt_sf1` - Rode NT-SF1 A-format. Applies a capsule-matching HF shelf\n  pre-matrix and a post-matrix HF shelf on X/Y/Z to compensate for\n  capsule-spacing loss. Best choice for this mic.\n- `a_generic` - Generic tetrahedral A-format with the standard Gerzon\n  matrix, capsule order FLU/FRD/BLD/BRU. 
No capsule calibration applied.\n- `b_fuma` - Pre-encoded B-format in FuMa order (W, X, Y, Z), MaxN.\n  Reordered and renormalised to AmbiX on read.\n- `b_ambix` - Pre-encoded B-format already in AmbiX order - stored\n  unchanged.\n\nDecoder choice affects the spatial character: `basic` has sharp lobes and\nthe best low-frequency behaviour, `max_re` trades some front-energy for\ntighter localisation in the sweet spot, and `inphase` has the softest\nlobes and works best when listening from off-axis positions. Rotation\n(yaw/pitch/roll) is applied before the decoder and is project-wide - all\nambisonic samples rotate together.\n\nAnalysis runs on the W (omnidirectional) channel only, so spectral and\nrhythmic fingerprints reflect the sound-field sum rather than a\ndirectionally biased mix. `pad_quantize` and `stretch_quantize` work on\nambisonic samples using Rubber Band's phase-coherent multichannel engine, so\ninter-channel relationships survive time-stretching within tolerance.\n\n### Banks - switching instrument sets via MIDI\n\nThe MIDI map can optionally declare multiple instrument directories (\"banks\")\nthat are all loaded at startup. Switch between them at runtime using MIDI\nProgram Change messages - no restart, no disk I/O, instant switching:\n\n```yaml\nbanks:\n  - name: \"Acoustic Kit\"\n    directory: samples/acoustic\n    program: 0\n  - name: \"Electronic Kit\"\n    directory: samples/electronic\n    program: 1\n\nbank_channel: 10    # MIDI channel for PC messages (1-16, or 0 = any)\ndefault_bank: 0     # program number to activate at startup (default: first in list)\n```\n\nWhen `banks:` is absent, the single `instrument.directory` from config.yaml is\nused as before. When present, it overrides `instrument.directory`. Each bank\ngets its own sample library, similarity index, and transform cache.\n\nAssignments are bank-agnostic - they query whichever bank is active. 
Named\nsamples (`where: { name: X }`) that only exist in one bank silently produce no\nmatch in other banks; rule-based selects (`reference:`, `pitched:`, etc.) work\nnaturally against whatever samples are present.\n\n#### Banks vs directory predicate\n\nBanks and `where: { directory: ... }` both load samples from a directory, but\nthey solve different problems:\n\n- **Banks** swap the entire sample pool at once. Only one bank is active at a\n  time - a MIDI Program Change switches all assignments to a new set of samples.\n  Use banks when you want the same MIDI map rules to evaluate against completely\n  different sample collections (e.g. \"Acoustic Kit\" vs \"Electronic Kit\").\n\n- **`where: { directory: ... }`** filters within the active pool. It is\n  per-assignment, and multiple assignments can each reference a different\n  directory simultaneously. Use it when different notes in the same map need\n  samples from different directories at the same time (e.g. kicks from one\n  folder, hi-hats from another).\n\n| | Banks | `where: { directory }` |\n|---|---|---|\n| Scope | All assignments share one active bank | Per-assignment filter |\n| Switching | MIDI Program Change swaps the whole pool | Always active |\n| Simultaneous directories | No (one bank at a time) | Yes (each assignment can use a different directory) |\n| Use case | Swap entire kits | Mix sources within one kit |\n\n### CC mapping - real-time parameter control\n\nAny numeric processor parameter can be controlled by a MIDI CC message.\nReplace the scalar value with a CC binding:\n\n```yaml\nprocess:\n  - pad_quantize: { grid: 16, strength: { cc: 1 } }\n  - stretch_quantize: { tempo: { cc: 2, min: 60, max: 180 }, grid: 16 }\n  - filter_low: { freq: { cc: 74, min: 200, max: 16000 } }\n```\n\n| Field | Required | Default | Description |\n|-------|----------|---------|-------------|\n| `cc` | yes | | CC number (0-127) |\n| `min` | no | `0.0` | Output value when CC = 0 |\n| `max` | no | `1.0` | Output 
value when CC = 127 |\n| `default` | no | midpoint | Value before any CC is received |\n| `channel` | no | any | MIDI channel (1-16); omit for omni |\n\nWhen a mapped CC changes, new variants are enqueued after a 200 ms debounce.\nUntil the new variant is ready, the previous processed variant continues to\nplay - giving smooth transitions for gradual changes.\n\n**Important:** use stepped/discrete controllers (knobs, faders, buttons) for CC\nmapping. Do not use pitch bend, aftertouch, or high-resolution continuous\ncontrollers - these generate hundreds of messages per second and would flood the\ntransform queue. Each distinct CC value produces a new variant; the transform\ncache evicts the oldest when its memory budget is exceeded.\n\n### Vocabulary reference\n\nEvery enum-string value the MIDI map accepts, in one place:\n\n| Where | Valid values |\n|---|---|\n| `where` operators | `gte` `lte` `gt` `lt` `eq` |\n| Order `dir` | `asc` `desc` |\n| Order `by` | `age` `duration` `pitch` `onsets` `tempo` `level` `quantized_beats` `similarity` `beat_match` |\n| `notes` range | `\u003clow\u003e..\u003chigh\u003e` (e.g. 
`C2..C4` or `36..60`) |\n| `pitch` predicate value | Hz float (`440`) or note name (`A4`, `C#3`, `Db5`) |\n| `distort` `mode` | `hard_clip` `fold` `bit_crush` `downsample` |\n| Quantize `segment` | `round_robin` `random` or integer (1-indexed) |\n| `vocoder` `carrier` | `reference` (the note's reference sample) or a file path |\n| Legacy `order_by` tokens | `newest` `oldest` `duration_asc` `duration_desc` `pitch_asc` `pitch_desc` `onsets_asc` `onsets_desc` `tempo_asc` `tempo_desc` `loudest` `quietest` `similarity` `quantized_beats_asc` `quantized_beats_desc` |\n| Legacy numeric-predicate keys | `min_duration` `max_duration` `min_onsets` `max_onsets` `min_tempo` `max_tempo` `min_pitch` `max_pitch` `min_quantized_beats` `max_quantized_beats` |\n| Legacy processor names | `beat_quantize` (→ `stretch_quantize`) `hpss_harmonic` (→ `hpss: { keep: harmonic }`) `hpss_percussive` (→ `hpss: { keep: percussive }`) |\n| Legacy processor param names | `amount` (→ `drive` / `gain` / `strength` per processor) `bpm` (→ `tempo` in quantizers) |\n\n## Performance\n\n### Zero-latency playback\n\nWhen a sample enters the library, a background worker immediately produces a\npre-rendered copy at the output device's sample rate and format. Tonal samples\nalso receive a full set of pitch-shifted variants. By the time the first MIDI\nnote fires, the work is already done - playback is a memory copy into the mix\nbuffer, not an on-the-fly calculation. A three-tier fallback guarantees playback\nis never blocked:\n\n1. **Process variant** - pre-computed with the full declared chain (pitch, filter, saturate, reverse, time-stretch, etc.)\n2. **Base variant** - pre-normalised, no DSP (all samples)\n3. **On-the-fly render** - last resort on the very first trigger only\n\n### End-to-end 32-bit float\n\nEvery audio sample is converted to float32 immediately after capture and stays in\nthat format between pipeline stages - analysis, normalisation, pitch shifting,\ngain staging, polyphonic mixing. 
Precision-sensitive operations (IIR filters,\ncompressor/gate envelope followers, gain curve generation) promote to float64\ninternally and return float32. The only integer conversion is a single pack to\nthe hardware's native bit depth at the output. This approach matches professional\nDAW practice, and means that peak-normalising a quiet recording or pitch-shifting\nit across two octaves loses no precision to intermediate integer rounding.\n\n### Non-blocking capture\n\nThe audio input thread does minimal work and returns immediately. Analysis runs\nin a separate auto-scaled worker pool, so back-to-back sounds are captured\nreliably even when spectral analysis is slow. This is critical for USB audio\ndevices, which use isochronous transfers and are sensitive to timing jitter.\n\n### Professional gain staging\n\nEvery voice is RMS-normalised so a quiet recording and a loud one play at\ncomparable levels at the same MIDI velocity. A tanh soft-limiter on the mix bus\nsmoothly compresses peaks that approach 0 dBFS - the output never clips, no\nmatter how many voices overlap, and the character of the sound is preserved.\n\n### Pitch shifting quality\n\nPitch variants are produced using the Rubber Band library's offline finer engine,\nthe highest-quality engine that library provides. Variants are pre-computed\nin the background by a worker pool; no latency is added at trigger time.\n\n## Similarity engine\n\nEvery new sample is scored against every reference using cosine similarity on a\n58-element composite feature vector built from five groups: spectral shape (14\ndimensions), sustained timbre (12), timbre dynamics (12), attack character (12),\nand band energy (8). 
Each group is independently normalised and scaled by a\nconfigurable weight (`similarity.weight_*`), so you can emphasise whichever\nacoustic qualities matter most for your material.\n\nThe key insight: **the same comparison method works for both percussive and tonal\nsounds without needing to classify them first.** A kick drum naturally scores\nhigh on attack character; a violin scores high on sustained timbre. No\nclassifier, no training data, no labelling - just geometry.\n\nFor each reference, an in-memory ranked list of matches is maintained and updated\nincrementally as new recordings arrive or old ones are evicted. See\n[Architecture](#architecture) for the full vector breakdown.\n\n## Transforms\n\nTonal samples with a stable, confident pitch are automatically pitch-shifted to\nevery MIDI note in the assigned note range (e.g. all 128 notes for a full-keyboard\nassignment). Variants are produced in the background by a worker pool and cached\nin a memory-bounded store with parent-priority FIFO eviction - when a variant\nfamily would exceed the memory budget, the entire oldest family is evicted\ntogether, keeping remaining families intact and playable.\n\nVariants are also persisted to a disk cache (`samples/variant-cache/` by default) so\nthey survive restarts. Each variant is stored as a single binary file named by a\nSHA-256 hash of the source audio, transform chain, output sample rate, and\nanalysis version - any change to any of these produces a different key, so stale\ncache hits are impossible. Recently-used files are kept warm (LRU by modification\ntime); oldest files are evicted when the disk budget is exceeded. Quantized\nvariants also store a grid energy profile - per-grid-slot RMS energy normalized\nto [0, 1] - alongside the audio, enabling future complementary pattern matching.\n\nSamples with detected rhythmic content can be time-stretched to a target tempo\nusing the `stretch_quantize` processor in a MIDI map assignment. 
Detected attacks are\nsnapped to a quantized beat grid and the entire mapping is applied in a single\npass using Rubber Band's offline finer engine. Time-stretch variants are produced\non-demand when an assignment requests them - no global startup cost.\n\n### Attack-accurate onset detection\n\nStandard spectral onset detection (as used by librosa and most audio analysis\ntools) identifies the frame where spectral energy changes most rapidly - the\npeak of the onset strength envelope. For percussive sounds this peak typically\nlags the actual attack by 10-30 ms, which is enough to make beat-quantized\nhits sound noticeably off the grid.\n\nSubsample refines each detected onset to sample-accurate precision using a\ntwo-stage approach:\n\n1. **Coarse detection** - librosa's onset detector finds approximate positions\n   at frame resolution (~11.6 ms at 44100 Hz / hop 512).\n2. **Attack refinement** - for each onset, a short-window amplitude envelope\n   (32 samples, ~0.7 ms) is searched backward to find the inter-hit valley\n   (quietest point between consecutive transients), then forward to find where\n   energy first rises above 20% of the local peak. This threshold crossing is\n   the perceptual attack start - the moment a musician would tap along.\n\nThe search is bounded by the midpoint to the previous onset (preventing bleed\ninto the prior hit's tail) and a maximum of 50 ms (the physical upper bound on\nSTFT detection lag). The result is stored as `attack_times` in the analysis\nsidecar alongside the original `onset_times`, giving the time-stretch handler\nprecise alignment points without sacrificing the coarse onsets that other\nsubsystems rely on.\n\nAll DSP runs at the sample's native rate so filters and nonlinear processors\n(distortion, saturation) operate at full resolution. 
The final downsample to\nthe output device rate uses very-high-quality conversion (soxr_vhq) whose\nanti-alias filter catches any above-Nyquist content generated by the\nprocessing chain. The playback path never pays a conversion cost at trigger\ntime.\n\n## Quick start\n\n```bash\n# Install system dependencies (PortAudio + Rubber Band)\n# Debian/Ubuntu:\nsudo apt install portaudio19-dev rubberband-cli\n# Fedora/RHEL:\nsudo dnf install portaudio-devel rubberband\n# macOS:\nbrew install portaudio rubberband\n\n# Clone and install\ngit clone https://github.com/simonholliday/subsample.git\ncd subsample\npip install -e .\n\n# Run with built-in defaults (no config file needed)\nsubsample\n\n# Or process audio files through the detection pipeline\nsubsample recording.wav                # Single file\nsubsample ./recordings/*.wav           # Multiple files (glob expansion)\n```\n\nSubsample works out of the box with sensible defaults from `config.yaml.default`.\nTo customise, create a `config.yaml` containing only the settings you want to\noverride - everything else is inherited automatically. See\n[Configuration](#configuration) for details.\n\n**Live capture mode:** Subsample lists available audio input devices and lets you\nchoose one (or auto-selects if only one is present). It calibrates ambient noise\nfor a few seconds before listening for events.\n\n**File input mode:** Each file is processed at its native sample rate, bit depth,\nand channel count. Detected segments are saved to the output directory.\n\n## Configuration\n\nSubsample always loads `config.yaml.default` as the base, then deep-merges\nyour `config.yaml` on top. 
Your config only needs the settings you want to\nchange - everything else is inherited from the defaults automatically.\n\nThe most common overrides:\n\n- **First run:** set `recorder.audio.device` (your microphone) and `output.directory`\n- **For MIDI playback:** set `player.enabled: true`, `player.midi_device` or `player.virtual_midi_port`, and `player.audio.device`\n- **If you hear clipping:** raise `player.max_polyphony`; the `limiter_threshold_db` and `limiter_ceiling_db` defaults protect against distortion automatically\n- **If recordings miss quiet sounds or trigger on noise:** tune `detection.snr_threshold_db`\n\nEverything else - chunk sizes, buffer lengths, transform settings, similarity\nweights - is optional and rarely needs changing.\n\n| Setting | Default | Description |\n|---|---|---|\n| `max_memory_mb` | auto | Total cache memory budget. Auto-detect: min(25% of system RAM, 1024 MB). Split: 60% instruments, 35% transforms, 5% carrier |\n| `recorder.enabled` | `true` | Enable live audio capture; set to `false` to process files only |\n| `recorder.audio.device` | `none` | Audio input device name (substring match); if unset, auto-select or prompt |\n| `recorder.audio.sample_rate` | `44100` | Sample rate in Hz |\n| `recorder.audio.bit_depth` | `16` | Bit depth (16, 24, or 32) |\n| `recorder.audio.channels` | auto | 1 = mono, 2 = stereo. Omit (or `null`) to auto-detect from device |\n| `recorder.audio.input` | `null` | Physical input channels (1-indexed list). `[3, 4]` records from inputs 3-4 |\n| `recorder.audio.chunk_size` | `512` | Frames per buffer read |\n| `recorder.audio.audio_format` | `wav` | Output container: `wav` (uncompressed, 16/24/32-bit) or `flac` (lossless compressed, ~40-60% smaller, 16/24-bit). 
See [Storage format](#storage-format) for behaviour around mixed bit depths |\n| `recorder.previews` | `true` | Emit a `.preview.png` thumbnail sidecar (1024x256, ~15-25 KB) and embed a compact `preview` data block in `.analysis.json` so the Supervisor dashboard can render a scalable SVG on demand. See [Sample previews](#sample-previews) |\n| `recorder.buffer.max_seconds` | `60` | Circular buffer length |\n| `player.enabled` | `false` | Enable the MIDI player |\n| `player.midi_map` | `none` | Path to MIDI routing map YAML; required for player. Use `midi-map-gm-drums.yaml` for a complete GM kit |\n| `player.max_polyphony` | `8` | Max simultaneous voices; per-voice gain = 1/max\\_polyphony. Raise if clipping; lower for louder individual voices |\n| `player.limiter_threshold_db` | `-1.5` | Safety limiter threshold (dBFS); signals below this pass untouched |\n| `player.limiter_ceiling_db` | `-0.1` | Maximum output level (dBFS) the limiter allows; must exceed threshold |\n| `player.midi_device` | `none` | MIDI input device name (substring match); if unset, auto-select or prompt |\n| `player.audio.device` | `none` | Audio output device name for playback |\n| `player.audio.sample_rate` | auto | Output sample rate; defaults to recorder rate. Do not set higher than source. |\n| `player.audio.bit_depth` | auto | Output bit depth (16, 24, or 32); defaults to recorder bit depth |\n| `player.audio.channels` | `null` | Output channels (2=stereo, 6=5.1, 8=7.1); null defaults to stereo. 
SMPTE ordering |\n| `player.virtual_midi_port` | `none` | Name for a virtual MIDI input port; overrides `player.midi_device` |\n| `player.watch_midi_map` | `false` | Monitor the `midi_map` file for changes and reload assignments on save (see Live-coding) |\n| `detection.snr_threshold_db` | `12.0` | dB above ambient to trigger recording |\n| `detection.hold_time` | `0.5` | Seconds to hold recording open after signal drops |\n| `detection.warmup_seconds` | `1.0` | Calibration period before detection activates |\n| `detection.ema_alpha` | `0.1` | Ambient noise adaptation speed (lower = slower) |\n| `detection.trim_pre_samples` | `10` | Samples to keep before signal onset (S-curve fade applied) |\n| `detection.trim_post_samples` | `90` | Samples to keep after signal end (S-curve fade applied) |\n| `output.directory` | `./samples/captures` | Where WAV files are saved |\n| `output.filename_format` | `%Y-%m-%d_%H-%M-%S-%3f` | strftime format for filenames (`%3f` = 3-digit milliseconds) |\n| `analysis.start_bpm` | `120.0` | Tempo prior for beat detection (BPM) |\n| `analysis.tempo_min` | `30.0` | Minimum tempo considered by pulse detector (BPM) |\n| `analysis.tempo_max` | `300.0` | Maximum tempo considered by pulse detector (BPM) |\n| `instrument.max_memory_mb` | auto | Max audio memory for in-memory samples; overrides global split. 
Oldest evicted (FIFO) |\n| `instrument.directory` | `samples/captures` | Directory of instrument samples to load at startup (overridden by `banks:` in the MIDI map when present) |\n| `instrument.clean_orphaned_sidecars` | `true` | Auto-delete `.analysis.json` sidecars whose audio file has been deleted |\n| `instrument.watch` | `false` | Monitor `instrument.directory` (or each bank directory) at runtime for new audio files from any source - another Subsample instance, a DAW, or any application that writes audio (see Watching for new samples) |\n| `similarity.weight_spectral` | `1.0` | Weight for the spectral shape group (14 metrics) |\n| `similarity.weight_timbre` | `1.0` | Weight for sustained MFCC timbre (coefficients 1-12) |\n| `similarity.weight_timbre_delta` | `0.5` | Weight for delta-MFCC timbre trajectory |\n| `similarity.weight_timbre_onset` | `1.0` | Weight for onset-weighted MFCC attack character |\n| `similarity.weight_band_energy` | `1.0` | Weight for the band energy group (4 per-band energy fractions + 4 decay rates) |\n| `transform.max_memory_mb` | auto | Memory budget (MB) for transform variants; overrides global split |\n| `transform.auto_pitch` | `true` | Pre-compute pitch variants for every MIDI note in the assigned range. Requires `rubberband-cli`. Disable if rubberband is unavailable or you prefer on-the-fly rendering (pitch still works, higher CPU at trigger time) |\n| `transform.target_bpm` | `0.0` | Target BPM for automatic time-stretch variants; 0.0 disables. When \u003e 0, qualifying samples (detected tempo + enough onsets) are beat-quantized to the target tempo |\n| `transform.quantize_resolution` | `16` | Grid subdivision for time-stretch onset alignment: 1 (whole), 2 (half), 4 (quarter), 8 (eighth), 16 (sixteenth) |\n| `transform.variant_cache_dir` | `samples/variant-cache` | Directory for persistent disk cache of transform variants. 
Empty string or null disables |\n| `transform.max_disk_mb` | auto | Max disk space (MB) for cached variant files; defaults to 3x memory budget. 0 disables |\n| `supervisor.enabled` | `false` | Enable the Supervisor web dashboard (broadcasts state via WebSocket for live monitoring). Requires `pip install subsample[supervisor]` |\n| `supervisor.port` | `9003` | WebSocket port the Supervisor server listens on |\n| `osc.enabled` | `false` | Enable OSC integration (send sample events, optionally receive import requests). Requires `pip install subsample[osc]` |\n| `osc.send_host` | `127.0.0.1` | Destination host for outgoing `/sample/captured` and `/sample/loaded` messages |\n| `osc.send_port` | `9000` | Destination UDP port for outgoing OSC messages |\n| `osc.receive_enabled` | `false` | Listen for `/sample/import` messages to load audio files into the in-memory library from other apps (reads in place, does not copy) |\n| `osc.receive_port` | `9002` | UDP port the OSC receiver listens on |\n| `recorder.audio.ambisonic_format` | `null` | Enable ambisonic capture. One of `a_nt_sf1`, `a_generic`, `b_fuma`, `b_ambix`; requires `channels: 4`. Converts capture to canonical AmbiX B-format on disk (see [Ambisonic capture](#ambisonic-capture)) |\n| `ambisonic.decoder` | `basic` | Decoder weight mode: `basic` (flat velocity), `max_re` (tighter lobes, best HF), or `inphase` (softest lobes, no back-lobes) |\n| `ambisonic.yaw_degrees` | `0.0` | Yaw rotation (degrees) applied to the B-format signal before decoding |\n| `ambisonic.pitch_degrees` | `0.0` | Pitch rotation (degrees) applied to the B-format signal before decoding |\n| `ambisonic.roll_degrees` | `0.0` | Roll rotation (degrees) applied to the B-format signal before decoding |\n| `ambisonic.max_order` | `1` | Reserved for future higher-order support; currently must be 1 |\n\n## Output\n\nRecordings are saved as 16, 24, or 32-bit audio files (depending on\n`recorder.audio.bit_depth`) in the configured output directory.  
Container\nformat is controlled by `recorder.audio.audio_format` — `wav` (uncompressed,\nthe default) or `flac` (lossless compressed, see [Storage format](#storage-format)\nbelow).\n\n**Live capture mode** - filenames from the datetime the recording ended:\n\n```\nsamples/\n  2026-03-17_14-32-01-472.wav\n  2026-03-17_14-35-44-091.wav\n```\n\n**File input mode** - filenames from the original audio file's stem plus a\nsegment index:\n\n```\nsamples/\n  field_recording_1.wav\n  field_recording_2.wav\n```\n\nBoth modes write to the same output directory. Point `instrument.directory` at\nthe same path to get a persistent library that grows on disk across sessions.\n\n### Storage format\n\n`recorder.audio.audio_format` decides whether new captures land as `.wav` or\n`.flac`:\n\n- `wav` (default) - uncompressed PCM.  Works at 16, 24, or 32-bit.\n- `flac` - lossless compressed (around 40-60% smaller on typical material,\n  decoded audio is bit-identical).  Works at 16 or 24-bit.\n\n**The rule when formats don't quite line up:**\n\n| Capture scenario | Extension written |\n|---|---|\n| `audio_format: wav`, any bit depth | `.wav` |\n| `audio_format: flac`, live capture at 16 or 24-bit | `.flac` |\n| `audio_format: flac`, 32-bit source (e.g. imported file) | `.wav` for that file, with an INFO log explaining why |\n| `audio_format: flac` combined with `bit_depth: 32` (live capture) | Rejected at startup — set one or the other |\n\nSo if you flip `audio_format: flac` and then process a mix of 16/24-bit and\n32-bit source files, you'll see a mix of `.flac` and `.wav` in your output\ndirectory.  This is correct behaviour: the 32-bit fallback preserves full\nprecision rather than silently truncating.  Live captures share one bit\ndepth per session, so they stay consistent within a run.\n\n**Existing libraries.** Upgrading to a subsample build with FLAC support does\nnot touch your existing `.wav` samples — they continue to load unchanged.\nNo migration, no bulk conversion.  
FLAC only affects what gets written for\n*new* captures once you flip the flag.\n\n### Sample previews\n\nWhen `recorder.previews: true` (the default), every captured or imported sample\nalso produces two visual-preview artefacts alongside the audio and analysis\nsidecar:\n\n- **`\u003csample\u003e.preview.png`** — a fixed 1024x256 raster thumbnail (RGB, around\n  15-25 KB) for browsing the library in an OS file manager.  The composition\n  layers a 4-band frequency skyline behind a mirrored waveform envelope, with\n  short vertical ticks at each detected onset and (when the sample is\n  rhythmic) a dashed beat grid.  Stratum heights scale with each band's\n  share of total energy (same four bands as `band_energy.energy_fractions`),\n  so a bass-heavy kick looks bottom-heavy at a glance and a cymbal\n  looks top-heavy — every band keeps at least a small minimum height\n  so its temporal shape stays readable.  A bottom-right badge shows\n  pitch (when tonal), BPM (when rhythmic), and duration.\n- **A `preview` block embedded in `\u003csample\u003e.analysis.json`** (around 4 KB) —\n  the same composition's inputs (envelopes, band strata, onset/beat times,\n  accent colour, badge text) in a compact form.  The Supervisor dashboard\n  calls `subsample.preview.render_svg(data, width, height)` at request time\n  to materialise a scalable vector preview at whatever size the layout wants.\n\n\u003e File managers on macOS, Windows, and Linux do **not** treat sibling PNGs as\n\u003e the audio file's own icon — the `.preview.png` appears as a separate file\n\u003e in the directory listing.  This is deliberate: embedding cover art would\n\u003e mutate the audio container, which subsample does not do.  
Browse the\n\u003e previews alongside the audio files, or use the Supervisor dashboard for\n\u003e in-browser thumbnails.\n\nVisual design (stroke weights, colours, layout) can be iterated later without\nany schema bump — the `preview` block stores the underlying data, not the\nrendered output.  Only a change in envelope resolution or band count\nrequires a `preview.version` bump.  Existing samples with no `preview`\nblock simply render nothing in the dashboard; they continue to play back\nand analyse identically.\n\nSet `recorder.previews: false` to skip both artefacts and save around\n15-25 KB per PNG plus 4 KB of JSON per sample.\n\n## Instrument sample library\n\nEvery recording is automatically added to an in-memory instrument library\nalongside its full analysis data. A configurable memory cap prevents unbounded\ngrowth; the oldest samples are evicted when a new one would exceed the limit.\nThe budget is auto-detected by default (60% of the global memory allocation -\nsee `max_memory_mb` in the configuration table) and can be overridden via\n`instrument.max_memory_mb`. WAV files on disk are never deleted.\n\n### Persistent library across sessions\n\n```yaml\noutput:\n  directory: samples/captures\n\ninstrument:\n  directory: samples/captures\n```\n\nOn startup, Subsample pre-loads all existing WAV files from `./samples/captures`.\nAs new recordings arrive they are written to disk and added to memory in one step.\nThe memory cap keeps only the most recent window of captures in RAM; the full\narchive on disk is unaffected.\n\n### Watching for new samples\n\nSet `instrument.watch: true` to monitor the instrument directory for new audio\nfiles at runtime and load them without restarting. The watcher detects audio\nfiles from any source - another Subsample instance, a DAW, an SDR recorder, a\nscript, or any other application that writes audio to the watched directory.\n\nTwo detection paths run in parallel:\n\n1. 
**Sidecar path** - watches for `.analysis.json` sidecar files (fastest).\n   When a sidecar appears, its corresponding audio file is loaded immediately\n   without re-analysing. This is the path taken when the source is another\n   Subsample instance, which always writes the WAV first and the sidecar second.\n\n2. **Audio file path** - watches for audio files (`.wav`, `.flac`, `.aiff`,\n   `.ogg`, `.mp3`) from any source. After a short grace period to see if a\n   sidecar follows (in case the source is Subsample), checks that the file is\n   no longer being written (file-size stability), runs the full analysis\n   pipeline, writes a sidecar, and loads the sample.\n\nThe audio file path handles the common case where another application writes\nan audio file without any sidecar. The file-size stability check ensures that\nlong recordings still being written are not loaded prematurely - the watcher\nwaits until the file size stops changing before attempting to read it.\n\nSupported audio formats: WAV, FLAC, AIFF, OGG, MP3/MPEG (anything supported\nby libsndfile).\n\n### Multi-machine setup (remote recorder + player)\n\nSubsample can be split across two machines: one captures and analyses audio, the\nother plays it back via MIDI. The two machines share a directory (network drive,\nDropbox, or any folder sync tool). 
The recorder writes samples there; the player\nwatches the same directory and loads new samples as they arrive - no restart\nrequired.\n\nThis separation is useful when the recording and playback environments are\ndifferent: a field recorder capturing environmental sound in one location, a\nperformance machine somewhere else; a backstage capture machine feeding a\nfront-of-house playback rig; or simply keeping CPU-intensive audio analysis on a\ndedicated host.\n\n**Recorder machine** (`config.yaml`):\n```yaml\nrecorder:\n  enabled: true\n\nplayer:\n  enabled: false\n\noutput:\n  directory: \"/mnt/shared/samples\"\n```\n\n**Player machine** (`config.yaml`):\n```yaml\nrecorder:\n  enabled: false\n\nplayer:\n  enabled: true\n\ninstrument:\n  directory: \"/mnt/shared/samples\"\n  watch: true\n```\n\nThe recorder writes each detected sample as a WAV file plus an `.analysis.json`\nsidecar containing the pre-computed feature data. The player monitors the shared\ndirectory for new sidecar files; when one arrives, it loads the sample pair\ndirectly without re-analysing. The sidecar's arrival is used as the ready signal\nbecause the recorder always writes the WAV first - a sidecar appearing means both\nfiles are present and complete.\n\nAudio files from non-Subsample sources (no sidecar) are also detected\nautomatically. After a brief grace period, the player analyses the file, writes\na sidecar for next time, and loads the sample into memory.\n\nNew samples become available for MIDI playback within a second or two of the\nsidecar landing on disk (a short debounce window to accommodate network sync\ntools), or within about 10 seconds for audio files without sidecars (debounce +\ngrace period + analysis). If the WAV has not yet arrived or is still being\nwritten, the player retries automatically.\n\n## Live-coding the MIDI map\n\nYou can edit the MIDI routing map while the player is running and have changes\ntake effect immediately - no restart required. 
Set `player.watch_midi_map: true`\nand point `player.midi_map` at your working copy:\n\n```yaml\nplayer:\n  enabled: true\n  midi_map: midi-map.yaml\n  watch_midi_map: true\n```\n\nWhen you save the file, Subsample re-parses it and swaps the active note map\nwithin about half a second. If the YAML has a syntax error, the current map is\nkept and a warning is logged - playback is never interrupted. Rapid saves from\ntext editors are debounced into a single reload.\n\n## Reference sample library\n\nReference samples define the canonical sound classes you want to match against -\nkick drum, snare, hi-hat, etc. Each reference is represented by its\n`.analysis.json` sidecar file alongside the original audio. References are\ndeclared as path-based `where: { reference: ... }` predicates in the MIDI map:\n\n```yaml\n- name: Bass Drum\n  notes: 36\n  select:\n    where:\n      reference: samples/reference/GM36_BassDrum1.wav\n```\n\nDuring player startup, each path-based reference is loaded from its sidecar and\nadded to the similarity matrix. If a WAV file exists but its `.analysis.json`\nsidecar is missing, Subsample generates it automatically - you can point at any\nWAV file as a reference without pre-processing. For every instrument sample,\nSubsample computes cosine similarity against every reference and maintains a\nranked list per reference - most similar instrument first. 
When a sample is\nevicted from the instrument\nlibrary, it is also removed from the ranked lists.\n\nQuery the ranked lists programmatically:\n\n```python\n# Most kick-like instrument in memory\nsample_id = similarity_matrix.get_match(\"GM36_BassDrum1\", rank=0)\n\n# Second-most kick-like (for a separate kick_2 mapping)\nsample_id = similarity_matrix.get_match(\"GM36_BassDrum1\", rank=1)\n```\n\nLookup is case-insensitive.\n\n## Virtual MIDI\n\nSet `player.virtual_midi_port: \"Subsample Virtual MIDI\"` to create a named\nvirtual MIDI input port at startup instead of connecting to a hardware device.\nThis is the primary way to drive Subsample from another application running on\nthe same machine - for example, a Python sequencer such as\n[Subsequence](https://github.com/simonholliday/subsequence) can send a drum\npattern directly to Subsample's virtual port without any physical MIDI hardware.\nFrom the sequencer's side, Subsample's port appears as a MIDI output destination\nwhile Subsample is running. Overrides `player.midi_device`.\n\n\u003e **Performance note:** running a MIDI sequencer and Subsample simultaneously on\n\u003e the same machine means two real-time workloads compete for CPU and I/O. This\n\u003e works well on a modern multi-core machine but may cause xruns or timing drift\n\u003e on lower-powered hardware. If you experience dropouts, reduce\n\u003e `recorder.audio.chunk_size`, lower the sequencer's buffer size, or disable the\n\u003e recorder (`recorder.enabled: false`) to run Subsample in playback-only mode.\n\n## OSC integration\n\nSubsample can send and receive [OSC (Open Sound Control)](https://opensoundcontrol.stanford.edu/)\nmessages, so it can talk to sequencers, visualisers, custom scripts, or any\nother OSC-compatible application. 
OSC support is an optional extra: install it\nwith\n\n```bash\npip install -e \".[osc]\"\n```\n\nthen enable it in `config.yaml`:\n\n```yaml\nosc:\n  enabled: true\n  send_host: \"127.0.0.1\"\n  send_port: 9000\n  receive_enabled: true\n  receive_port: 9002\n```\n\n### Outgoing messages\n\nWhen `osc.enabled` is true, Subsample sends two events:\n\n| Address | When | Arguments |\n|---|---|---|\n| `/sample/captured` | A new live recording has been analysed | `filepath:str, duration:float, pitch_hz:float, pitch_class:int, tempo_bpm:float, onset_count:int` |\n| `/sample/loaded` | A sample has been added to the instrument library (live capture, hot-load, or OSC import) | `name:str, duration:float, pitch_hz:float, pitch_class:int` |\n\n`pitch_class` is `0..11` for tonal samples (C=0, C#=1, ..., B=11) or `-1`\nwhen no stable pitch is detected. `pitch_hz` is `0.0` when unpitched.\n\n### Incoming messages\n\nWhen `osc.receive_enabled` is also true, Subsample listens on\n`osc.receive_port` for one address:\n\n| Address | Effect | Arguments |\n|---|---|---|\n| `/sample/import` | Read the file at the given path, analyse it, and load it into the in-memory instrument library for immediate playback. The file is read in place - it is not copied or moved. The sample is available until the next restart; for persistence, place the file in `instrument.directory` instead (or as well). 
| `file_path:str` |\n\nThis is more targeted than the directory watcher and lets external applications\nload specific files into the library on demand - for example, a radio scanner\nor bird detector that wants its captures to become MIDI-playable instruments.\n\n### Use cases\n\n- **Drive a sequencer from incoming sounds.** A Subsequence pattern can react\n  when Subsample captures a snare-like sound, triggering a fill or changing\n  density.\n- **Visualise the library in real time.** A TouchDesigner or Processing patch\n  subscribed to `/sample/loaded` can show new samples as they arrive, mapped\n  by pitch, tempo, or duration.\n- **Cross-app sample handoff.** Any other tool that produces audio files can\n  push them into Subsample with a single `/sample/import` message - no shared\n  filesystem watching required.\n\n## Works with Subsequence\n\n[Subsequence](https://github.com/simonholliday/subsequence) is a sister\nproject: a generative MIDI sequencer and algorithmic composition engine for\nPython with rock-solid timing (typical pulse jitter \u003c 5 μs on Linux).\nTogether, they form part of a fully open-source generative sampler\nworkstation - Subsequence drives the patterns, Subsample provides the\nsounds.\n\nThe two communicate over standard MIDI. 
The simplest setup is to give\nSubsample a named [virtual MIDI port](#virtual-midi) and have Subsequence send\nto it - no hardware MIDI cabling required, no audio routing on the host:\n\n```yaml\n# config.yaml\nplayer:\n  enabled: true\n  virtual_midi_port: \"Subsample Virtual MIDI\"\n```\n\nFrom the sequencer side, Subsample's port appears as a MIDI output\ndestination while Subsample is running - Subsequence connects with\n`composition.midi_output(\"Subsample Virtual MIDI\")`, no special configuration\nneeded.\n\nFor richer integration, enable [OSC](#osc-integration) on both sides.\nSubsample will forward `/sample/captured` and `/sample/loaded` events to a\nSubsequence OSC listener, so a pattern can respond musically to incoming\nsamples - trigger a fill when a snare-like sound arrives, raise pattern\ndensity when a busy loop is captured, or update a visualiser. Subsequence can\nalso send `/sample/import` messages back to Subsample to push specific files\ninto its library.\n\nEach project is independently useful and has no dependency on the other.\n\n## Scripts\n\n### Analyzing recorded files\n\n```bash\npython scripts/analyze_file.py samples/2026-03-17_14-32-01.wav\n```\n\nOutput:\n```\nrhythm:   tempo=120.2bpm  beats=4  pulses=12  onsets=4\nspectral: duration=2.00s  flatness=0.001  attack=0.000  release=0.812  centroid=0.018  bandwidth=0.001  zcr=0.120  harmonic=0.821  contrast=0.310  voiced=0.940  log_attack=0.000  flux=0.312  rolloff=0.451  slope=0.023\npitch:    pitch=440.0Hz  chroma=A  pitch_conf=0.89  stability=0.120st  voiced_frames=86\nlevel:    peak -1.2dBFS  rms -12.6dBFS  crest 11.4dB  floor -42.3dBFS\n```\n\nSpectral metrics (all [0, 1]):\n- **flatness** - 0 = tonal, 1 = noisy\n- **attack** - 0 = instant/percussive, 1 = gradual build-up\n- **release** - 0 = sudden stop, 1 = long decay tail\n- **centroid** - 0 = bassy, 1 = trebly\n- **bandwidth** - 0 = pure tone, 1 = spectrally complex\n- **zcr** - zero crossing rate: 0 = smooth, 1 = maximally 
noisy\n- **harmonic** - 0 = percussive, 1 = harmonic/tonal (HPSS)\n- **contrast** - 0 = flat spectrum, 1 = strong spectral peaks\n- **voiced** - fraction of frames with detected pitch\n- **log_attack** - 0 = instant spectral onset, 1 = very slow\n- **flux** - 0 = static spectrum, 1 = rapidly evolving\n- **rolloff** - 0 = energy concentrated low, 1 = energy extends to Nyquist\n- **slope** - 0 = flat spectrum, 1 = steeply tilted\n\nPitch data (raw values):\n- **pitch** - dominant fundamental frequency in Hz, or \"none\" for unpitched audio\n- **chroma** - dominant pitch class (C-B), or \"none\"\n- **pitch_conf** - pyin confidence [0, 1]; use with `voiced` to judge reliability\n- **stability** - pitch stability in semitones; lower = more stable\n- **voiced_frames** - number of frames with detected pitch\n\nAmplitude metadata:\n- **peak** - peak level in dBFS\n- **rms** - RMS loudness in dBFS; drives playback gain normalisation\n- **crest** - crest factor (peak-to-RMS ratio) in dB\n- **floor** - noise floor in dBFS (shown when detectable)\n\nThree MFCC timbre fingerprints are stored in the sidecar (used for similarity,\nnot shown in script output): `mfcc` (mean, average timbre), `mfcc_delta`\n(first-order trajectory), and `mfcc_onset` (onset-weighted, attack emphasis).\n\n### Importing pre-trimmed samples\n\nImport audio files from any source (SDR captures, commercial sample packs, field\nrecordings) directly into the capture library, bypassing the detection pipeline.\nFiles are silence-trimmed, safety-faded, re-encoded as standard PCM WAV, fully\nanalyzed, and saved with sidecar JSON.\n\n```bash\npython scripts/import_samples.py /path/to/samples/*.wav\npython scripts/import_samples.py --to samples/captures /path/to/sample-pack/*.wav\npython scripts/import_samples.py --force \"/mnt/sdr/audio/2026-01-15/*.wav\"\n```\n\n- `--to DIR` - target directory (default: `output.directory` from config.yaml)\n- `--force` - overwrite existing files in target directory\n\nHandles 
WAV, BWF (Broadcast Wave Format), FLAC, AIFF, OGG, and any other format\nsupported by libsndfile. BWF and non-WAV sources are re-encoded as standard PCM WAV\nso the rest of the pipeline can load them reliably.\n\n### Similarity report\n\n```bash\npython scripts/similarity_report.py           # top 5 per reference (default)\npython scripts/similarity_report.py --top 10  # top 10 per reference\n```\n\nExample output:\n```\nReference: GM36_BassDrum1\n  1.  #5     0.9412  GM36_BassDrum1  ./samples/kick_deep.wav\n  2.  #7     0.8134  kick_hard       ./samples/kick_hard.wav\n  3.  #8     0.7601  kick_soft       ./samples/kick_soft.wav\n```\n\n### Extracting GM percussion references\n\nRender all 47 General MIDI percussion instruments from a SoundFont file into the\nreference sample directory. Requires `fluidsynth` CLI tool and a GM SoundFont.\n\n```bash\npython scripts/extract_gm_drums.py /path/to/FluidR3_GM.sf2\npython scripts/extract_gm_drums.py /path/to/FluidR3_GM.sf2 --output samples/reference/\n```\n\nProduces WAV files + analysis sidecars. Only the sidecars are committed to the\nrepository; audio files are local-only and .gitignored.\n\n## Roadmap\n\n### MIDI expressiveness\n\n- **Mute groups** - notes in a named group silence each other when triggered.\n  Classic use: closed hi-hat silences open hi-hat.\n- **Per-trigger sample variation** - cycle through alternative samples on\n  repeated triggers to avoid the machine-gun effect on rapid notes. 
(Distinct\n  from the existing per-hit segment round-robin, which cycles through detected\n  hits inside a single sliced loop.)\n\n### Playback and sound design\n\n- **Loop playback** - sustain loops for pads, drones, and textures that play\n  continuously while a key is held.\n\n### Sample management\n\n- **Auto-slicing** - chop loops and long recordings into individual hits by\n  transient detection, then add each slice to the library as a separate sample.\n- **Similar-to-this query** - \"find more sounds like this one\" by exposing the\n  similarity engine as a user-facing search.\n\n### Additional select/process features\n\n- **Random selection** - `order_by: random` to pick a different sample on each\n  trigger.\n\n### Monitoring\n\n- **Web dashboard** - a lightweight local web UI showing active bank, loaded\n  samples, CC state, voice activity, and transform queue progress. Read-only\n  visibility into what the engine is doing, without requiring a terminal.\n\n## Architecture\n\nSubsample is built around three concurrent pipelines that interact through\nthread-safe shared state.\n\n### Live capture pipeline\n\n```\nPortAudio callback → raw PCM bytes → unpack_audio() → CircularBuffer\n                                                               ↓\n                                              LevelDetector.process_chunk()\n                                              (EMA ambient tracking + SNR gate)\n                                                               ↓\n                                              trim_silence() → segment PCM\n                                                               ↓\n                                              SampleProcessor worker pool\n                                              (auto-scaled: (cpu_count - 2) / 2)\n                                                               ↓\n                           to_mono_float() → analyze_all() → WAV + sidecar + SampleRecord\n```\n\nThe input thread is never blocked 
waiting for analysis. Back-to-back sounds are\ncaptured reliably even when analysis is slow - worker threads handle each\nrecording concurrently and independently.\n\n### Similarity engine\n\nEvery new instrument sample is scored against every reference using cosine\nsimilarity on a 58-element composite vector. The vector is split into five\ngroups, each independently L2-normalised so that no single group dominates by\nscale:\n\n```\nGroup 1 (x14): spectral shape   [flatness, attack, release, centroid, bandwidth, zcr,\n                                  harmonic, contrast, voiced, log_attack, flux,\n                                  spectral_rolloff, spectral_slope, crest_factor]\nGroup 2 (x12): sustained MFCC   [mean timbre, coefficients 1-12]\nGroup 3 (x12): delta-MFCC       [timbre trajectory, coefficients 1-12]\nGroup 4 (x12): onset-weighted   [attack character, coefficients 1-12]\nGroup 5 (x8):  band energy      [sub-bass/low-mid/high-mid/presence fractions + decay rates]\n```\n\nEach group is scaled by a configurable weight (`similarity.weight_*`). This\ndesign means the same comparison method works for both percussive (attack\ncharacter dominates) and tonal (sustained timbre dominates) sounds without\nneeding to classify them first.\n\n### Transform pipeline\n\n```\nSampleRecord added to library\n    → TransformManager.on_sample_added()\n        → enqueue base variant (always)             ← float32 peak-normalised copy\n        → enqueue pitch variants (tonal only)       ← Rubber Band offline finer engine\n        → enqueue time-stretch (if BPM set + enough onsets) ← beat-quantized timemap_stretch\n            → TransformProcessor worker pool\n                → TransformCache (parent-priority FIFO eviction, 50 MB default)\n```\n\nThe base variant (identity spec: no DSP) is produced for every sample -\npercussive and tonal alike - so the playback path never pays the float32\nconversion cost at trigger time. 
Pitch and time-stretch variants are additional\ncache entries, derived from the same PCM source.\n\nWhen a variant set for a parent sample would exceed the memory budget, the entire\noldest parent's variant family is evicted together, keeping the remaining\nfamilies intact and playable.\n\n### Playback path\n\n```\nMIDI note_on\n    → query engine: filter → order → pick → sample_id\n        (fallback: try each select spec in order)\n    → transform_manager.get_pitched()  → pitch variant (repitch assignments)\n    → transform_manager.get_at_bpm()   → time-stretch variant (stretch_quantize assignments)\n    → transform_manager.get_base()     → base variant (all samples)\n    → _render()                        → on-the-fly fallback (first trigger only)\n    → _render_float(): apply gain · velocity² · anti-clip ceiling\n    → append _Voice (float32 stereo, pre-rendered)\n    ↓\nPyAudio callback (PortAudio high-priority thread)\n    → sum all active voices (float32 addition)\n    → clip to [-1, 1]\n    → float32_to_pcm_bytes(mixed, output_bit_depth)  → int16/24/32 bytes to hardware\n```\n\nAll mixing happens in float32; precision-sensitive DSP (IIR filters, envelope\nfollowers) promotes to float64 internally. The only integer conversion is the\nfinal output packing. Multiple simultaneous voices are summed correctly regardless of the\noutput device's bit depth.\n\n## Requirements\n\n- Python 3.12+\n- PortAudio (required by PyAudio - `apt install portaudio19-dev`, `dnf install portaudio-devel`, or `brew install portaudio`)\n- Rubber Band (required by pyrubberband - `apt install rubberband-cli`, `dnf install rubberband`, or `brew install rubberband`)\n\n**Windows users:** install and run Subsample inside [WSL2](https://learn.microsoft.com/en-us/windows/wsl/install)\n(Windows Subsystem for Linux). This gives you a real Linux environment where\nthe `apt` instructions above just work. 
Audio devices need to be exposed to\nWSL - see the [WSL audio guide](https://learn.microsoft.com/en-us/windows/wsl/connect-usb)\nfor USB passthrough or use a network audio bridge if your interface supports\none. Subsample is not currently tested against native Windows Python.\n\n## Tests\n\n```bash\npip install -e \".[dev]\"\npytest\n```\n\n## Type Checking\n\n```bash\npip install -e \".[dev]\"\nmypy subsample\n```\n\n## Dependencies and Credits\n\nSubsample makes use of these excellent open-source libraries:\n\n| Library | Purpose | License |\n|---------|---------|---------|\n| [PyAudio ↗](https://people.csail.mit.edu/hubert/pyaudio/) | Audio device I/O (PortAudio bindings) | MIT |\n| [PyYAML ↗](https://github.com/yaml/pyyaml) | YAML config loading | MIT |\n| [NumPy ↗](https://numpy.org/) | Numerical array operations | BSD-3-Clause |\n| [librosa ↗](https://librosa.org/) | Audio analysis (spectral, rhythm, pitch) | ISC |\n| [SciPy ↗](https://scipy.org/) | Signal processing (onset detection, filtering) | BSD-3-Clause |\n| [SoundFile ↗](https://python-soundfile.readthedocs.io/) | WAV file reading for library pre-load | BSD-3-Clause |\n| [mido ↗](https://github.com/mido/mido) | MIDI message parsing and I/O | MIT |\n| [python-rtmidi ↗](https://github.com/SpotlightKid/python-rtmidi) | MIDI device access (RtMidi bindings) | MIT |\n| [pyrubberband ↗](https://github.com/bmcfee/pyrubberband) | Pitch shifting and time-stretching (Rubber Band wrapper) | ISC |\n| [watchdog ↗](https://github.com/gorakhargosh/watchdog) | Filesystem monitoring for multi-machine sample hot-loading | Apache-2.0 |\n| [PyMidiDefs ↗](https://github.com/simonholliday/PyMidiDefs) | MIDI constant definitions (notes, CC, drums, GM) | MIT |\n\n### Academic references\n\nThe compressor/limiter DSP is based on the feed-forward design described in:\n\n\u003e D. Giannoulis, M. Massberg, and J. D. 
Reiss, \"Digital Dynamic Range Compressor Design - A Tutorial and Analysis,\" *Journal of the Audio Engineering Society*, vol. 60, no. 6, pp. 399-408, 2012.\n\n## About the Author\n\nSubsample was created by me, Simon Holliday ([simonholliday.com ↗](https://simonholliday.com/)), a senior technologist and a junior (but trying) musician. From running an electronic music label in the 2000s to prototyping new passive SONAR techniques for defence research, my work has often explored the intersection of code and sound.\n\n## License\n\nSubsample is released under the [GNU Affero General Public License v3.0](LICENSE) (AGPLv3).\n\nYou are free to use, modify, and distribute this software under the terms of the AGPL. If you run a modified version of Subsample as part of a network service, you must make the source code available to its users.\n\nAll runtime dependencies are permissively licensed (MIT, ISC, BSD-3-Clause, Apache-2.0) and compatible with AGPLv3.\n\n## Commercial licensing\n\nIf you wish to use Subsample in a proprietary or closed-source product without the obligations of the AGPL, please contact [simon.holliday@protonmail.com] to discuss a commercial license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonholliday%2Fsubsample","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonholliday%2Fsubsample","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonholliday%2Fsubsample/lists"}