https://github.com/georgemandis/cacophony
Sound classification CLI powered by native macOS SoundAnalysis. Classify 300+ sound types from audio files or live microphone input. Written in Zig.
https://github.com/georgemandis/cacophony
apple audio cli command-line-tool macos native sound sound-classification zig
Last synced: 2 days ago
JSON representation
Sound classification CLI powered by native macOS SoundAnalysis. Classify 300+ sound types from audio files or live microphone input. Written in Zig.
- Host: GitHub
- URL: https://github.com/georgemandis/cacophony
- Owner: georgemandis
- Created: 2026-05-28T23:55:36.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-06-13T00:06:31.000Z (17 days ago)
- Last Synced: 2026-06-16T12:33:31.568Z (14 days ago)
- Topics: apple, audio, cli, command-line-tool, macos, native, sound, sound-classification, zig
- Language: Zig
- Size: 14.6 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# cacophony
Sound classification from the command line, powered by native macOS SoundAnalysis.
Classify 300+ types of sounds from audio files or live microphone input. Dogs barking, glass breaking, sirens, laughter, musical instruments, speech, rain — all recognized on-device using Apple's built-in sound classifier. No API keys, no downloads, no network calls.
Written in Zig. Uses Apple's SoundAnalysis and AVFoundation frameworks via Objective-C runtime bindings.
## Install
### Homebrew
```bash
brew install georgemandis/tap/cacophony
```
### From source
Requires [Zig 0.16+](https://ziglang.org/download/) and macOS.
```bash
git clone https://github.com/georgemandis/cacophony.git
cd cacophony
zig build -Doptimize=ReleaseFast
```
## Usage
### Classify an audio file
```bash
$ cacophony classify recording.wav
0.9792 speech
0.5980 sigh
0.5331 mosquito_buzz
0.4602 insect
0.4339 click
$ cacophony classify song.mp3 --top=3 --json
[{"label":"music","confidence":0.9812},{"label":"guitar","confidence":0.7234},{"label":"singing","confidence":0.6891}]
```
### Listen to the microphone
```bash
$ cacophony listen
Listening for 5.0s...
0.5530 sneeze
0.4091 wind_instrument
0.3979 speech
0.3402 saxophone
0.3345 music
$ cacophony listen --duration=3000 --threshold=0.5 --top=3
Listening for 3.0s...
0.8234 dog_bark
0.6012 animal
0.5891 domestic_animal
```
### List all sound categories
```bash
$ cacophony categories | head -10
speech
shout
yell
battle_cry
children_shouting
screaming
whispering
laughter
baby_laughter
giggling
$ cacophony categories | wc -l
303
$ cacophony categories --json | jq '.[0:5]'
["speech","shout","yell","battle_cry","children_shouting"]
```
## Composability
```bash
# Check if a recording contains speech
cacophony classify recording.wav --top=1 | grep speech
# Monitor for specific sounds
cacophony listen --duration=10000 --json | jq '.[] | select(.label == "dog_bark")'
# Search for a sound category
cacophony categories | grep drum
```
## Options
```
cacophony [options]
Commands:
classify Classify sounds in an audio file
listen Classify sounds from the microphone
categories List all recognized sound categories
Options:
--top=N Show top N classifications (default: 5)
--threshold=N Minimum confidence 0.0-1.0 (default: 0.0)
--duration=MS Listen duration in ms (default: 5000)
--json Output as JSON
--help, -h Show this help message
--version, -v Show version
```
## Requirements
- macOS 10.15+ (Catalina or later)
- Zig 0.16+
## How It Works
cacophony uses Apple's [SoundAnalysis](https://developer.apple.com/documentation/soundanalysis) framework with the built-in Version 1 sound classifier, which recognizes 303 sound categories.
- **File analysis:** `SNAudioFileAnalyzer` loads audio files via `AVAudioFile` and runs classification synchronously
- **Live mic:** `AVAudioEngine` captures microphone input, feeding audio buffers to `SNAudioStreamAnalyzer` via an audio tap
- **Observer pattern:** Runtime class creation (`objc_allocateClassPair`) implementing `SNResultsObserving` protocol
- **Block ABI:** ObjC block construction (`_NSConcreteStackBlock`) for the audio engine tap callback
## Related Projects
- [lingua](https://github.com/georgemandis/lingua) — NLP CLI (NaturalLanguage framework)
- [tezcatl](https://github.com/georgemandis/tezcatl) — Headless web rendering CLI (WebKit)
- [loupe](https://github.com/georgemandis/loupe) — Computer vision CLI (Vision framework)
- [whereami](https://github.com/georgemandis/whereami) — Location CLI (CoreLocation)
- [nearme](https://github.com/georgemandis/nearme) — Local search CLI (MapKit)
## Credits
Created by [George Mandis](https://george.mand.is) during [Recurse Center](https://www.recurse.com/).