https://github.com/rzru/nightingale

Machine learning powered Karaoke app (with scores!)

ai bevy-engine demucs karaoke karaoke-application karaoke-game machine-learning machine-learning-algorithms ml party react rust shadcn tailwind tauri tauri-app whisper whisper-ai whisperx

Host: GitHub
URL: https://github.com/rzru/nightingale
Owner: rzru
License: gpl-3.0
Created: 2026-03-15T14:48:09.000Z (about 2 months ago)
Default Branch: master
Last Pushed: 2026-05-10T12:02:10.000Z (3 days ago)
Last Synced: 2026-05-10T12:33:03.159Z (3 days ago)
Topics: ai, bevy-engine, demucs, karaoke, karaoke-application, karaoke-game, machine-learning, machine-learning-algorithms, ml, party, react, rust, shadcn, tailwind, tauri, tauri-app, whisper, whisper-ai, whisperx
Language: TypeScript
Homepage: https://nightingale.cafe
Size: 158 MB
Stars: 1,102
Watchers: 11
Forks: 72
Open Issues: 13
Metadata Files:
- Readme: README.md
- License: LICENSE

README

Karaoke from any song in your music library, powered by neural networks.

---

Nightingale scans your music folder, separates lead vocals from instrumentals using the [UVR Karaoke model](https://github.com/Anjok07/ultimatevocalremovergui) (or [Demucs](https://github.com/facebookresearch/demucs)), transcribes lyrics with word-level timestamps via [WhisperX](https://github.com/m-bain/whisperX), and plays it all back with synchronized highlighting, pitch scoring, key/tempo controls, profiles, and dynamic backgrounds.

Ships as a single binary. No manual installation of Python, ffmpeg, or ML models required — everything is downloaded and bootstrapped automatically on first launch.

## Features

🎤 **Stem Separation** — isolates lead vocals from instrumentals using the UVR Karaoke model (default) or Demucs, with adjustable guide vocal volume. The karaoke model preserves backing vocals in the instrumental for a more natural sound

📝 **Word-Level Lyrics** — automatic transcription with alignment, or fetched from [LRCLIB](https://lrclib.net) when available. CJK songs (Japanese / Chinese / Korean) get per-character forced alignment and romanized readings (Hepburn / pinyin / Revised Romanization) shown above each token

🗣️ **Pluggable ASR Engines** — choose Whisper (default, broad language coverage) or **Parakeet v3 (experimental)** for ~25 European languages, with NeMo on CUDA and ONNX Runtime everywhere else

🎼 **UltraStar Deluxe Songs (experimental)** — drop USDX song folders (`.txt` or `.usdx` plus sibling audio/vocals/instrumental/video) into your library; pitch and lyric data come from the file directly, no analyzer pass needed. See [docs/usdx](site/docs/src/usdx.md)

🎯 **Pitch Scoring** — real-time microphone input with pitch detection, star ratings, and per-song scoreboards

🎚️ **Key & Tempo Shifts** — adjust song key and tempo after analysis, with cached playback variants for quick retries

👤 **Profiles** — create and switch between player profiles; scores are tracked per profile

🎬 **Video Files** — drop video files (`.mp4`, `.mkv`, etc.) into your music folder; vocals are separated from the audio track and the original video plays as a synchronized background

🌌 **Audio-Reactive Backgrounds** — 10 GPU shaders that react to your microphone in real time (Plasma, Waves, Nebula, Starfield, Sonar, Voronoi, Vortex, Metaballs, Spectrum, Oscilloscope), Pixabay video loops in 5 flavors (Nature, Underwater, Space, City, Countryside), plus source-video playback for video files

🧭 **Sidebar + Library Filters** — quick filters, metadata cleanup buckets, artist/album groups, and an **Analyze All** action for bulk analysis

🎙️ **Mic Mirroring** — optionally route your live mic into playback for low-latency practice and monitoring, with an adjustable monitor gain (0–200%) in Settings

🎮 **Gamepad Support** — full navigation and control via gamepad (D-pad, sticks, face buttons)

📺 **Adaptive UI Scaling** — scales to any resolution including 4K TVs

📦 **Self-Contained** — ffmpeg, uv, Python, PyTorch, and ML packages are downloaded automatically during setup. Video backgrounds are pre-downloaded so the first session is ready to go

⬆️ **In-App Updates** — auto-checks for new releases at launch, badges the sidebar avatar when one is available, and downloads and installs signed updates with one click on Linux, macOS, and Windows

## Quick start

Download the latest release for your platform from the [Releases](../../releases) page and run it. On first launch, Nightingale shows setup steps, lets you pick a data folder, then installs the Python environment and ML models automatically.

## Updates

Nightingale checks for new releases once at launch. When one is available, the sidebar avatar grows a small green dot and the **Update** entry in the dropdown menu opens a dialog with the release notes. Click **Install & Restart** and the app downloads the signed bundle, installs it, and relaunches.

The in-app updater works for Linux, macOS, and Windows. On Windows the installer runs in `passive` mode — a small progress window flashes and the app comes back automatically once the install finishes.

### macOS

macOS quarantines files downloaded from the internet. Since Nightingale isn't signed with an Apple Developer ID, Gatekeeper will block it with a message like _"app is damaged and can't be opened"_. To fix this, move Nightingale.app to Applications, then remove the quarantine attribute:

```bash
xattr -cr /Applications/Nightingale.app
```

### Supported formats

Audio: `.mp3`, `.flac`, `.ogg`, `.wav`, `.m4a`, `.aac`, `.wma`. Video: `.mp4`, `.mkv`, `.avi`, `.webm`, `.mov`, `.m4v`. UltraStar: `.usdx`, plus `.txt` files whose contents look like USDX.
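
As a rough illustration of the format list above, the sketch below sorts a library entry into audio, video, or UltraStar by extension. The function names and the "looks like USDX" heuristic (checking for `#TITLE:` and `#ARTIST:` header tags) are assumptions made for this example, not Nightingale's actual detection logic.

```python
from pathlib import Path
from typing import Optional

AUDIO_EXTS = {".mp3", ".flac", ".ogg", ".wav", ".m4a", ".aac", ".wma"}
VIDEO_EXTS = {".mp4", ".mkv", ".avi", ".webm", ".mov", ".m4v"}

# Assumption: a .txt file "looks like USDX" when it carries these header tags.
USDX_REQUIRED_TAGS = ("#TITLE:", "#ARTIST:")


def looks_like_usdx(text: str) -> bool:
    """Heuristic check for UltraStar header tags near the top of a text file."""
    head = [line.strip().lstrip("\ufeff").upper() for line in text.splitlines()[:50]]
    return all(any(line.startswith(tag) for line in head) for tag in USDX_REQUIRED_TAGS)


def classify(path: str, text: Optional[str] = None) -> str:
    """Return 'audio', 'video', 'ultrastar', or 'unsupported' for a library entry."""
    ext = Path(path).suffix.lower()
    if ext in AUDIO_EXTS:
        return "audio"
    if ext in VIDEO_EXTS:
        return "video"
    if ext == ".usdx":
        return "ultrastar"
    # .txt files only count when their contents were read and pass the heuristic.
    if ext == ".txt" and text is not None and looks_like_usdx(text):
        return "ultrastar"
    return "unsupported"
```

Extension matching is case-insensitive here, so `Song.MP3` is treated the same as `song.mp3`.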

## Controls

### Navigation

| Action | Keyboard | Gamepad |
| ---------------- | -------------- | ------------------ |
| Move | Arrow keys | D-pad / Left stick |
| Confirm / Select | Enter | A (South) |
| Back / Cancel | Escape | B (East) / Start |
| Switch panel | Tab | — |
| Search songs | Type to filter | — |

### Playback

| Action | Keyboard | Gamepad |
| ----------------------- | ----------------- | --------- |
| Pause / Resume | Space | Start |
| Exit to menu | Escape | B (East) |
| Toggle guide vocals | G | — |
| Guide volume up/down | + / - | — |
| Cycle background theme | T | — |
| Cycle video flavor | F | — |
| Toggle microphone | M | — |
| Next microphone | N | — |
| Toggle mic mirroring | R | — |
| Toggle fullscreen | F11 | — |
| Skip Intro / Skip Outro | On-screen buttons | A (South) |

## How it works

```mermaid
flowchart TD
A["Audio or video file"] --> B["UVR Karaoke / Demucs"]
A2["USDX bundle (.txt / .usdx)"] --> E["Tauri App (Rust + React)"]
B -->|"vocals + instrumental"| C["LRCLIB"]
C -->|"synced lyrics if available"| D["WhisperX or Parakeet v3 (exp.)"]
D -->|"word-level alignment, CJK reading"| E
E --> F["Plays instrumental + synced lyrics with pitch scoring, key/tempo, mic mirroring, audio-reactive backgrounds"]
```

The analyzer runs as a persistent local process: Nightingale starts it once and talks to it over a token-authenticated loopback TCP socket using newline-delimited JSON, so per-song startup overhead (model load, CUDA init) is paid only once.
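
To make the transport concrete, here is a minimal sketch of a token-authenticated loopback server that speaks newline-delimited JSON. The message fields (`token`, `cmd`, `id`), the `ping` command, and all function names are hypothetical; the real analyzer protocol is not documented in this README.

```python
import json
import secrets
import socket
import socketserver
import threading


class AnalyzerHandler(socketserver.StreamRequestHandler):
    """One connection: the first line carries the token, then one JSON request per line."""

    def handle(self):
        hello = json.loads(self.rfile.readline())
        supplied = str(hello.get("token", "")).encode("utf-8")
        # Constant-time comparison so the token cannot be probed via timing.
        if not secrets.compare_digest(supplied, self.server.token.encode("utf-8")):
            self._send({"ok": False, "error": "bad token"})
            return
        self._send({"ok": True})
        for raw in self.rfile:  # each line is one JSON message
            request = json.loads(raw)
            # Hypothetical command; a real analyzer would dispatch to its pipeline here.
            if request.get("cmd") == "ping":
                self._send({"id": request.get("id"), "result": "pong"})
            else:
                self._send({"id": request.get("id"), "error": "unknown command"})

    def _send(self, message):
        self.wfile.write((json.dumps(message) + "\n").encode("utf-8"))


def start_server(token):
    """Bind to 127.0.0.1 on an OS-chosen port and serve in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), AnalyzerHandler)
    server.token = token
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(port, token, request):
    """Connect, authenticate, send one request, and return the decoded reply."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        stream = sock.makefile("rwb")
        stream.write((json.dumps({"token": token}) + "\n").encode("utf-8"))
        stream.flush()
        if not json.loads(stream.readline()).get("ok"):
            raise PermissionError("analyzer rejected the token")
        stream.write((json.dumps(request) + "\n").encode("utf-8"))
        stream.flush()
        return json.loads(stream.readline())
```

Because the server outlives individual connections, anything expensive it loads at startup stays in memory between requests, which is the benefit the paragraph above describes.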

Analysis results are cached using blake3 file hashes. Re-analysis happens only if the source file changes, you trigger it manually, or you shift key/tempo and create playback variants. USDX songs skip stem separation entirely when `#VOCALS` and `#INSTRUMENTAL` are provided.
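
The idea of content-addressed caching can be sketched in a few lines. Note the substitutions: this example uses BLAKE2b from Python's standard library as a stand-in for blake3 (which needs a third-party package), and the `<digest>.json` cache layout and function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large media files are not loaded at once."""
    # Stand-in: BLAKE2b from the standard library, NOT the blake3 hash the README names.
    hasher = hashlib.blake2b(digest_size=32)
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def needs_analysis(source, cache_dir, force=False):
    """Return (digest, stale): stale is True when no cached result matches the contents."""
    digest = file_digest(source)
    # Hypothetical layout: one "<digest>.json" result per analyzed file.
    cached = Path(cache_dir) / f"{digest}.json"
    return digest, force or not cached.exists()
```

Keying the cache on file contents rather than on path or modification time means a renamed or moved song still hits the cache, while any edit to the audio produces a new digest and triggers analysis.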

## Hardware

The Python analyzer uses PyTorch and auto-detects the best backend:

| Backend | Device | Notes |
| ------- | ------------- | ------------------------------------------- |
| CUDA | NVIDIA GPU | Fastest |
| MPS | Apple Silicon | macOS; WhisperX alignment falls back to CPU |
| CPU | Any | Slowest but always works |

The UVR Karaoke model uses ONNX Runtime and enables CUDA acceleration automatically on NVIDIA GPUs, or CoreML on Apple Silicon.

A song typically takes 2–5 minutes on GPU, 10–20 minutes on CPU.
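
The priority order in the backend table can be sketched as follows. The function names are made up for this example, and the probing logic is an assumption about how such detection is commonly written with PyTorch, not a copy of the analyzer's code.

```python
def pick_torch_device(cuda_available, mps_available):
    """Follow the table's priority: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_torch_device():
    """Probe PyTorch if it is installed; report 'cpu' when it is not."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no torch.backends.mps module, so guard the lookup.
    mps = getattr(torch.backends, "mps", None)
    return pick_torch_device(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )


def alignment_device(device):
    """Per the table, WhisperX alignment falls back to the CPU when the device is MPS."""
    return "cpu" if device == "mps" else device
```

Keeping the decision in a pure function (`pick_torch_device`) separate from the hardware probe makes the ordering easy to test on machines without a GPU.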

## Data storage

During setup, you can choose where Nightingale stores data (default: `~/.nightingale`). Most runtime data is stored in that selected data folder, while `config.json` and `nightingale.log` remain in `~/.nightingale`.

Typical selected data folder layout:

```
/
├── cache/                  # Stems, transcripts, lyrics, shifted variants, covers, playable videos
├── songs.db                # SQLite song library and analysis metadata
├── profiles.json           # Player profiles and scores
├── videos/                 # Cached Pixabay video backgrounds
├── sounds/                 # Sound effects (celebration)
├── vendor/
│   ├── ffmpeg              # Downloaded ffmpeg binary
│   ├── uv                  # Downloaded uv binary
│   ├── python/             # Python 3.10 installed via uv
│   ├── venv/               # Virtual environment with ML packages
│   ├── analyzer/           # Extracted analyzer Python scripts
│   └── .ready              # Marker indicating setup is complete
└── models/
    ├── torch/              # Demucs model cache
    ├── huggingface/        # WhisperX model cache
    └── audio_separator/    # UVR Karaoke model cache
```

`~/.nightingale/config.json` stores app settings, including the selected data folder path.
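
A sketch of how a tool could resolve the data folder from that file is shown below. The key name `data_dir` and the function name are hypothetical, since the actual field names in `config.json` are not documented here; the fallback to `~/.nightingale` follows the default stated above.

```python
import json
from pathlib import Path

CONFIG_DIR = Path.home() / ".nightingale"


def resolve_data_dir(config_dir=CONFIG_DIR):
    """Return the selected data folder, defaulting to the config directory itself."""
    config_path = Path(config_dir) / "config.json"
    try:
        settings = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return Path(config_dir)
    if not isinstance(settings, dict):
        return Path(config_dir)
    # "data_dir" is a hypothetical key name; the real field may be called something else.
    selected = settings.get("data_dir")
    return Path(selected).expanduser() if selected else Path(config_dir)
```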

### Video backgrounds

Pixabay video backgrounds use the [Pixabay API](https://pixabay.com/api/docs/). The API key is embedded in release builds. For development, create a `.env` file at the project root:

```
PIXABAY_API_KEY=your_key_here
```
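
For orientation, the sketch below reads a key from `.env`-style text and builds a Pixabay video search URL. The `key`, `q`, and `per_page` parameters and the `/api/videos/` endpoint come from the Pixabay API documentation; the function names and the simple parser are assumptions for this example and do not reflect how Nightingale itself loads the key.

```python
from urllib.parse import urlencode

PIXABAY_VIDEOS_ENDPOINT = "https://pixabay.com/api/videos/"


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def video_search_url(api_key, query, per_page=5):
    """Build a video search URL from documented Pixabay query parameters."""
    params = {"key": api_key, "q": query, "per_page": per_page}
    return PIXABAY_VIDEOS_ENDPOINT + "?" + urlencode(params)
```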

## Building from source

### Prerequisites

| Tool | Version |
| ---------- | --------------------------------------------------------------------------------------------------------------------- |
| Rust | 1.85+ (edition 2024) |
| Node.js | 20+ |
| pnpm | latest |
| Linux only | `libwebkit2gtk-4.1-dev`, `libssl-dev`, `libayatana-appindicator3-dev`, `librsvg2-dev`, `libxdo-dev`, `libasound2-dev` |

### Development

```bash
git clone https://github.com/rzru/nightingale
cd nightingale
cargo desktop dev
```

### Release build

```bash
cargo desktop build
```

## Supported platforms

| Platform | Target |
| -------------- | --------------------------- |
| Linux x86_64 | `x86_64-unknown-linux-gnu` |
| Linux aarch64 | `aarch64-unknown-linux-gnu` |
| macOS ARM | `aarch64-apple-darwin` |
| macOS Intel | `x86_64-apple-darwin` |
| Windows x86_64 | `x86_64-pc-windows-msvc` |

## License

GPL-3.0-or-later — see [LICENSE](LICENSE).
