https://github.com/kylehowells/demucs-mlx-swift

Last synced: 2 months ago
JSON representation
Host: GitHub
URL: https://github.com/kylehowells/demucs-mlx-swift
Owner: kylehowells
License: mit
Created: 2026-03-10T01:54:43.000Z (3 months ago)
Default Branch: master
Last Pushed: 2026-03-16T20:45:27.000Z (3 months ago)
Last Synced: 2026-03-29T16:07:31.913Z (3 months ago)
Language: Swift
Size: 7.4 MB
Stars: 4
Watchers: 1
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

README

          # demucs-mlx-swift

Swift package for Demucs music source separation on Apple Silicon, using the [MLX](https://github.com/ml-explore/mlx-swift) GPU framework.

Separates audio into stems: drums, bass, other, vocals (and guitar + piano for the 6-stem model).

## Features

- `DemucsMLX` library target (importable from macOS/iOS apps)

- `demucs-mlx-swift` CLI demo

- All 8 pretrained Demucs models supported (HTDemucs, HDemucs, Demucs v3)

- Chunked overlap-add inference with configurable segment/overlap/batch/shifts

- Async separation API with progress reporting, ETA estimation, and cancellation

- Multi-format output: WAV (16/24/32-bit), FLAC, ALAC, AAC

- Two-stem mode (e.g. vocals + no_vocals)

- Automatic model download from Hugging Face

## Supported Models

All 8 pretrained Demucs models are supported. Benchmarks on a 3:19 track (M1 Pro, batch_size=8):

| Model | Type | Stems | Time |

|-------|------|-------|------|

| `htdemucs` | HTDemucs (hybrid transformer) | 4 | 14.8s |

| `htdemucs_ft` | HTDemucs (fine-tuned, bag of 4) | 4 | 95.8s |

| `htdemucs_6s` | HTDemucs (6-stem) | 6 | 17.7s |

| `hdemucs_mmi` | HDemucs (hybrid, bag of 4) | 4 | 44.5s |

| `mdx` | Demucs v3 + HDemucs (bag of 4) | 4 | 140.7s |

| `mdx_extra` | HTDemucs (bag of 4) | 4 | 163.7s |

| `mdx_q` | Demucs v3 + HDemucs (bag of 4) | 4 | 120.6s |

| `mdx_extra_q` | HTDemucs (bag of 4) | 4 | 153.3s |

Model files are downloaded automatically from [Hugging Face](https://huggingface.co/iky1e/demucs-mlx) on first use.

## Requirements

- Swift 6.2+

- macOS 14+ or iOS 17+

- Xcode 15+

- Apple Silicon

## Installation (SPM)

```swift

dependencies: [

    .package(url: "https://github.com/kylehowells/demucs-mlx-swift", branch: "master")

]

```

Then add product dependency and import:

```swift

import DemucsMLX

```

## Library Usage

### Synchronous

```swift

import DemucsMLX

let separator = try DemucsSeparator(modelName: "htdemucs")

let result = try separator.separate(fileAt: URL(fileURLWithPath: "song.mp3"))

for (source, audio) in result.stems {

    let url = URL(fileURLWithPath: "\(source).wav")

    try AudioIO.writeAudio(audio, to: url, format: .wav(bitDepth: .int16))

}

```

### Async with Progress and Cancellation

```swift

let separator = try DemucsSeparator(modelName: "htdemucs")

let cancelToken = DemucsCancelToken()

separator.separate(

    fileAt: inputURL,

    cancelToken: cancelToken,

    progress: { progress in

        // Called on main queue

        print("\(Int(progress.fraction * 100))% - \(progress.stage)")

        if let eta = progress.estimatedTimeRemaining {

            print("ETA: \(Int(eta))s")

        }

    },

    completion: { result in

        // Called on main queue

        switch result {

        case .success(let separation):

            for (source, audio) in separation.stems {

                try? AudioIO.writeAudio(audio, to: outputDir.appendingPathComponent("\(source).wav"))

            }

        case .failure(let error):

            print("Error: \(error)")

        }

    }

)

// To cancel:

cancelToken.cancel()

```

### Custom Parameters

```swift

let separator = try DemucsSeparator(

    modelName: "htdemucs_ft",

    parameters: DemucsSeparationParameters(

        shifts: 2,         // shift augmentations (improves quality, multiplies time)

        overlap: 0.25,     // overlap ratio between segments

        split: true,       // chunked overlap-add inference

        segmentSeconds: nil, // nil = use model default

        batchSize: 1,      // chunks processed in parallel (1 = lowest memory)

        seed: 42           // deterministic shifts

    ),

    modelDirectory: URL(fileURLWithPath: "/path/to/models")

)

```

## CLI Demo

Build:

```bash

swift build -c release

```

Run:

```bash

.build/release/demucs-mlx-swift track.mp3 -o separated

```

Options:

| Option | Description |

|--------|-------------|

| `-n, --name` | Model name (default: `htdemucs`) |

| `-o, --out` | Output directory (default: `separated`) |

| `--model-dir` | Local model directory |

| `--segment` | Segment length in seconds |

| `--overlap` | Overlap ratio [0, 1) (default: 0.25) |

| `--shifts` | Shift augmentations (default: 1) |

| `--seed` | Random seed for deterministic shifts |

| `-b, --batch-size` | Chunk batch size (default: 1) |

| `--no-split` | Disable chunked overlap-add |

| `--two-stems` | Output one stem + complement (e.g. `vocals`) |

| `--async` | Use async API with progress reporting |

| `--list-models` | List available models |

| `--mp3` | Output as AAC in .m4a |

| `--flac` | Output as FLAC lossless |

| `--alac` | Output as Apple Lossless in .m4a |

| `--int24` | Output 24-bit integer WAV |

| `--float32` | Output 32-bit float WAV |

Examples:

```bash

# Separate vocals only

.build/release/demucs-mlx-swift song.mp3 --two-stems vocals -o out

# Use fine-tuned model with FLAC output

.build/release/demucs-mlx-swift song.mp3 -n htdemucs_ft --flac -o out

# 6-stem separation (drums, bass, other, vocals, guitar, piano)

.build/release/demucs-mlx-swift song.mp3 -n htdemucs_6s -o out

# Async with progress bar

.build/release/demucs-mlx-swift song.mp3 --async -o out

```

## Performance Tuning

The default `batchSize=1` provides the best balance of speed and memory for most models. Larger batch sizes increase memory usage and are often slower due to exceeding GPU cache limits.

Benchmarks on a 3:13 track (M1 Pro):

| Model | bs=1 | bs=4 | bs=8 |

|-------|------|------|------|

| `htdemucs` | 9.3s / 1.2GB | **8.5s** / 1.8GB | 10.3s / 2.1GB |

| `htdemucs_6s` | **9.3s / 1.5GB** | 10.3s / 2.5GB | 45.1s / 2.1GB |

| `hdemucs_mmi` | **9.2s / 2.1GB** | 19.6s / 3.2GB | 32.2s / 3.6GB |

For memory-constrained environments (iOS apps), also consider limiting MLX's memory cache:

```swift

import MLX

// Reduce MLX cache to prevent holding onto large GPU allocations after use.

// Default is unlimited; 2 MB keeps peak RSS under ~1.3 GB with no speed penalty

// for single-model inference. Larger caches (unlimited) can give ~25% speedup

// on some workloads at the cost of ~5x memory retention.

Memory.cacheLimit = 2 * 1024 * 1024

```

## Model Resolution

Models are resolved in this order:

1. Explicit `--model-dir` (or library `modelDirectory` parameter)

2. `DEMUCS_MLX_SWIFT_MODEL_DIR` environment variable

3. `~/.cache/demucs-mlx-swift-models/`

4. Local paths: `.scratch/models/`, `Models/`, `./`

5. Hugging Face download (default repo: `iky1e/demucs-mlx`)

Environment overrides:

- `DEMUCS_MLX_SWIFT_MODEL_REPO` — set to a Hub repo ID (`org/repo`) or URL.

## Metal Shader Library (Required for MLX Inference)

MLX inference requires `mlx.metallib`.

After `swift build`, generate it with:

```bash

./scripts/build_mlx_metallib.sh release

```

If you run an `xcodebuild`/DerivedData binary, place `mlx.metallib` next to that executable.

If you see `missing Metal Toolchain`, run:

```bash

xcodebuild -downloadComponent MetalToolchain

```

## Make Targets

```bash

make build   # release + mlx.metallib

make debug   # debug + mlx.metallib

make test

make clean

```

## Exporting Models from PyTorch

A script is included to export all 8 pretrained models directly from the original PyTorch Demucs package:

```bash

pip install demucs safetensors numpy

python scripts/export_from_pytorch.py --out-dir ~/.cache/demucs-mlx-swift-models

```
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/kylehowells/demucs-mlx-swift

Awesome Lists containing this project

README