https://github.com/deezer/skey
Self-supervised key estimation model that matches the performance of supervised state-of-the-art models.
- Host: GitHub
- URL: https://github.com/deezer/skey
- Owner: deezer
- License: mit
- Created: 2025-01-13T15:32:01.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-06-06T20:29:28.000Z (7 months ago)
- Last Synced: 2025-06-06T21:30:46.286Z (7 months ago)
- Language: Python
- Size: 1.11 MB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# S-KEY
`skey` is a Python package for state-of-the-art automatic **musical key detection** from audio recordings, based on the S-KEY model proposed by Yuexuan Kong et al. The package provides an efficient pipeline for loading audio and inferring the musical key with a trained deep learning model, ChromaNet. It will be published as a PyPI package soon.
- 📄 [S-KEY: Self-supervised Learning of Major and Minor Keys from Audio](https://arxiv.org/abs/2501.12907)
- ✅ Accepted at [ICASSP 2025](https://ieeexplore.ieee.org/xpl/conhome/10887540/proceeding)
## Features
- 🎼 End-to-end musical key detection from raw audio
- 🧠 An open-source pretrained model
- ⚙️ Simple CLI and Python API
- 💽 Support for `.wav`, `.mp3`, etc.
- 🔌 CPU and GPU support
## Installation
```bash
poetry install
```
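Since the package is not yet published on PyPI, run the install from a local clone of the repository. A minimal sketch, assuming Poetry is already installed:
```bash
# Clone the repository and install the dependencies with Poetry.
git clone https://github.com/deezer/skey.git
cd skey
poetry install
```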
## Usage
### 🔧 Command Line Interface (CLI)
```bash
poetry run skey path/to/audio --device cpu
```
This will run key detection on the specified audio file or directory using the default model and settings. The prediction is printed if the path is a single audio file, and saved to a `.csv` file if the path is a directory.
To specify additional options, use the following arguments:
```bash
poetry run skey path/to/audio --checkpoint path/to/model.pt --ext mp3 --device cpu
```
- `--checkpoint`: Path to a custom model checkpoint (`.pt`). If not provided, the default model is used.
- `--ext`: Audio file extension to look for when `path/to/audio` is a directory (default: `wav`); for a single file, the extension is inferred from the filename. Supports all formats readable by torchaudio.
- `--device`: Device to run on (default: `cpu`; e.g., `cuda`, `mps`).
**Arguments**:
| Argument | Description |
| ----------------------- | --------------------------------------------------------- |
| `path/to/audio` | Path to directory with audio files or a single audio file |
| `--checkpoint` | Path to model checkpoint (`.pt`). Loads default if not provided. |
| `--ext` | Audio file extension when `path/to/audio` is a directory (default: `wav`); supports all formats readable by torchaudio |
| `--device` | Device to run on (default: `cpu`, e.g., `cuda`, `mps`) |
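For example, running the CLI on the test recording bundled with the repository (listed under `tests/` in the code organization section below) prints the detected key:
```bash
poetry run skey tests/nocturne_n02_in_e-flat_major.mp3 --device cpu
```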
### 🐍 Python API
```python
from skey import detect_key

# Run key detection on a single audio file or a directory of audio files.
detect_key(
    audio_dir="path/to/audio",
    extension="mp3",
    device="cpu",
)
```
**Parameters**:
* `audio_dir` (str): Path to the audio file or directory containing audio files
* `ckpt_path` (str or None, optional): Path to the model checkpoint file. If `None`, the default model is used.
* `extension` (str, optional): File extension (default: `"wav"`)
* `device` (str, optional): Device to run on (default: `cpu`)
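As a further example, here is a sketch of batch inference over a directory of MP3 files with a custom checkpoint, using only the parameters documented above (the paths are placeholders):
```python
from skey import detect_key

# Batch key detection over a directory of MP3 files with a custom checkpoint.
# "path/to/album" and "path/to/model.pt" are placeholder paths.
detect_key(
    audio_dir="path/to/album",
    ckpt_path="path/to/model.pt",
    extension="mp3",
    device="cuda",  # or "cpu" / "mps"
)
```
For a directory input, the results should end up in a `.csv` file, mirroring the CLI behavior described above.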
## 🗂️ Code organization
```
skey
├── Dockerfile
├── LICENSE
├── Makefile
├── poetry.lock
├── pyproject.toml
├── README.md
├── skey
│ ├── __init__.py
│ ├── chromanet.py
│ ├── cli.py
│ ├── convnext.py
│ ├── hcqt.py
│ ├── key_detection.py
│ ├── models
│ │ └── skey.pt
├── tests
│ ├── nocturne_n02_in_e-flat_major.mp3
│ └── test_detect_key.py
└── training_utils
├── config
│ └── skey.gin
├── skey_loss.py
└── skey.py
```
⚠️ The `training_utils/` directory is **not used** in the `skey` package for inference. However, it is **essential** if you plan to **retrain the model**. It contains:
* the full model definition
* the loss functions
* the configuration file (`skey.gin`)
To retrain, you will need to plug in your own dataloader and training loop using this codebase as a foundation.
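Below is a minimal, generic PyTorch training-loop skeleton showing where those pieces would plug in. Everything in it (model, loss, data, hyperparameters) is a dummy placeholder: in a real run you would build ChromaNet from `training_utils/skey.py` (configured via `skey.gin`), use the self-supervised loss from `training_utils/skey_loss.py`, and feed your own audio dataloader.
```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and loss: stand-ins for the S-KEY model definition in
# training_utils/skey.py and the self-supervised loss in skey_loss.py,
# whose exact interfaces are not documented here.
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 24))  # 24 classes = 12 major + 12 minor keys
criterion = nn.CrossEntropyLoss()

# Placeholder dataloader: random "features" and key labels standing in for
# your own audio dataset and augmentation pipeline.
features = torch.randn(32, 1, 64)
labels = torch.randint(0, 24, (32,))
loader = DataLoader(TensorDataset(features, labels), batch_size=8, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
model.train()
for epoch in range(2):
    for batch, target in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch), target)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last batch loss {loss.item():.4f}")
```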
## 📚 Reference
If you use this work in your research, please cite:
```
@INPROCEEDINGS{kongskey2025,
author={Kong, Yuexuan and Meseguer-Brocal, Gabriel and Lostanlen, Vincent and Lagrange, Mathieu and Hennequin, Romain},
booktitle={ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
title={S-KEY: Self-supervised Learning of Major and Minor Keys from Audio},
year={2025},
pages={1-5},
doi={10.1109/ICASSP49660.2025.10890222}}
```
## 📄 License
The code of **S-KEY** is [MIT-licensed](LICENSE).