https://github.com/sevagh/pitch-detection

autocorrelation-based O(NlogN) pitch detection
Host: GitHub
URL: https://github.com/sevagh/pitch-detection
Owner: sevagh
License: mit
Created: 2015-04-27T21:44:44.000Z (about 9 years ago)
Default Branch: master
Last Pushed: 2023-12-27T13:49:09.000Z (7 months ago)
Last Synced: 2024-03-15T16:12:21.586Z (4 months ago)
Topics: autocorrelation, dsp, fft, mpm, pitch-detection, pitch-estimation, pitch-tracking, pyin, yin
Language: C++
Homepage:
Size: 2.62 MB
Stars: 548
Watchers: 27
Forks: 66
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

# pitch-detection

Autocorrelation-based C++ pitch detection algorithms with **O(nlogn) or lower** running time:

* McLeod pitch method - [2005 paper](http://miracle.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf) - [visualization](./misc/mcleod)

* YIN(-FFT) - [2002 paper](http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf) - [visualization](./misc/yin)

* Probabilistic YIN - [2014 paper](https://www.eecs.qmul.ac.uk/~simond/pub/2014/MauchDixon-PYIN-ICASSP2014.pdf)

* Probabilistic MPM - [my own invention](./misc/probabilistic-mcleod)

The size of the FFT used is the same as the size of the input waveform, such that the output is a single pitch for the entire waveform.

Librosa (among other libraries) uses the STFT to create _frames_ of the input waveform, and applies pitch tracking to each frame with a fixed FFT size (typically 2048 or some other power of two). If you want to track the temporal evolution of pitches in sub-sections of the waveform, you have to handle the waveform splitting yourself (look at [wav_analyzer](./wav_analyzer/wav_analyzer.cpp) for more details).
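As a rough sketch of handling the splitting yourself, the helper below cuts a waveform into fixed-size, non-overlapping frames and runs an estimator on each one. The name `track_pitch` and the callback signature are assumptions made for this example; they are not part of the library, and wav_analyzer may do this differently:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper, NOT part of this library: split `audio` into
// non-overlapping frames of `frame_size` samples and return one pitch per
// frame. A trailing partial frame is dropped.
std::vector<double> track_pitch(
    const std::vector<double> &audio,
    int sample_rate,
    std::size_t frame_size,
    const std::function<double(const std::vector<double> &, int)> &estimator)
{
    std::vector<double> pitches;
    if (frame_size == 0) {
        return pitches;
    }

    for (std::size_t start = 0; start + frame_size <= audio.size();
         start += frame_size) {
        const auto first = audio.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = first + static_cast<std::ptrdiff_t>(frame_size);

        // Copy the frame so the estimator sees a buffer of exactly frame_size.
        const std::vector<double> frame(first, last);
        pitches.push_back(estimator(frame, sample_rate));
    }
    return pitches;
}
```

With the library linked, `estimator` could be a lambda that forwards each frame to one of the `pitch` namespace functions.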

## :postal_horn: Latest news :newspaper: 

Dec 27, 2023 :santa: release:

* Removed SWIPE' algorithm

    * It is not based on autocorrelation, I skipped it in all of the tests, and my implementation was basically copy-pasted from [kylebgorman/swipe](https://github.com/kylebgorman/swipe): just use their code instead!

* Fix autocorrelation (in YIN and MPM) for power-of-two sizes in FFTS (see [ffts issue #65](https://github.com/anthonix/ffts/issues/65)) by using r2c/c2r transforms (addresses [bug #72](https://github.com/sevagh/pitch-detection/issues/72) reported by jeychenne)

* Fix PYIN bugs to pass all test cases (addresses jansommer's comments in [pull-request #84](https://github.com/sevagh/pitch-detection/pull/84#issuecomment-1843623594))

* Added many more unit tests, all passing (228/228)

## Other programming languages

* Go: [Go implementation of YIN](./misc/yin) in this repo (for tutorial purposes)

* Rust: [Rust implementation of MPM](./misc/mcleod) in this repo (for tutorial purposes)

* Python: [transcribe](https://github.com/sevagh/transcribe) is a Python version of MPM for a proof-of-concept of primitive pitch transcription

* Javascript (WebAssembly): [pitchlite](https://github.com/sevagh/pitchlite) has WASM modules of MPM/YIN running at realtime speeds in the browser, and also introduces sub-chunk detection to return the overall pitch of the chunk and the temporal sub-sequence of pitches within the chunk

## Usage

Suggested usage of this library can be seen in the utility [wav_analyzer](./wav_analyzer), which divides a wav file into chunks of 0.01s and checks the pitch of each chunk. The basic per-chunk calls look like this:

```
std::vector<float> chunk; // chunk of audio

float pitch_mpm = pitch::mpm<float>(chunk, sample_rate);
float pitch_yin = pitch::yin<float>(chunk, sample_rate);
```

## Tests

### Unit tests

There are unit tests that use sinewaves (both generated with `std::sin` and with [librosa.tone](https://librosa.org/doc/main/generated/librosa.tone.html)), and instrument tests using txt files containing waveform samples from the [University of Iowa MIS](http://theremin.music.uiowa.edu/MIS.html) recordings:

```
$ ./build/pitch_tests
Running main() from ./googletest/src/gtest_main.cc
[==========] Running 228 tests from 22 test suites.
[----------] Global test environment set-up.
[----------] 2 tests from MpmSinewaveTestManualAllocFloat
[ RUN      ] MpmSinewaveTestManualAllocFloat.OneAllocMultipleFreqFromFile
[       OK ] MpmSinewaveTestManualAllocFloat.OneAllocMultipleFreqFromFile (38 ms)
...
[----------] 5 tests from YinInstrumentTestFloat
...
[ RUN      ] YinInstrumentTestFloat.Acoustic_E2_44100
[       OK ] YinInstrumentTestFloat.Acoustic_E2_44100 (1 ms)
[ RUN      ] YinInstrumentTestFloat.Classical_FSharp4_48000
[       OK ] YinInstrumentTestFloat.Classical_FSharp4_48000 (58 ms)
[----------] 5 tests from YinInstrumentTestFloat (174 ms total)
...
[----------] 5 tests from MpmInstrumentTestFloat
[ RUN      ] MpmInstrumentTestFloat.Violin_A4_44100
[       OK ] MpmInstrumentTestFloat.Violin_A4_44100 (61 ms)
[ RUN      ] MpmInstrumentTestFloat.Piano_B4_44100
[       OK ] MpmInstrumentTestFloat.Piano_B4_44100 (24 ms)
...
[==========] 228 tests from 22 test suites ran. (2095 ms total)
[  PASSED  ] 228 tests.
```
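For reference, a sinewave input of the kind described above can be generated with `std::sin` in a few lines. This is only a sketch: the helper name `generate_sinewave` and its parameters are assumptions for illustration and may not match the helpers used in the actual test suite:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT taken from the test suite: fill a buffer with a
// unit-amplitude sine tone at `frequency_hz`, sampled at `sample_rate`.
std::vector<double> generate_sinewave(std::size_t size,
                                      double frequency_hz,
                                      int sample_rate)
{
    const double two_pi = 2.0 * std::acos(-1.0);
    std::vector<double> samples(size);
    for (std::size_t i = 0; i < size; ++i) {
        // Phase advances by 2*pi*f/sample_rate radians per sample.
        samples[i] = std::sin(two_pi * frequency_hz * static_cast<double>(i) /
                              static_cast<double>(sample_rate));
    }
    return samples;
}
```

A buffer produced this way has a known fundamental frequency, so a detector's output can be compared against `frequency_hz` with a small tolerance.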

### Degraded audio tests

All testing files are [here](./misc/degraded_audio_tests). The progressive degradations are described by the respective numbered JSON files, generated using [audio-degradation-toolbox](https://github.com/sevagh/audio-degradation-toolbox). The original clip is a viola playing E3 from the [University of Iowa MIS](http://theremin.music.uiowa.edu/MIS.html). The results come from parsing the output of wav_analyzer to count how many 0.1s slices of the input clip were in the ballpark of the expected value of 164.81 Hz; I considered anything from 160 to 169 Hz to be acceptable:

| Degradation level | MPM # correct | YIN # correct |
| ------------- | ------------- | ------------- |
| 0 | 26 | 22 |
| 1 | 23 | 21 |
| 2 | 19 | 21 |
| 3 | 18 | 19 |
| 4 | 19 | 19 |
| 5 | 18 | 19 |
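The counting step behind the table can be sketched as follows. This is an assumed reconstruction of the idea, not the script that produced the numbers: the helper name `count_in_range` is hypothetical, and treating both bounds as inclusive is an assumption:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT the original analysis script: count how many
// per-slice pitch estimates fall inside an acceptance band (inclusive).
// Defaults match the 160-169 Hz band used for the E3 (164.81 Hz) clip.
std::size_t count_in_range(const std::vector<double> &pitches,
                           double low_hz = 160.0,
                           double high_hz = 169.0)
{
    std::size_t count = 0;
    for (const double pitch : pitches) {
        if (pitch >= low_hz && pitch <= high_hz) {
            ++count;
        }
    }
    return count;
}
```

Slices where a detector reports no pitch, or an octave error such as 329.6 Hz, fall outside the band and are not counted.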

## Build and install

You need Linux, cmake, and gcc (I don't officially support other platforms). The library depends on [ffts](https://github.com/anthonix/ffts) and [mlpack](https://www.mlpack.org/). The tests depend on [libnyquist](https://github.com/ddiakopoulos/libnyquist), [googletest](https://github.com/google/googletest), and [google benchmark](https://github.com/google/benchmark). Dependency graph:

![dep-graph](./misc/deps.png)

Build and install with cmake:

```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release
cmake --build "build"

# install to your system (in a subshell, so the paths below stay relative to the repo root)
(cd build && make install)

# run tests and benches
./build/pitch_tests
./build/pitch_bench

# run wav_analyzer
./build/wav_analyzer
```

### Docker

To simplify the setup, there's a [Dockerfile](./Dockerfile) that sets up an Ubuntu container with all the dependencies for compiling the library and running the included tests and benchmarks:

```bash
# build
$ docker build --rm --pull -f "Dockerfile" -t pitchdetection:latest "."

# run
$ docker run --rm --init -it pitchdetection:latest
```

**n.b.** You can pull the [esimkowitz/pitchdetection](https://hub.docker.com/repository/docker/esimkowitz/pitchdetection) image from DockerHub, but I can't promise that it's up-to-date.

## Detailed usage

Read the [header](./include/pitch_detection.h) and the example [wav_analyzer program](./wav_analyzer).

The namespaces are `pitch` and `pitch_alloc`. The functions and classes are templated for `<double>` and `<float>` support.

The `pitch` namespace functions perform automatic buffer allocation, while `pitch_alloc::{Yin, Mpm}` give you a reusable object (useful for computing pitch for multiple uniformly-sized buffers):

```c++
#include <pitch_detection.h>

#include <vector>

std::vector<double> audio_buffer(8192);

// one-shot functions: buffers are allocated automatically on every call
double pitch_yin = pitch::yin<double>(audio_buffer, 48000);
double pitch_mpm = pitch::mpm<double>(audio_buffer, 48000);
double pitch_pyin = pitch::pyin<double>(audio_buffer, 48000);
double pitch_pmpm = pitch::pmpm<double>(audio_buffer, 48000);

// reusable objects: constructed once for a fixed buffer size
pitch_alloc::Mpm<double> ma(8192);
pitch_alloc::Yin<double> ya(8192);

for (int i = 0; i < 10000; ++i) {
        auto pitch_yin = ya.pitch(audio_buffer, 48000);
        auto pitch_mpm = ma.pitch(audio_buffer, 48000);
        auto pitch_pyin = ya.probabilistic_pitch(audio_buffer, 48000);
        auto pitch_pmpm = ma.probabilistic_pitch(audio_buffer, 48000);
}
```