Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/trishume/popclick
Detecting lip popping noises to trigger an action.
- Host: GitHub
- URL: https://github.com/trishume/popclick
- Owner: trishume
- Created: 2014-11-26T02:36:10.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2016-03-17T22:08:10.000Z (almost 9 years ago)
- Last Synced: 2023-03-12T09:53:44.416Z (almost 2 years ago)
- Topics: algorithm, audio-recognition, spectrograph
- Language: C++
- Size: 46.9 KB
- Stars: 14
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
README
# Lip Popping Recognizer
This project implements a simple algorithm for recognizing when the user
makes a popping noise with their lips. The plan is to use this in concert
with my [eye tracker](https://theeyetribe.com/) as a way to click without
using my hands.

It is currently being rewritten as a [Vamp plugin](http://vamp-plugins.org/) in C++.
I develop the Vamp plugin in Sublime Text, compile it to a dylib, and then load it in [Sonic Visualiser](http://www.sonicvisualiser.org/), which is a great tool for developing audio recognition algorithms.
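To make the workflow concrete, here is a minimal sketch of the scaffolding a Vamp plugin needs, written against the standard Vamp plugin SDK. The class name `PopDetector`, the identifiers, and the `looksLikePop` stub are hypothetical placeholders for illustration, not the actual popclick source:

```cpp
// Minimal Vamp plugin sketch (hypothetical names, not the popclick source).
// Compile against the Vamp plugin SDK and load the resulting dylib in a
// host such as Sonic Visualiser.
#include <vamp-sdk/Plugin.h>
#include <vamp-sdk/PluginAdapter.h>

class PopDetector : public Vamp::Plugin {
public:
    PopDetector(float inputSampleRate)
        : Plugin(inputSampleRate), m_blockSize(0) {}

    std::string getIdentifier() const { return "popdetector"; }
    std::string getName() const { return "Lip Pop Detector"; }
    std::string getDescription() const { return "Detects lip popping noises"; }
    std::string getMaker() const { return "example"; }
    int getPluginVersion() const { return 1; }
    std::string getCopyright() const { return "MIT"; }

    // Ask the host for FFT frames so we can work on the spectrum directly.
    InputDomain getInputDomain() const { return FrequencyDomain; }

    bool initialise(size_t channels, size_t, size_t blockSize) {
        if (channels != 1) return false;
        m_blockSize = blockSize;
        return true;
    }
    void reset() {}

    OutputList getOutputDescriptors() const {
        OutputDescriptor d;
        d.identifier = "pops";
        d.name = "Pops";
        d.description = "Times at which a lip pop was detected";
        d.hasFixedBinCount = true;
        d.binCount = 0;  // point events with no values attached
        d.sampleType = OutputDescriptor::VariableSampleRate;
        d.sampleRate = 0;
        return OutputList(1, d);
    }

    FeatureSet process(const float *const *inputBuffers,
                       Vamp::RealTime timestamp) {
        FeatureSet fs;
        if (looksLikePop(inputBuffers[0])) {
            Feature f;
            f.hasTimestamp = true;
            f.timestamp = timestamp;
            fs[0].push_back(f);
        }
        return fs;
    }

    FeatureSet getRemainingFeatures() { return FeatureSet(); }

private:
    size_t m_blockSize;

    // Placeholder for the recognition logic itself.
    bool looksLikePop(const float *) { return false; }
};

// Export the plugin so Vamp hosts can enumerate it.
static Vamp::PluginAdapter<PopDetector> adapter;

const VampPluginDescriptor *vampGetPluginDescriptor(
        unsigned int version, unsigned int index) {
    if (version < 1 || index > 0) return 0;
    return adapter.getDescriptor();
}
```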
I plan on expanding this to recognize other mouth noises. This is an unsolved problem in audio recognition, as it has very different goals from other domains like speech recognition (an illustrative detector sketch follows this list):

1. Noises are much easier to distinguish from each other. You can look at a spectrogram and identify whether something is the noise you want; not so for a spoken word.
2. The goal is to be resource-efficient and real-time. Speech recognizers generally are not, because good results require heavy processing.
3. Recognition should be highly reliable: very low false positives and only a few false negatives. This is only possible because the noises are so easy to distinguish.
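The kind of distinguishability point 1 describes can be shown with a short sketch: a lip pop appears on the spectrogram as a brief, broadband burst of energy, so even a crude frame-by-frame heuristic can separate pops from sustained sounds. This is only an illustration under that assumption, not the algorithm this repo implements; the function name and thresholds are made up.

```cpp
#include <cstddef>

// Illustrative heuristic only -- not popclick's actual algorithm.
// A lip pop shows up as a sudden burst of energy spread across many
// frequency bins, so we flag frames that are both loud relative to a
// slow-moving average and broadband.
bool frameLooksLikePop(const float *mags, size_t bins, float &avgEnergy) {
    float energy = 0.0f;
    size_t activeBins = 0;
    for (size_t i = 0; i < bins; ++i) {
        energy += mags[i] * mags[i];
        if (mags[i] > 1e-4f) ++activeBins;  // bin carries real signal
    }
    // Thresholds are invented for illustration; real ones would be tuned
    // against recordings inspected in Sonic Visualiser.
    bool transient = energy > 8.0f * avgEnergy;      // sudden burst
    bool broadband = activeBins > bins / 2;          // spread across spectrum
    avgEnergy = 0.95f * avgEnergy + 0.05f * energy;  // slow running average
    return transient && broadband;
}
```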
## Sonic Visualiser Screenshot

![Screenshot](http://i.imgur.com/2UsEBmQ.png)
The top row is a spectrogram, the middle is a debug visualization of the algorithm state, and the bottom row is the waveform and some parameters.
The red lines on the bottom row mark where a lip pop would be recognized (if it were running in real time); the three lip pops and three non-pops are correctly classified.

This is the layout I use for debugging the algorithm: [Sonic Visualiser](http://www.sonicvisualiser.org/) lets me inspect values and scroll and zoom around the test audio file.
## Building
Run `make -f Makefile.osx` on OS X; if you are on another platform, change the `.osx` suffix to select the correct Makefile.