https://github.com/agarnung/cv-theremin

computer-vision handtracking image-processing music python theremin


# cv-theremin
A digital theremin based on hand tracking with a 2D camera 🎶.

The Theremin operates in one (and only one) of the following modes:

🖐️ **Naïve Method** – Maps 2D structural and morphological features of the hand to tones.

🧠 **Fuzzy Logic** – Uses hand pose and geometry to intelligently determine tones.

📏 **Pinhole Depth Estimation** – Calculates hand distance from the camera and translates it into sound (see [theory](./cameraCalibration/README.md)).
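As a rough illustration of the pinhole mode listed above, the sketch below applies the standard similar-triangles relation `Z = f * W / w` and then maps the estimated distance to a pitch. It is not taken from this repository: the function names, the assumed hand width, the distance range, the frequency range, and the "closer hand gives a higher pitch" convention are all hypothetical choices made for the example.

```python
def estimate_distance_m(focal_length_px: float, real_width_m: float, pixel_width: float) -> float:
    """Pinhole (similar triangles) estimate: Z = f * W / w.

    focal_length_px: focal length in pixels (from camera calibration).
    real_width_m:    assumed real-world width of the hand, in metres.
    pixel_width:     measured width of the hand in the image, in pixels.
    """
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return focal_length_px * real_width_m / pixel_width


def distance_to_frequency_hz(
    distance_m: float,
    near_m: float = 0.2,    # assumed closest playable distance
    far_m: float = 0.8,     # assumed farthest playable distance
    low_hz: float = 220.0,  # assumed pitch at far_m
    high_hz: float = 880.0, # assumed pitch at near_m
) -> float:
    """Map a distance to a frequency on an exponential (musical) scale."""
    clamped = min(max(distance_m, near_m), far_m)
    t = (far_m - clamped) / (far_m - near_m)  # 0.0 at far_m, 1.0 at near_m
    return low_hz * (high_hz / low_hz) ** t


# Hypothetical numbers: 600 px focal length, 8 cm wide hand seen as 120 px.
distance = estimate_distance_m(600.0, 0.08, 120.0)
print(f"distance: {distance:.2f} m, pitch: {distance_to_frequency_hz(distance):.1f} Hz")
```

The exponential mapping is used because equal steps in distance then correspond to equal musical intervals; a linear mapping would also work but compresses the high notes.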

## Tools used

- [Python v3.10.12](https://docs.python-guide.org/starting/install3/linux/) for programming
- [`opencv_contrib_python`](https://github.com/opencv/opencv-python?tab=readme-ov-file#installation-and-usage) for camera access and frame manipulation
- [`mediapipe`](https://github.com/google-ai-edge/mediapipe) for hand tracking
- [`pyo`](https://github.com/belangeo/pyo) for audio management
- [`pipreqs`](https://github.com/bndr/pipreqs) for listing all requirements
- [`skfuzzy`](https://pythonhosted.org/scikit-fuzzy/) for the fuzzy-set implementation of tone control
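
To give a feel for what fuzzy-set tone control means, here is a minimal plain-Python sketch. It deliberately does not use the `skfuzzy` API and is not the project's implementation: the input (a normalized hand height in `[0, 1]`), the three triangular sets, the representative frequencies, and the weighted-average (zero-order Sugeno-style) defuzzification are all assumptions chosen for illustration.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over a normalized hand height in [0, 1]:
# (feet and peak of the triangle, representative output frequency in Hz).
FUZZY_SETS = {
    "low":  ((-0.5, 0.0, 0.5), 220.0),
    "mid":  ((0.0, 0.5, 1.0), 440.0),
    "high": ((0.5, 1.0, 1.5), 880.0),
}


def fuzzy_tone_hz(hand_height: float) -> float:
    """Blend the representative frequencies by membership degree."""
    x = min(max(hand_height, 0.0), 1.0)  # clamp to the universe of discourse
    weights = {
        name: triangular(x, *params) for name, (params, _) in FUZZY_SETS.items()
    }
    total = sum(weights.values())  # equals 1.0 for these overlapping sets
    return sum(weights[name] * FUZZY_SETS[name][1] for name in FUZZY_SETS) / total


for height in (0.0, 0.25, 0.5, 1.0):
    print(height, fuzzy_tone_hz(height))
```

Because neighbouring sets overlap, the output changes smoothly as the hand moves instead of jumping between fixed notes, which is the main appeal of a fuzzy controller here.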

## Usage of the theremin in a virtual environment

Install venv, e.g.:
`sudo apt install python3.10-venv`

Create a virtual environment:
`python3 -m venv test`

Activate it (from inside the newly created `test` directory):
`source ./bin/activate`

Open VS Code:
`code .`

Make sure VS Code uses the virtual environment's Python interpreter, e.g.:
`../bin/python3`

Install required dependencies:
`python3 -m pip install -r requirements.txt`

Run main program:
`python3 main.py`

## To run tests

`pytest tests/camera_test.py -v --tb=short --camera 0`

`python3 tests/handtracking_test.py`

## Other libraries considered but not used

- [pygame](https://www.pygame.org/news)
- [sounddevice](https://python-sounddevice.readthedocs.io/en/0.5.1/)
- [pyaudio](https://people.csail.mit.edu/hubert/pyaudio/)

## Interesting links

- https://mediapipe.readthedocs.io/en/latest/solutions/hands.html
- https://medium.com/@stevehiehn/how-to-generate-music-with-python-the-basics-62e8ea9b99a5
- https://www.youtube.com/watch?v=m_rmwcUREeY
- https://gist.github.com/sahithyen/b20922c902620e5bd6fd926263a93836
- https://splice.com/blog/how-theremin-works/

## References

- Bartelt, T. L. M. (2001). Industrial Control Electronics: Devices, Systems & Applications (2nd ed.). Thomson Delmar Learning, p. 73.

## TODO

- Mobile or web version.

- Build a static HUD-style interface like [this one](https://www.linkedin.com/posts/francastano_visiaejnartificial-inteligenciaartificial-activity-7307811734059151362-2_2m?utm_source=share&utm_medium=member_desktop&rcm=ACoAAFXLzVABTf15btKvx3DmtCu91bAxFIwl-gs).

- Another operating mode: monocular depth-map estimation with deep learning, instead of the pinhole approximation I implemented; see [this](https://stackoverflow.com/questions/64685185/is-there-a-way-to-generate-real-time-depthmap-from-single-camera-video-in-python) and [this](https://openaccess.thecvf.com/content_ICCV_2019/html/Godard_Digging_Into_Self-Supervised_Monocular_Depth_Estimation_ICCV_2019_paper.html).

- Use status bars in [this style](https://www.linkedin.com/posts/siddhartha-reddy-054321341_python-opencv-mediapipe-activity-7360493210772598784-o8ko?utm_source=share&utm_medium=member_android&rcm=ACoAAFXLzVABTf15btKvx3DmtCu91bAxFIwl-gs).

- Add another plugin that uses a Kalman filter to track the hand for the theremin (size, position, ...); see p. 183 of Simon Prince.