https://github.com/agarnung/cv-theremin
A handtracking and 2D camera based digital theremin
computer-vision handtracking image-processing music python theremin
- Host: GitHub
- URL: https://github.com/agarnung/cv-theremin
- Owner: agarnung
- License: mit
- Created: 2025-01-01T12:14:51.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-25T11:48:41.000Z (over 1 year ago)
- Last Synced: 2025-02-25T12:35:32.627Z (over 1 year ago)
- Topics: computer-vision, handtracking, image-processing, music, python, theremin
- Language: Python
- Homepage:
- Size: 43 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# cv-theremin
A handtracking and 2D camera based digital theremin 🎶.
The theremin operates in exactly one of the following modes:
- 🖐️ **Naïve Method** – Maps 2D structural and morphological features of the hand to tones.
- 🧠 **Fuzzy Logic** – Uses hand pose and geometry to intelligently determine tones.
- 📏 **Pinhole Depth Estimation** – Calculates hand distance from the camera and translates it into sound (see [theory](./cameraCalibration/README.md)).
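As a rough illustration of the pinhole depth mode, the sketch below estimates the hand's distance from its apparent width in the image using the similar-triangles relation `Z = f * W / w`, then maps that distance linearly to a pitch. The function names, the real hand width (0.085 m), the focal length and the frequency range are illustrative assumptions, not values taken from this repository; the linked theory page has the project's own derivation.

```python
def estimate_distance_m(focal_length_px: float,
                        real_width_m: float,
                        width_px: float) -> float:
    """Pinhole model: Z = f * W / w (similar triangles)."""
    if width_px <= 0:
        raise ValueError("apparent width must be positive")
    return focal_length_px * real_width_m / width_px


def distance_to_frequency(distance_m: float,
                          near_m: float = 0.2,
                          far_m: float = 1.0,
                          f_low: float = 220.0,
                          f_high: float = 880.0) -> float:
    """Map distance linearly to pitch: closer hand -> higher tone."""
    # Clamp so that hands outside the working range hit the end tones.
    clamped = min(max(distance_m, near_m), far_m)
    t = (far_m - clamped) / (far_m - near_m)
    return f_low + t * (f_high - f_low)


if __name__ == "__main__":
    # Assumed values: 600 px focal length, 8.5 cm hand seen as 102 px wide.
    z = estimate_distance_m(focal_length_px=600.0, real_width_m=0.085, width_px=102.0)
    print(f"distance: {z:.2f} m, tone: {distance_to_frequency(z):.1f} Hz")
```

The focal length in pixels would normally come from a camera calibration step, which is why this mode depends on the calibration material in `cameraCalibration`.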
## Tools used
- [Python v3.10.12](https://docs.python-guide.org/starting/install3/linux/) for programming
- [`opencv_contrib_python`](https://github.com/opencv/opencv-python?tab=readme-ov-file#installation-and-usage) for camera access and frame manipulation
- [`mediapipe`](https://github.com/google-ai-edge/mediapipe) for hand tracking
- [`pyo`](https://github.com/belangeo/pyo) for audio management
- [`pipreqs`](https://github.com/bndr/pipreqs) for listing all requirements
- [`skfuzzy`](https://pythonhosted.org/scikit-fuzzy/) for the fuzzy-set implementation of tone control
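To illustrate the kind of fuzzy-set tone control that `skfuzzy` is used for, here is a dependency-free sketch. Triangular membership functions describe how "closed", "half-open" or "open" the hand is, and a zero-order Sugeno-style weighted average of one representative frequency per set gives the tone. It deliberately does not use the `skfuzzy` API; the openness input, the set boundaries and the frequencies are all hypothetical and may not match the rules in this repository.

```python
def trimf(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, 1 at the peak b (a <= b <= c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over hand openness in [0, 1],
# each paired with a representative output frequency in Hz.
RULES = [
    ((0.0, 0.0, 0.5), 220.0),  # closed hand    -> low tone
    ((0.0, 0.5, 1.0), 440.0),  # half-open hand -> middle tone
    ((0.5, 1.0, 1.0), 880.0),  # open hand      -> high tone
]


def fuzzy_tone(openness: float) -> float:
    """Weighted average of the rule outputs, weighted by membership degree."""
    x = min(max(openness, 0.0), 1.0)
    fired = [(trimf(x, *abc), freq) for abc, freq in RULES]
    total = sum(weight for weight, _ in fired)
    return sum(weight * freq for weight, freq in fired) / total


if __name__ == "__main__":
    for openness in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(openness, fuzzy_tone(openness))
```

Because neighbouring sets overlap, the tone glides between the three anchor frequencies instead of jumping, which is the main appeal of a fuzzy mapping over hard thresholds.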
## Usage of the theremin in a virtual environment
Install `venv`, e.g.:
`sudo apt install python3.10-venv`
Create a virtual environment:
`python3 -m venv test`
Activate it (run from inside the `test` directory; from its parent, use `source test/bin/activate` instead):
`source ./bin/activate`
Open VS Code:
`code .`
Be sure to select the Python interpreter of the virtual environment, e.g.:
`../bin/python3`
Install required dependencies:
`python3 -m pip install -r requirements.txt`
Run main program:
`python3 main.py`
## To run tests
`pytest tests/camera_test.py -v --tb=short --camera 0`
`python3 tests/handtracking_test.py`
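The `--camera` flag in the first command is not a built-in pytest option, so the test suite has to register it. A minimal sketch of how a `conftest.py` could do that with pytest's `pytest_addoption` hook follows. The default value, the help text and the `camera_index` helper are assumptions, and the repository's actual test setup may differ.

```python
# conftest.py (illustrative sketch, not the repository's file)

def pytest_addoption(parser):
    """Register a custom --camera command-line option with pytest."""
    parser.addoption(
        "--camera",
        action="store",
        default="0",
        help="Index of the camera device to open (assumed default: 0)",
    )


def camera_index(config) -> int:
    """Read --camera from a pytest config object and convert it to an int."""
    return int(config.getoption("--camera"))
```

Inside a test, pytest's built-in `request` fixture exposes the same value as `request.config.getoption("--camera")`, which can then be passed to `cv2.VideoCapture`.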
## Other libraries considered but not used
- [pygame](https://www.pygame.org/news)
- [sounddevice](https://python-sounddevice.readthedocs.io/en/0.5.1/)
- [pyaudio](https://people.csail.mit.edu/hubert/pyaudio/)
## Interesting links
- https://mediapipe.readthedocs.io/en/latest/solutions/hands.html
- https://medium.com/@stevehiehn/how-to-generate-music-with-python-the-basics-62e8ea9b99a5
- https://www.youtube.com/watch?v=m_rmwcUREeY
- https://gist.github.com/sahithyen/b20922c902620e5bd6fd926263a93836
- https://splice.com/blog/how-theremin-works/
## References
- Bartelt, T. L. M. (2001). Industrial Control Electronics: Devices, Systems & Applications (2nd ed.). Thomson Delmar Learning (p. 73).
## TODO
- Mobile or web version.
- Build a static HUD-style interface like [this one](https://www.linkedin.com/posts/francastano_visiaejnartificial-inteligenciaartificial-activity-7307811734059151362-2_2m?utm_source=share&utm_medium=member_desktop&rcm=ACoAAFXLzVABTf15btKvx3DmtCu91bAxFIwl-gs).
- Another operating mode: monocular depth-map estimation with deep learning instead of the pinhole approximation I implemented; see [this](https://stackoverflow.com/questions/64685185/is-there-a-way-to-generate-real-time-depthmap-from-single-camera-video-in-python) and [this](https://openaccess.thecvf.com/content_ICCV_2019/html/Godard_Digging_Into_Self-Supervised_Monocular_Depth_Estimation_ICCV_2019_paper.html).
- Use status bars in [this style](https://www.linkedin.com/posts/siddhartha-reddy-054321341_python-opencv-mediapipe-activity-7360493210772598784-o8ko?utm_source=share&utm_medium=member_android&rcm=ACoAAFXLzVABTf15btKvx3DmtCu91bAxFIwl-gs).
- Write another plugin that uses a Kalman filter to track the theremin hand (size, position, ...); see p. 183 of Simon Prince.
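
For the Kalman filter item above, a minimal sketch of what smoothing one tracked hand coordinate could look like is given below. It is a scalar filter with a random-walk motion model, which is simpler than a full position-and-size tracker; the class name and the noise values are assumptions chosen only for illustration.

```python
class ScalarKalman:
    """1D Kalman filter with a random-walk (constant-position) model."""

    def __init__(self, process_var: float = 1e-3, measurement_var: float = 1e-1):
        self.q = process_var      # how much the true value may drift per step
        self.r = measurement_var  # how noisy each measurement is
        self.x = None             # state estimate (set by the first measurement)
        self.p = 1.0              # variance of the estimate

    def update(self, z: float) -> float:
        if self.x is None:
            self.x = z            # initialise from the first measurement
            return self.x
        self.p += self.q                # predict: uncertainty grows
        k = self.p / (self.p + self.r)  # Kalman gain
        self.x += k * (z - self.x)      # correct towards the measurement
        self.p *= (1.0 - k)             # uncertainty shrinks after the update
        return self.x


if __name__ == "__main__":
    kf = ScalarKalman()
    noisy_x = [0.50, 0.53, 0.47, 0.52, 0.49, 0.51]  # made-up normalised x positions
    print([round(kf.update(z), 3) for z in noisy_x])
```

Raising `measurement_var` makes the output smoother but slower to follow real hand motion, which matters for an instrument where latency is audible.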