{"id":21424410,"url":"https://github.com/bilovodskyi/gesture-based-object-control","last_synced_at":"2026-05-21T10:02:29.365Z","repository":{"id":263199620,"uuid":"889254325","full_name":"Bilovodskyi/Gesture-Based-Object-Control","owner":"Bilovodskyi","description":"Control 3D objects using webcam and hands. Python, Machine Learning, React-Three-Fiber project. More info inside.","archived":false,"fork":false,"pushed_at":"2024-11-19T00:27:15.000Z","size":1470,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-23T07:14:05.297Z","etag":null,"topics":["blender","machine-learning","python3","react-three-fiber","websocket"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Bilovodskyi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-15T23:06:38.000Z","updated_at":"2024-11-19T00:27:18.000Z","dependencies_parsed_at":"2024-11-16T23:17:54.455Z","dependency_job_id":"cbceb9da-f48a-43a4-84b6-7abf285b2b64","html_url":"https://github.com/Bilovodskyi/Gesture-Based-Object-Control","commit_stats":null,"previous_names":["bilovodskyi/gesture-based-object-control"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bilovodskyi%2FGesture-Based-Object-Control","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bilovodskyi%2FGesture-Based-Object-Control/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/Bilovodskyi%2FGesture-Based-Object-Control/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bilovodskyi%2FGesture-Based-Object-Control/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Bilovodskyi","download_url":"https://codeload.github.com/Bilovodskyi/Gesture-Based-Object-Control/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243933374,"owners_count":20370988,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blender","machine-learning","python3","react-three-fiber","websocket"],"created_at":"2024-11-22T21:21:55.353Z","updated_at":"2026-05-21T10:02:29.295Z","avatar_url":"https://github.com/Bilovodskyi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🍿 Video\n\nFinal version:\n\n\nhttps://github.com/user-attachments/assets/ffb03f69-9e46-4c35-ad09-14b49d5dfee3\n\n\n# 📚 About\n\nThis project lets you control 3D objects on the frontend with hand gestures captured by a webcam. The frontend is built with **React, react-three-fiber, Blender, and Figma** (the Coke can used in this project comes from my previous project; you can find more details about it [here](https://github.com/Bilovodskyi/3D-coca-cola)). The backend uses **Python**, **machine learning** (to train a model that recognizes gestures from the webcam feed), **OpenCV, and MediaPipe** (to capture hand gestures). 
The frontend and backend are connected via **WebSockets** for real-time data transfer.\n\n# 🛠️ Tech Stack\n\n-   React, React-Three-Fiber, Blender, Figma\n-   Python, OpenCV, MediaPipe\n-   scikit-learn (for machine learning)\n-   WebSockets\n\n# ⚡ Main challenges\n\n## Why machine learning?\n\nMy first approach to rotating the 3D object was to use the MediaPipe library alone (I picked the OK gesture because it feels natural for grabbing objects). The idea was to measure the distance between the index fingertip and the thumb tip. If the distance was less than 0.1 (MediaPipe reports landmark coordinates normalized to the frame, not in pixels), it meant we were showing the OK gesture. This approach worked but had a problem: it also falsely recognized other gestures where these two points were close together.\n\nTo fix this, I used the scikit-learn library, which allows training a model to recognize only specific combinations of points as a separate gesture.\n\nThe first video demonstrates the initial approach and the issues that came with it:\n\n\nhttps://github.com/user-attachments/assets/cf8f2d6c-b1aa-4822-baae-572c780b8eb1\n\n\nThe second video shows the process of collecting samples (4 gestures x 100 samples) for training the model. The first two gestures represent the base case, showing the model how the hand might look when no gesture is being made. Gestures 3 and 4 are used for rotating and changing the position of the 3D object.\n\n\nhttps://github.com/user-attachments/assets/81477491-cb6e-496e-ae08-12aff8c768a2\n\n\n## Move and rotate object issue\n\nAs you can see in the first video, the recognition of gestures is not the only issue. 
Because I was passing the raw `x` and `y` values straight to the frontend, the object jumped back to its initial value whenever I moved my hand with the gesture active, returned to the starting position with the gesture inactive, and then activated the gesture again to rotate the object further.\n\nHere is a picture illustrating how MediaPipe collects `x` and `y` points:\n\u003cimg width=\"953\" alt=\"Screenshot 2024-11-18 at 4 14 32 PM\" src=\"https://github.com/user-attachments/assets/e1b0d0af-6ca9-4180-8c16-36c24ed5f094\"\u003e\n\nFor example, suppose we want the following:\n\n- Start at `x = 0` (the initial position, even if the gesture starts at the center of the screen)  \n- Move along the `x` axis by `0.4`  \n- Return the hands to the initial state  \n- Activate the gesture again and move another `0.4`  \n\nThe expected result is `x = 0.8`.  \n\nHowever, the actual behavior is different:\n\n- Since the gesture starts at the center of the screen, `x = 0.4`  \n- Move along the `x` axis by `0.4` to reach `0.8`  \n- Return the hands to the initial state with a non-active gesture  \n- Activate the gesture again, and it jumps back to `x = 0.4`\n\nTo fix this, I send the frame-to-frame change in the fingertip position instead of its absolute position, and reset the reference point each time the gesture becomes active:\n\n```python\n# Initialized once, before the frame loop\nrotation_gesture_active = False\nposition_gesture_active = False\n\n# Inside the frame loop, once a hand is detected and the model has made a prediction\nindex_tip = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]\ndata = {\"rotation_x\": 0, \"rotation_y\": 0, \"position_x\": 0, \"position_y\": 0}\n\nif int(prediction[0]) == 2:  # label 2 activates rotation\n    if not rotation_gesture_active:\n        # Gesture just started: store the reference point and send no rotation\n        rotation_gesture_active = True\n        prev_real_x = index_tip.x\n        prev_real_y = index_tip.y\n        data[\"rotation_x\"] = 0\n        data[\"rotation_y\"] = 0\n    else:\n        # Gesture continues: send the change since the previous frame\n        real_x = index_tip.x - prev_real_x\n        real_y = index_tip.y - prev_real_y\n        prev_real_x = index_tip.x\n        prev_real_y = index_tip.y\n        data[\"rotation_x\"] = real_x\n        data[\"rotation_y\"] = real_y\nelse:\n    # Any other label ends both gestures\n    rotation_gesture_active = False\n    position_gesture_active = False\n```\n\n# 🔍 Conclusions 
\n\nIt was an interesting project. Since JavaScript is my primary language, it was really nice to play with Python and improve my skills. The future belongs to AI and machine learning, so stepping into this field was worthwhile. Also, using React for the visual part of the project instead of a Python library made it more fun, especially when connecting the two using WebSockets.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbilovodskyi%2Fgesture-based-object-control","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbilovodskyi%2Fgesture-based-object-control","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbilovodskyi%2Fgesture-based-object-control/lists"}