{"id":26621487,"url":"https://github.com/estaheri7/handspeak","last_synced_at":"2025-04-10T10:30:20.448Z","repository":{"id":283411552,"uuid":"932896340","full_name":"Estaheri7/HandSpeak","owner":"Estaheri7","description":"A real-time American Sign Language (ASL) recognition system","archived":false,"fork":false,"pushed_at":"2025-03-20T04:31:22.000Z","size":386,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T05:26:53.478Z","etag":null,"topics":["convolutional-neural-networks","deep-learning","opencv-python","python3","pytorch","resnet","signlanguagerecognition"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Estaheri7.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-14T18:10:44.000Z","updated_at":"2025-03-20T04:31:26.000Z","dependencies_parsed_at":"2025-03-20T05:37:04.691Z","dependency_job_id":null,"html_url":"https://github.com/Estaheri7/HandSpeak","commit_stats":null,"previous_names":["estaheri7/handspeak"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Estaheri7%2FHandSpeak","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Estaheri7%2FHandSpeak/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Estaheri7%2FHandSpeak/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Estaheri7%2FHandSpeak/manifests
","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Estaheri7","download_url":"https://codeload.github.com/Estaheri7/HandSpeak/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248199136,"owners_count":21063641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["convolutional-neural-networks","deep-learning","opencv-python","python3","pytorch","resnet","signlanguagerecognition"],"created_at":"2025-03-24T09:16:38.165Z","updated_at":"2025-04-10T10:30:20.428Z","avatar_url":"https://github.com/Estaheri7.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HandSpeak: Real-Time American Sign Language Recognition System\n\nHandSpeak is a real-time American Sign Language (ASL) recognition system that uses computer vision and deep learning to identify hand gestures and translate them into text. 
The system combines hand landmarks, a ResNet for group classification, and multiple MLP models for specific letter predictions.\n\n---\n\n## Features\n- **Real-Time Gesture Recognition**: Detects and classifies ASL gestures in real time using a webcam.\n- **Hierarchical Classification**: Uses a ResNet model to classify gestures into broad groups and MLP models for specific letter predictions.\n- **Landmark-Based Detection**: Utilizes hand landmarks for robust recognition across varying lighting conditions and backgrounds.\n- **GUI Integration**: Displays recognized letters and words in a user-friendly Tkinter-based interface.\n\n---\n\n## Table of Contents\n- [HandSpeak: Real-Time American Sign Language Recognition System](#handspeak-real-time-american-sign-language-recognition-system)\n  - [Features](#features)\n  - [Table of Contents](#table-of-contents)\n  - [Hand Gestures](#hand-gestures)\n  - [Outcome Video](#outcome-video)\n  - [Project Structure](#project-structure)\n  - [Installation](#installation)\n  - [Usage](#usage)\n  - [Data Preparation](#data-preparation)\n  - [Training the Model](#training-the-model)\n  - [Models](#models)\n    - [ResNet](#resnet)\n    - [MLP](#mlp)\n  - [Labels](#labels)\n  - [Utilities](#utilities)\n    - [`measure_funcs.py`](#measure_funcspy)\n  - [Data Collection Pipeline](#data-collection-pipeline)\n  - [Letter Prediction Pipeline](#letter-prediction-pipeline)\n    - [ResNet Architecture](#resnet-architecture)\n    - [MLPs Architecture](#mlps-architecture)\n  - [Conclusion](#conclusion)\n  - [To-Do](#to-do)\n  - [Contributing](#contributing)\n  - [License](#license)\n\n## Hand Gestures\n\nThe fingerspelling gestures covered in this project:\n\n![ASL](assets/ASL.png)\n\n---\n\n## Outcome Video\n\nWatch the outcome video demonstrating the HandSpeak system in action:\n\nhttps://github.com/user-attachments/assets/25a0ad2d-a39e-4b76-ac6a-d45995f8c5f6\n\n---\n\n## Project Structure\n```\nHandSpeak/\n├── main.py                # Entry point for 
the application\n├── src/\n│   ├── handspeak.py       # Core logic for gesture detection and classification\n│   ├── models/\n│   │   ├── resnet.py      # ResNet model for group classification\n│   │   ├── mlp.py         # MLP models for specific letter prediction\n│   ├── notebooks/\n│   │   ├── prepare_data.ipynb  # Data preparation and collection\n│   │   ├── train_model.ipynb   # Model training scripts\n│   ├── utils/\n│   │   ├── measure_funcs.py    # Utility functions for distance and angle calculations\n│   ├── labels.json        # Encoded labels for gesture groups\n```\n---\n\n## Installation\n\n1. Clone the repository:\n    ```bash\n    git clone https://github.com/Estaheri7/HandSpeak.git\n    cd HandSpeak\n    ```\n\n2. Create a virtual environment and activate it:\n    ```bash\n    python -m venv venv\n    source venv/bin/activate  # On Windows use `venv\\Scripts\\activate`\n    ```\n\n3. Install the required packages:\n    ```bash\n    pip install -r requirements.txt\n    ```\n---\n\n## Usage\n\n1. Prepare the dataset by following the instructions in the [Data Preparation](#data-preparation) section.\n2. Train the model by following the instructions in the [Training the Model](#training-the-model) section.\n3. Run the application:\n    ```bash\n    python main.py\n    ```\n---\n\n## Data Preparation\n\n1. Open the `src/notebooks/prepare_data.ipynb` notebook.\n2. Follow the instructions to create directories for training and testing data.\n3. Collect hand gesture images using OpenCV and process them before saving.\n\n## Training the Model\n\n1. Open the `src/notebooks/train_model.ipynb` notebook.\n2. Follow the instructions to load the dataset, define the model, and train it.\n3. 
Save the trained model weights.\n\n---\n\n## Models\n### ResNet\n- Used for classifying gestures into broad groups (`AMNSTE`, `DFBUVLKRW`, `COPQZX`, `GHYJI`).\n- Implements residual blocks for efficient feature extraction.\n\n### MLP\n- Predicts specific letters within each group.\n- Lightweight and optimized for real-time inference.\n\n---\n\n## Labels\nThe `labels.json` file maps each gesture group to its letters. Within a group, each key is a letter's ASCII code and each value is that letter's class index:\n```json\n{\n     \"amnste\": { \"97\": 0, \"101\": 1, \"109\": 2, \"110\": 3, \"115\": 4, \"116\": 5 },\n     ...\n}\n```\n\n---\n\n## Utilities\n### `measure_funcs.py`\n- **`euclidean_distance`**: Calculates the distance between two points.\n- **`calculate_angle`**: Computes the angle between two points.\n- **`is_above`**: Determines if one point is above another.\n\n---\n\n## Data Collection Pipeline\n\nThe diagram below shows how images are collected. More details are in `src/notebooks/prepare_data.ipynb`.\n\n![Collect work flow](assets/workflow/v2/collect_workflow_diagram.png)\n\n---\n\n## Letter Prediction Pipeline\nThe two outputs of the data collection pipeline, the landmark features and the landmarks drawn onto a pure white image, serve as the inputs for predicting the specific letter.\n\n![Letter prediction pipeline](assets/workflow/v2/predict_workflow_diagram.png)\n\n### ResNet Architecture\n\n![ResNetArch](assets/workflow/v2/ResNet_Architecture.png)\n\nCreated with [PlotNeuralNet](https://github.com/HarisIqbal88/PlotNeuralNet)\n\n### MLPs Architecture\n\n![MLPArch](assets/workflow/v2/MLP_architecture.png)\n\n- First layer (# features) -\u003e 15 neurons\n- Second layer (hidden layer) -\u003e 128 neurons\n- Third layer (hidden layer) -\u003e 64 neurons\n- Fourth layer (hidden layer) -\u003e 32 neurons\n- Final layer (predicted letters) -\u003e depends on the number of letters in the group; for example, the network in this image has 6 neurons in its final layer\n\n---\n\n## Conclusion\n\nThe training process demonstrates that the ResNet model effectively learns the dataset, achieving 
near-perfect accuracy on the test set.\n\nThe following table presents the final loss values for each of the MLP models after training:\n\n| Model Name   | Final Loss |\n|-------------|------------|\n| **amnste**   | 0.3044     |\n| **dfbuvlkrw** | 0.3017     |\n| **ghyji**    | 0.0293     |\n| **copqzx**   | 1.6452     |\n\nYou can find more details in `src/notebooks/train_model.ipynb`.\n\n---\n\n## To-Do\n\n- [x] Optimize the data processing pipeline.\n- [ ] Explore alternative model architectures.\n- [x] Use MediaPipe to keep the focus on the hand by detecting hand landmarks, cropping the hand region, and keeping it the primary subject of the frame.\n- [ ] Improve word transitions between predictions.\n- [ ] Add more and better numeric features to improve the accuracy of each MLP model.\n- [ ] Add an autocorrect system for typed text.\n\n---\n\n## Contributing\nContributions are welcome! Feel free to submit issues or pull requests.\n\n---\n\n## License\nThis project is licensed under the MIT License.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Festaheri7%2Fhandspeak","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Festaheri7%2Fhandspeak","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Festaheri7%2Fhandspeak/lists"}