https://github.com/cyberfantics/form_digitilization
- Host: GitHub
- URL: https://github.com/cyberfantics/form_digitilization
- Owner: cyberfantics
- Created: 2024-12-04T16:23:35.000Z (28 days ago)
- Default Branch: main
- Last Pushed: 2024-12-04T17:15:57.000Z (28 days ago)
- Last Synced: 2024-12-04T18:25:02.279Z (28 days ago)
- Language: Python
- Size: 929 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Form Digitization Using CV2 and AI 🎥🤖
This project leverages OpenCV and the **Gemini 1.5 Flash API** for real-time form digitization. It provides an interactive interface to digitize forms, detect hand gestures for form type selection, and overlay extracted data on a visually appealing custom background.
## Features 🌟
- **Real-Time Video Feed:** Captures live input from the webcam.
- **Hand Gesture Recognition:** Uses finger gestures to select the form type (e.g., Student Card, Challan, General Form).
- **Gemini 1.5 Flash API Integration:** Extracts text from images of forms uploaded by the user.
- **Dynamic Data Display:** Displays extracted data (like name, CNIC, department, etc.) directly on the application interface.
- **Customizable UI:** Implements a styled UI with background overlays and region-specific displays.
- **Video Recording:** Allows recording of sessions for future reference.
- **Multi-Mode Support:** Toggle between instruction view, live capture, and extracted text view.
---
## Prerequisites 📋
Make sure you have the following installed and set up before running the application:
- **Python 3.8+**
- **OpenCV** (`cv2`)
- **cvzone** (for hand gesture detection)
- **Pillow** (for image processing)
- **Gemini 1.5 Flash API** credentials
- A webcam for live video input
---
## Installation 🛠️
1. Clone the repository:
```bash
git clone https://github.com/cyberfantics/form_digitilization.git
cd form_digitilization
```
2. Install the required dependencies:
```bash
pip install -r requirements.txt
```
3. Add your Gemini 1.5 Flash API key to the `extract.py` script.
---
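The README does not show how `extract.py` stores the key or calls the API, so the following is only a minimal sketch of one way step 3 could look. It assumes the `google-generativeai` package is installed and that the key is read from an environment variable named `GEMINI_API_KEY`; the variable name, the function names `load_api_key` and `extract_text`, and the prompt text are all hypothetical and not taken from the repository.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is unset.

    The environment variable name is an assumption, not the repository's.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your Gemini API key")
    return key


def extract_text(image_path: str,
                 prompt: str = "Extract all text fields from this form.") -> str:
    """Send one image to Gemini 1.5 Flash and return the raw text reply."""
    # Imported lazily so load_api_key() works even without these packages.
    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key=load_api_key())
    model = genai.GenerativeModel("gemini-1.5-flash")
    response = model.generate_content([prompt, Image.open(image_path)])
    return response.text
```

Reading the key from the environment keeps it out of version control; if the project instead expects the key as a literal inside `extract.py`, paste it there as step 3 describes.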
## How to Use ▶️
### Run the application:
```bash
python main.py
```
### Follow the On-Screen Instructions 📋
- **Press `p`**: Toggle video recording.
- **Press `i`**: View the instructions screen.
- **Press `c`**: Enter live capture mode.
- **Press `s`**: Send the frame to the Gemini 1.5 Flash API for text extraction.
- **Press `h`**: Activate hand gesture detection for form type selection.
- **Press `q`**: Exit the application.
---
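The key bindings listed under "Follow the On-Screen Instructions" can be pictured as a small state update applied once per frame. This is a sketch under stated assumptions, not the code in `main.py`: the `AppState` fields, the mode names, and the `handle_key` function are hypothetical, and the key code is assumed to come from something like `cv2.waitKey(1) & 0xFF`.

```python
from dataclasses import dataclass


@dataclass
class AppState:
    """Hypothetical application state; field names are illustrative only."""
    mode: str = "instructions"      # "instructions" or "capture"
    recording: bool = False         # toggled by `p`
    gestures_enabled: bool = False  # switched on by `h`
    send_requested: bool = False    # set by `s`, to be cleared after the API call
    running: bool = True            # cleared by `q`


def handle_key(state: AppState, key: int) -> AppState:
    """Apply one key press to the state and return it."""
    if key == ord("p"):
        state.recording = not state.recording
    elif key == ord("i"):
        state.mode = "instructions"
    elif key == ord("c"):
        state.mode = "capture"
    elif key == ord("s"):
        state.send_requested = True
    elif key == ord("h"):
        state.gestures_enabled = True
    elif key == ord("q"):
        state.running = False
    # Any other key code (including "no key pressed") leaves the state unchanged.
    return state
```

Keeping the key handling in a pure function like this makes it possible to test the bindings without opening a webcam window.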
## Hand Gesture Rules ✋
- **Five fingers open**: Select **General Form Mode**.
- **Two fingers open (peace sign)**: Select **Fee Challan Mode**.
- **Five fingers closed (fist)**: Select **Student Card Mode**.
---
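The three gesture rules amount to a lookup on the number of raised fingers. The sketch below assumes the fingers arrive as a five-element list of 0/1 flags, which is the shape cvzone's `HandDetector.fingersUp` returns; the function name `form_mode_from_fingers` and the mode strings are hypothetical, and the repository may implement the check differently.

```python
from typing import List, Optional


def form_mode_from_fingers(fingers: List[int]) -> Optional[str]:
    """Map five up/down finger flags to a form mode, or None if no rule matches."""
    if len(fingers) != 5:
        raise ValueError("expected five finger flags")
    raised = sum(1 for flag in fingers if flag)
    if raised == 5:
        return "general_form"   # five fingers open
    if raised == 2:
        return "fee_challan"    # two fingers open (peace sign)
    if raised == 0:
        return "student_card"   # fist
    return None                 # one, three, or four fingers: no selection
```

This version accepts any two raised fingers as the peace sign. A stricter variant could require the index and middle fingers specifically by comparing against `[0, 1, 1, 0, 0]`.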
## Output 📊
- The extracted data (e.g., **Name**, **CNIC**, **Gender**) is displayed directly on the UI.
- You can view the processed data in live video frames and save the session as a video file.
---
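To show fields such as Name, CNIC, and Gender separately, the API reply has to be split into label/value pairs at some point. The README does not describe the reply format, so this sketch assumes one `Label: value` pair per line; the function name `parse_fields` is hypothetical.

```python
from typing import Dict


def parse_fields(reply: str) -> Dict[str, str]:
    """Split 'Label: value' lines into a dict, skipping lines that do not fit."""
    fields: Dict[str, str] = {}
    for line in reply.splitlines():
        # Split on the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        label = label.strip(" -*\t")   # drop list bullets and bold markers
        value = value.strip(" *\t")
        if sep and label and value:
            fields[label] = value
    return fields
```

Each resulting pair could then be drawn onto its region of the frame, for example with `cv2.putText`.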
## File Structure 📂
- **`main.py`**: Core application logic.
- **`extract.py`**: Contains API integration for text extraction.
- **`resources/`**: Contains images for the UI (e.g., background and instructions).
- **`requirements.txt`**: List of dependencies.
---
## Screenshots 📸
---
## Contributing 🤝
Contributions are welcome! Feel free to submit a pull request or open an issue to suggest improvements.
---
## License 📜
This project is licensed under the **MIT License**.
---
## Author
**Syed Mansoor ul Hassan Bukhari**
[GitHub](https://github.com/cyberfantics) | [LinkedIn](https://www.linkedin.com/in/mansoor-bukhari/)

---
### **Happy Digitizing!** 🎉