https://github.com/iprajwaal/enhanced-vision-assistant

An AI-powered vision assistant for real-time navigation and awareness.

gemini-pro opencv vertex-ai vertex-ai-gemini-api vertexaisprint vision-api

Last synced: about 1 year ago

Host: GitHub
URL: https://github.com/iprajwaal/enhanced-vision-assistant
Owner: iprajwaal
License: apache-2.0
Created: 2025-02-14T12:17:41.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-02-26T09:28:55.000Z (over 1 year ago)
Last Synced: 2025-05-14T10:34:08.906Z (about 1 year ago)
Topics: gemini-pro, opencv, vertex-ai, vertex-ai-gemini-api, vertexaisprint, vision-api
Language: Jupyter Notebook
Homepage:
Size: 6.02 MB
Stars: 6
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# 🚀 Enhanced Vision Assistant

A computer vision application designed to assist **visually impaired individuals** with real-time navigation and situational awareness. The system uses **Vertex AI, Gemini Pro, Google Cloud Vision API, and Google Text-to-Speech API** to **identify objects, evaluate risks, and deliver spoken directions**. It accepts voice commands and describes the scene in natural language.

## 🎥 Demo Video

https://github.com/user-attachments/assets/fe943124-3d3b-4ba1-ab87-324b21469421

## ✨ Features

- **Real-time object detection** and **depth estimation**
- **Intelligent scene analysis** and **risk assessment**
- **Priority-based audio guidance system**
- **Context-aware navigation assistance**
- **Dynamic hazard detection** and **avoidance**
- **Advanced motion tracking** and **trajectory analysis**
- **Voice-activated commands and responses**
- **Natural language scene description**
- **Spatial awareness and proximity alerts**
- **Debug visualization for development purposes**
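
The README does not say how the priority-based guidance ranks what it announces. As a rough illustration only, the sketch below scores each detection by a hazard weight divided by its estimated distance and announces the highest-scoring ones first. The `Detection` type, the `HAZARD_WEIGHT` table, and the scoring formula are all hypothetical and are not taken from the project's code.

```python
import heapq
from dataclasses import dataclass

# Hypothetical hazard weights; the project's real values are not documented.
HAZARD_WEIGHT = {"car": 3.0, "bicycle": 2.0, "person": 1.5, "chair": 1.0}


@dataclass(frozen=True)
class Detection:
    label: str
    distance_m: float  # estimated distance to the object, in metres


def priority(detection: Detection) -> float:
    """Score a detection: heavier hazards and closer objects rank higher."""
    weight = HAZARD_WEIGHT.get(detection.label, 1.0)
    # Clamp the distance so a zero estimate cannot cause a division by zero.
    return weight / max(detection.distance_m, 0.1)


def top_alerts(detections, limit=2):
    """Return spoken-style messages for the `limit` highest-priority detections."""
    ranked = heapq.nlargest(limit, detections, key=priority)
    return [f"{d.label} about {d.distance_m:.1f} metres away" for d in ranked]
```

For example, `top_alerts([Detection("chair", 1.0), Detection("car", 6.0), Detection("person", 0.5)])` puts the nearby person first, then the chair, and drops the distant car.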

## 🛠️ Technologies Used

- **Computer Vision**: OpenCV, Google Cloud Vision API
- **AI/ML**: Google Vertex AI (**Gemini Pro**)
- **Speech Synthesis**: Google Cloud Text-to-Speech
- **Audio Processing**: Pygame
- **Additional Libraries**: NumPy, SciPy

## 📋 Requirements

- 🐍 **Python 3.7+**
- ☁️ **Google Cloud Platform account** with the following APIs enabled:
  - Cloud Vision API
  - Text-to-Speech API
  - Vertex AI API
- 📷 **Webcam** or compatible camera device
- 🎧 **Audio output device**

## 🚀 Installation

1️⃣ **Clone the repository:**
```bash
git clone https://github.com/iprajwaal/enhanced-vision-assistant.git
cd enhanced-vision-assistant
```

2️⃣ **Install required packages:**
```bash
pip install opencv-python pygame google-cloud-vision google-cloud-texttospeech vertexai numpy scipy
```

3️⃣ **Set up Google Cloud credentials:**
- **Create a service account** and download the **JSON key file**
- **Set the path** to your credentials in the `CREDENTIALS_PATH` variable
- **Configure your** Google Cloud **Project ID** in the `PROJECT_ID` variable

## ⚙️ Configuration

Update the following variables in the `EnhancedVisionAssistant` class:

```python
self.PROJECT_ID = 'your-project-id'
self.REGION = 'your-region'
self.CREDENTIALS_PATH = 'path/to/your/credentials.json'
```
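
How the class uses these three values is not shown here. The sketch below is one plausible wiring, under stated assumptions: the key file path is exported through the standard `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which the Google Cloud client libraries read, and the project and region are passed to `vertexai.init`. The `AssistantConfig` dataclass and both helper functions are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AssistantConfig:
    """Hypothetical container mirroring PROJECT_ID, REGION and CREDENTIALS_PATH."""
    project_id: str
    region: str
    credentials_path: str


def apply_credentials(config: AssistantConfig) -> Path:
    """Point the Google Cloud client libraries at the service account key."""
    key_file = Path(config.credentials_path).expanduser()
    if not key_file.is_file():
        raise FileNotFoundError(f"Service account key not found: {key_file}")
    # Application Default Credentials look up this environment variable.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(key_file)
    return key_file


def init_vertex_ai(config: AssistantConfig) -> None:
    """Initialise the Vertex AI SDK for the configured project and region."""
    import vertexai  # imported lazily so the helpers above work without the SDK

    vertexai.init(project=config.project_id, location=config.region)
```

Call `apply_credentials(config)` before creating any Vision or Text-to-Speech client, then `init_vertex_ai(config)` before loading a Gemini model. Failing early on a missing key file gives a clearer error than the authentication failure that would otherwise surface on the first API call.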

## 🎯 How It Works

1️⃣ **Initialize** the camera and audio systems
2️⃣ **Detect** objects in real time
3️⃣ **Analyze** the environment and provide **smart audio guidance**
4️⃣ **Display** a debug window showing detected objects and their priorities

**Press 'q' to quit the application.**
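
The steps above can be sketched as a capture loop. This is a minimal outline, not the project's implementation: `run_loop` and `main` are hypothetical names, the detection and audio steps are reduced to a single `handle_frame` callback, and camera index `0` is an assumption. The OpenCV calls used (`cv2.VideoCapture`, `imshow`, `waitKey`, `destroyAllWindows`) are the standard ones for reading a webcam and handling the 'q' key.

```python
def run_loop(read_frame, handle_frame, quit_requested, max_frames=None):
    """Read frames until the camera fails, the user quits, or max_frames is hit.

    read_frame() returns (ok, frame), matching cv2.VideoCapture.read().
    Returns the number of frames that were handled.
    """
    processed = 0
    while max_frames is None or processed < max_frames:
        ok, frame = read_frame()
        if not ok:
            break  # camera disconnected or stream ended
        handle_frame(frame)  # detection, analysis and audio guidance go here
        processed += 1
        if quit_requested():
            break
    return processed


def main():
    import cv2  # imported lazily so run_loop can be exercised without OpenCV

    capture = cv2.VideoCapture(0)  # assumption: the default webcam
    if not capture.isOpened():
        raise RuntimeError("Could not open the camera")

    def show_debug_window(frame):
        cv2.imshow("Debug view", frame)

    def q_pressed():
        # waitKey also services the GUI event loop, so the window stays responsive.
        return (cv2.waitKey(1) & 0xFF) == ord("q")

    try:
        run_loop(capture.read, show_debug_window, q_pressed)
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Running `main()` opens the debug window and exits when 'q' is pressed. Keeping the loop separate from the OpenCV calls means it can be tested with fake frames and no camera attached.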
