https://github.com/mitesh-kumavat/sightmate
SightMate: AI-Powered Companion for the Visually Impaired.
https://github.com/mitesh-kumavat/sightmate
ai fastapi groq-api nextjs python react sqlite tts-api typescript uvicorn webscraping
Last synced: 2 days ago
JSON representation
SightMate: AI-Powered Companion for the Visually Impaired.
- Host: GitHub
- URL: https://github.com/mitesh-kumavat/sightmate
- Owner: Mitesh-Kumavat
- Created: 2025-04-13T09:03:12.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-08-23T04:50:32.000Z (11 months ago)
- Last Synced: 2025-10-30T13:40:45.962Z (8 months ago)
- Topics: ai, fastapi, groq-api, nextjs, python, react, sqlite, tts-api, typescript, uvicorn, webscraping
- Language: TypeScript
- Homepage: https://sightmate.vercel.app
- Size: 233 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# π **SightMate β Your AI-Powered Companion for the Visually Impaired**
Visually impaired individuals face significant challenges navigating public spaces, accessing information, and staying informed. SightMate solves this by combining vision, audio, and language models into an accessible AI-first platform.
---
## π― Objective
SightMate is an AI-driven, voice-first assistant designed to empower blind and visually impaired individuals. It helps users with:
- Real-time road guidance
- Document & currency reading
- Personalized voice interactions
- Artistic scene understanding
- Daily news summaries
**Approach:**
Blind people deserve a modern, reliable, voice-first experience that goes beyond basic OCR or TTS. SightMate combines powerful LLMs with real-time computer vision and speech processing to truly assist in daily life.
---
## π οΈ Tech Stack
### π¦ Core Stack:
- **Frontend:** Next.js, TailwindCSS, ShadCN, Framer Motion
- **Backend:** FastAPI (Python)
- **Database:** SQLite
- **Hosting:** Vercel (Frontend) + Render (Backend)
- **Groq:** Used for ultra-fast inference with:
- `LLaVA` for image-based understanding
- `Mixtral` for LLM-based Q&A
- `TTS` for expressive voice generation
---
## β¨ Key Features
### π£οΈ Real-Time Scene Monitoring & Road Guidance
- Live camera stream interpreted using LLaVA
- Alerts user with audio feedback about obstacles or road conditions
### π° Daily News Summarizer
- Fetches real-time news
- Summarizes with Mixtral LLM
- Reads out top headlines in seconds
### π Document & Handwriting Reader
- Users show documents to the camera
- Extracted, summarized, and read out loud
### π° Indian Currency Recognition
- Detects INR denominations
- Adds up total and reads out count
### π¨ Artistic Scene Description
- AI describes camera view in poetic or creative style
- Designed to create joyful interaction with the environment
---
## π½οΈ Demo & Deliverables
- π₯ **Demo Video:** *[YouTube](https://youtu.be/tH8MsqGeQG0)*
---
## π§ͺ How to Run the Project
### Requirements
- Python β₯ 3.9
- Node.js β₯ 18.x
- Groq API Key
## π§± Environment Setup
- Copy the `.env.sample` file to `.env` and add your Groq API key.
- Make sure your Groq API key has enough credits to run `playai-tts` model and `meta-llama/llama-4-scout-17b-16e-instruct`
model.
### Clone the repository
```bash
git clone https://github.com/mitesh-kumavat/sightmate
cd sightmate
```
### Backend Setup
- open a terminal and run the following commands:
```bash
pip install -r requirements.txt
# Start FastAPI backend
uvicorn app.main:app --reload
```
### Frontend Setup
- open a new terminal and run the following commands:
```bash
cd frontend
npm install
npm run dev
```
---
## π± Future Scope
- π Smart Glasses Integration (Raspberry Pi or ESP32 Cam)
- π§ Indoor Navigation with Beacons
- π± Android App with voice activation
- π§βπ€βπ§ SOS Caretaker Dashboard
- π Multi-language translation for regional adoption
- π Object Finder (e.g. βFind my keysβ)
---
## π Resources & Acknowledgements
- [Groq API](https://console.groq.com/)
- FastAPI, Uvicorn, SQLModel
---
## π Final Words
> SightMate isnβt just a project β itβs a mission.
> A mission to make the world more inclusive, one intelligent voice at a time.
---