https://github.com/allen-reji/voz-ai

Voz-AI is an AI-powered, RAG-based voice assistant that lets customers ask questions by voice and returns accurate, relevant information. The project was built for the AIQoD Hackathon 2025, where it won the runner-up position.

# Voz-AI: End-to-End Bookstore Voice Assistant with Knowledge Retrieval

Complete Demo Video:
https://youtu.be/kwgYwYUD6BE?si=yhapbnhmFvuNM-nh

This project, built for the **AIQoD Hackathon 2025**, is an **AI-powered voice assistant** that helps bookstore customers ask about books, availability, pricing, stock, and more. Customers call the assistant, ask questions in natural speech, and receive **dynamic, AI-generated responses** grounded in real-time book inventory data through a RAG (Retrieval-Augmented Generation) system. The system leverages:

- **Twilio Programmable Voice** for Speech-to-Text and Text-to-Speech.
- **Ollama with LLaMA 3.1 8B** for natural language understanding and generation.
- **Contextual Memory and Friendly Learning** to improve the user experience.
- **Flask Backend** to handle webhooks and inventory lookup.
- **Ngrok** to expose the locally running Flask application to the public internet so that Twilio can reach the webhook endpoints.

### NOTE: This project currently uses a bookstore inventory (books.csv) as its knowledge base, but the system can be easily adapted to other domains like e-commerce, education, healthcare, or travel by replacing the dataset and fine-tuning the model for domain-specific queries and conversational responses.
---

## ✨ Key Features

### 🗣️ Voice Interface via Twilio
- Customers interact with the assistant entirely via voice.
- Twilio **Speech-to-Text (STT)** converts speech into text.
- Twilio **Text-to-Speech (TTS)** reads responses back to the caller.
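
As a rough illustration of this voice loop, the sketch below builds the kind of TwiML a webhook could return. It uses Python's standard `xml.etree.ElementTree` so it runs without the Twilio SDK; the function name `build_gather_twiml` and the fallback message are assumptions for illustration, and the project's `app.py` may well use the `twilio` helper library instead. `<Gather input="speech">` asks Twilio to transcribe the caller, and the nested `<Say>` is read aloud with TTS.

```python
import xml.etree.ElementTree as ET


def build_gather_twiml(prompt: str, action: str = "/handle-input") -> str:
    """Return TwiML that speaks `prompt` and collects the caller's spoken reply."""
    response = ET.Element("Response")

    # <Gather input="speech"> tells Twilio to transcribe what the caller says
    # and POST the result to the `action` URL.
    gather = ET.SubElement(
        response,
        "Gather",
        {
            "input": "speech",
            "action": action,
            "method": "POST",
            "speechTimeout": "auto",
        },
    )

    # Nesting <Say> inside <Gather> plays the prompt while Twilio listens.
    ET.SubElement(gather, "Say").text = prompt

    # Reached only if the caller stays silent (hypothetical fallback text).
    ET.SubElement(response, "Say").text = "Sorry, I didn't catch that. Goodbye."

    return ET.tostring(response, encoding="unicode")


twiml = build_gather_twiml("Welcome to the bookstore. What would you like to know?")
print(twiml)
```

Twilio sends the transcript to the `action` URL in the `SpeechResult` form field, which is where the backend would pick up the caller's question.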

### 📚 Real-Time Knowledge Retrieval
- Current book inventory is stored in `books.csv`.
- Questions about pricing, availability, reservation, and purchase are answered using this data.

### 🧠 Contextual Memory
- The assistant maintains **short-term memory** during a call session.
- If interrupted, it can recover gracefully and ask the user to repeat or clarify.
- It remembers the current topic within the session (e.g., book being discussed) to provide follow-up information smoothly.
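
One way to picture this short-term memory is a small in-process store keyed by call. The sketch below is an assumption about how it could be done, not the project's actual code: it keys sessions by `CallSid` (a parameter Twilio includes in voice webhook requests) and keeps only the last few turns. The names `get_session`, `remember_turn`, `end_session`, and the `MAX_TURNS` limit are all hypothetical.

```python
from collections import deque

MAX_TURNS = 6  # assumed cap on remembered exchanges per call

# In-memory store: call_sid -> session data. Lost when the process restarts.
_sessions = {}


def get_session(call_sid):
    """Return the session for this call, creating it on first use."""
    if call_sid not in _sessions:
        _sessions[call_sid] = {
            "current_book": None,                # topic being discussed
            "history": deque(maxlen=MAX_TURNS),  # oldest turns drop off
        }
    return _sessions[call_sid]


def remember_turn(call_sid, user_text, reply, book=None):
    """Record one exchange and, if given, update the current topic."""
    session = get_session(call_sid)
    session["history"].append((user_text, reply))
    if book is not None:
        session["current_book"] = book
    return session


def end_session(call_sid):
    """Forget everything about a call once it has ended."""
    _sessions.pop(call_sid, None)


remember_turn("CA123", "Do you have 1984?", "Yes, it is in stock.", book="1984")
remember_turn("CA123", "How much is it?", "It costs $12.99.")
print(get_session("CA123")["current_book"])
```

Because the follow-up question ("How much is it?") does not name a book, the stored `current_book` is what lets the assistant resolve "it".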

### 🚪 Interruption Handling
- If the user interrupts the assistant while it's speaking, the assistant can pause and listen for the new input.
- This enhances the **natural flow** of conversation.

---

## 📂 Project Structure

```
📂 project-root/
├── app.py # Main Flask application
├── requirements.txt # Python dependencies
├── books.csv # Sample book inventory
└── README.md # This documentation
```

---

## ⚙️ Setup Instructions

### 1. Clone the Repository

```bash
git clone https://github.com/allen-reji/Voz-AI
cd Voz-AI
```

### 2. Set up Virtual Environment (recommended)

```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

### 3. Install Dependencies

```bash
pip install -r requirements.txt
```

### 4. Prepare the Inventory Data

The knowledge base is a CSV file of your choice; a sample `books.csv` for a bookstore inventory is provided. To use your own data, create a `books.csv` file in the root folder with the following format:

```csv
book_name,author,genre,quantity_available,price,rating,format,language,pages,discount,bestseller
The Great Gatsby,F. Scott Fitzgerald,Classic,15,9.99,4.5,Paperback,English,180,10,Yes
1984,George Orwell,Dystopian,20,12.99,4.7,Hardcover,English,328,15,Yes
To Kill a Mockingbird,Harper Lee,Classic,18,14.99,4.8,Paperback,English,281,5,Yes
Pride and Prejudice,Jane Austen,Romance,12,11.99,4.6,Hardcover,English,432,10,Yes
```

This is the data the assistant will reference when answering user queries.
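
To make the lookup concrete, here is a minimal retrieval sketch over rows in the format above. It uses the standard `csv` module to stay self-contained, although the project lists `pandas` and may load the file with it. The helpers `load_inventory` and `find_books`, and the simple substring matching, are assumptions for illustration rather than the project's actual retrieval logic.

```python
import csv
import io

# Two rows copied from the sample above; a real run would open books.csv.
SAMPLE_CSV = """\
book_name,author,genre,quantity_available,price,rating,format,language,pages,discount,bestseller
The Great Gatsby,F. Scott Fitzgerald,Classic,15,9.99,4.5,Paperback,English,180,10,Yes
1984,George Orwell,Dystopian,20,12.99,4.7,Hardcover,English,328,15,Yes
"""


def load_inventory(csv_file):
    """Parse inventory rows, converting the numeric columns used in answers."""
    rows = []
    for row in csv.DictReader(csv_file):
        row["quantity_available"] = int(row["quantity_available"])
        row["price"] = float(row["price"])
        rows.append(row)
    return rows


def find_books(question, inventory):
    """Return rows whose title or author appears in the question (case-insensitive)."""
    q = question.lower()
    return [
        row
        for row in inventory
        if row["book_name"].lower() in q or row["author"].lower() in q
    ]


inventory = load_inventory(io.StringIO(SAMPLE_CSV))
matches = find_books("Hi, do you have The Great Gatsby?", inventory)
for book in matches:
    print(book["book_name"], book["price"], book["quantity_available"])
```

Exact substring matching is brittle against speech-recognition errors, so a fuzzier match would likely be needed for real calls.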

---

## 🏃 Running the Application

### Start Flask App
```bash
python app.py
```

The app will run on:
```
http://localhost:5000
```

### Start Ollama
Ensure **Ollama** is running with the required LLaMA model:
```bash
ollama serve
```
You may also need to pull the model if it's not available:
```bash
ollama pull llama3.1
```
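
Once the server is running, a quick way to check that the model responds is to call Ollama's local HTTP API directly. The sketch below assumes the default address `http://localhost:11434` and the `/api/generate` endpoint with streaming disabled; the helper names `build_generate_request` and `ask_ollama` are hypothetical, and the project itself lists the `ollama` Python package, so `app.py` may use that client instead.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3.1"):
    """Build a POST request asking Ollama for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(prompt, model="llama3.1", timeout=60):
    """Send the prompt and return the generated text (requires `ollama serve`)."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]


if __name__ == "__main__":
    # Only works while the Ollama server is running with the model pulled.
    print(ask_ollama("Reply with one short greeting for a bookstore caller."))
```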

---

## 📞 Twilio Integration

### Set Up a Programmable Voice Number
1. Get a free trial **Twilio phone number**.
2. Use **ngrok** to expose your local Flask server if running locally:
```bash
ngrok http 5000
```
3. In the Twilio Console, set the number’s **Voice Webhook URL** to:
```
https://your-ngrok-url/voice
```

---

## 📊 System Flow

1. 📞 **User Calls Twilio Number**
The assistant greets the user and invites them to ask a question about books.

2. 🗣️ **User Asks a Question**
Twilio converts **speech to text** and sends the transcript to `/handle-input`.

3. 📚 **Knowledge Retrieval & AI Response**
- The assistant queries `books.csv` for inventory.
- The query and data are passed to **LLaMA 3.1** via Ollama.
- The AI generates a friendly, natural response.

4. 🔊 **AI Response Read Back**
Twilio’s **TTS** reads the AI-generated response to the caller.

5. 🔄 **Context Management & Follow-ups**
The assistant offers to answer follow-up questions and maintains short-term memory about the current topic.

6. 🚪 **Graceful Interruption Handling**
If the user interrupts while the assistant is speaking, the assistant will stop, listen, and respond to the new query.
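
The retrieval-and-generation step in this flow comes down to combining the caller's question, the matching inventory rows, and the recent conversation into one prompt for the model. The sketch below shows one possible way to assemble it; the function name `build_prompt`, the instruction wording, and the row layout are assumptions, not the prompt the project actually uses.

```python
def build_prompt(question, matches, history):
    """Combine instructions, retrieved rows, and recent turns into one prompt."""
    lines = [
        "You are a friendly bookstore voice assistant.",
        "Answer in one or two short sentences using only the inventory below.",
        "",
        "Inventory:",
    ]

    # Retrieved rows ground the answer in the CSV data.
    if matches:
        for book in matches:
            lines.append(
                f"- {book['book_name']} by {book['author']}: "
                f"${book['price']:.2f}, {book['quantity_available']} in stock"
            )
    else:
        lines.append("- (no matching books found)")

    # Earlier turns let the model resolve follow-ups such as "how much is it?".
    if history:
        lines += ["", "Conversation so far:"]
        for user_text, reply in history:
            lines.append(f"Caller: {user_text}")
            lines.append(f"Assistant: {reply}")

    lines += ["", f"Caller: {question}", "Assistant:"]
    return "\n".join(lines)


sample_match = {
    "book_name": "1984",
    "author": "George Orwell",
    "price": 12.99,
    "quantity_available": 20,
}
print(build_prompt("How much is it?", [sample_match], [("Do you have 1984?", "Yes, we do.")]))
```

Keeping the instructions short and asking for brief answers matters for a voice channel, since long generated replies take a long time to read aloud.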

---

## 🔧 Example Call Flow

```
👤 User: "Hi, do you have The Great Gatsby?"
🤖 Assistant: "Yes! The Great Gatsby is available for $9.99, and we have 15 copies in stock. Would you like me to check another book?"
👤 User: "Actually, do you have 1984?"
🤖 Assistant: "Sure! 1984 is available for $12.99, with 20 copies in stock."
👤 User: (interrupts) "What about Moby Dick?"
🤖 Assistant: "Moby Dick is available for $12.50, and we have 3 copies left. Anything else I can help with?"
```

---

## 📦 Requirements

Dependencies (in `requirements.txt`):
```
flask==2.0.1
twilio==7.16.0
ollama==0.5.13
pandas==2.0.3
numpy==1.21.6
```

---

## 🚀 Future Enhancements

- ✅ Add **multi-language support** using Twilio’s language options.
- ✅ Support multiple voice types to improve the user experience.
- ✅ Deploy to cloud platforms.

---

## 📝 License

This project is licensed under the **MIT License**.

---
## Contact

- Allen Reji - [allenreji@gmail.com](mailto:allenreji@gmail.com)
- Nathania Rachael - [nathaniarachael@gmail.com](mailto:nathaniarachael@gmail.com)
- Kavin Karthik - [kavinkarthivs@gmail.com](mailto:kavinkarthivs@gmail.com)

---
*Developed for AIQoD Hackathon 2025*