https://github.com/muhammadusman-khan/youtube-chatbot

🎙️ An intelligent RAG-based chatbot that learns from any YouTube video you provide, automatically transcribing the audio, embedding the transcript, and answering your questions in context.

ai chatbot embeddings flask gemini llm openai rag rag-chatbot retrieval-augmented-generation youtube



Host: GitHub
URL: https://github.com/muhammadusman-khan/youtube-chatbot
Owner: MuhammadUsman-Khan
Created: 2025-11-02T17:19:27.000Z (8 months ago)
Default Branch: main
Last Pushed: 2025-11-02T17:51:04.000Z (8 months ago)
Last Synced: 2025-11-02T19:19:00.134Z (8 months ago)
Topics: ai, chatbot, embeddings, flask, gemini, llm, openai, rag, rag-chatbot, retrieval-augmented-generation, youtube
Language: Python
Homepage:
Size: 18.6 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# YouTube Chatbot

An intelligent, fully dynamic Retrieval-Augmented Generation (RAG) chatbot that answers questions about any YouTube video you provide.
It uses **OpenAI Whisper** for transcription, **OpenAI embeddings (small)** for semantic search, and **Gemini** for generating responses, all via API keys. No local models are required.

---

## 🚀 Key Features
- **Dynamic YouTube Video Learning**: Enter any video URL and the bot automatically processes it.
- **Audio Conversion & Transcription**: Video is converted to audio, then transcribed using **OpenAI Whisper**.
- **JSON Storage**: Transcriptions are stored in JSON format for structured processing.
- **OpenAI Embeddings**: Transcript chunks are converted into embeddings for semantic retrieval.
- **User Query Matching**: Queries are converted into embeddings and matched using cosine similarity.
- **Gemini API Responses**: Generates context-aware answers based on the relevant transcript; replies with "Information not available in the provided video" if no match exists.
- **Flask Web Interface**: Interactive and user-friendly chat interface.
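
The chunking and JSON storage steps can be illustrated with a minimal sketch. The function names, the `max_words` chunk size, and the `chunk_id`/`text` schema below are assumptions for illustration; the README does not specify how `mp3_to_json.py` structures its output.

```python
import json


def chunk_transcript(text, max_words=50):
    """Split a transcript into chunks of at most max_words words.

    The chunk size and the record layout are hypothetical choices.
    """
    words = text.split()
    return [
        {"chunk_id": i // max_words, "text": " ".join(words[i:i + max_words])}
        for i in range(0, len(words), max_words)
    ]


def chunks_to_json(chunks):
    """Serialize the chunk records to a JSON string."""
    return json.dumps(chunks, ensure_ascii=False, indent=2)


def chunks_from_json(payload):
    """Load chunk records back from a JSON string."""
    return json.loads(payload)
```

Keeping each chunk as a small record makes it straightforward to attach an embedding vector to the same record later.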

---

## 🛠️ Tech Stack
- **Python 3**
- **Flask** for web backend
- **HTML/CSS/JS** for frontend templates
- **OpenAI Whisper API** for transcription
- **OpenAI Embeddings (small)** for vectorization
- **Gemini API** for LLM responses
- **Cosine similarity** for nearest-neighbor search

---

## 📁 Project Structure

```
youtube-chatbot/
├── app.py
├── backend/
│ ├── embeddings.py
│ ├── mp3_to_json.py
│ ├── process_incomings.py
│ └── yt_to_mp3.py
├── templates/
│ └── index.html
├── static/
│ ├── style.css
│ └── script.js
├── requirements.txt
└── README.md
```

---

## 🧭 How It Works
1. User provides a **YouTube URL**.
2. Video is converted to **audio** using `yt_to_mp3.py`.
3. Audio is transcribed via **OpenAI Whisper** using `mp3_to_json.py` and stored as JSON.
4. Transcript chunks are converted into embeddings with `embeddings.py` using OpenAI Embeddings (small).
5. User query is also converted into embeddings.
6. **Cosine similarity** is computed between query and transcript embeddings via `process_incomings.py`.
7. The most relevant transcript chunk is sent to **Gemini API** for a response.
8. If no relevant information exists, the bot replies:
   > "Information not available in the provided video."
9. Response is displayed in the Flask web UI (`index.html`).
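
Steps 5 to 8 can be sketched as follows. This is an illustration in plain Python, not the code in `process_incomings.py`: the function names and the `min_score` threshold are assumptions, and the README does not say how the project decides that no relevant information exists. The repository lists `numpy` and `scikit_learn` in its requirements, so the real implementation may rely on those instead.

```python
import math

FALLBACK = "Information not available in the provided video."


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_chunk(query_vec, chunk_vecs, chunk_texts, min_score=0.3):
    """Return (text, score) for the most similar chunk.

    Returns (None, score) when the best score is below min_score,
    which is a hypothetical cutoff for the fallback reply.
    """
    if not chunk_vecs:
        return None, 0.0
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    idx = max(range(len(scores)), key=scores.__getitem__)
    if scores[idx] < min_score:
        return None, scores[idx]
    return chunk_texts[idx], scores[idx]
```

A caller would pass the returned text to the LLM as context, or reply with `FALLBACK` when the text is `None`.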

---

## 📌 Setup Instructions
1. Clone the repository:
```
git clone https://github.com/MuhammadUsman-Khan/youtube-chatbot.git
cd youtube-chatbot
```
2. Install dependencies:
```
pip install -r requirements.txt
```

3. Set environment variables for API keys:
```
export OPENAI_API_KEY="your_openai_key"
export GEMINI_API_KEY="your_gemini_key"
```

4. Run the Flask app:
```
python app.py
```

5. Open your browser at http://127.0.0.1:5000/ and start chatting with the bot.
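
Before starting the app, it can help to confirm that both keys from step 3 are visible to Python. The helper below is a hypothetical check, not part of the repository; only the two variable names come from the setup instructions above.

```python
import os

REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def missing_keys(env=None):
    """Return the names of required API keys that are unset or empty."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_keys()` with no arguments inspects the current environment; an empty list means both keys are set.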

---

## 📌 requirements.txt

```
Flask==3.1.2
imageio_ffmpeg==0.6.0
joblib==1.5.0
numpy==2.3.4
openai==2.7.1
openai_whisper==20250625
pandas==2.3.3
python-dotenv==1.2.1
Requests==2.32.5
scikit_learn==1.7.2
```

## ✅ Notes & Improvements

- Fully dynamic; no local models required.

- Transcript quality depends on YouTube captions and audio clarity.

- Processing speed can be improved by reducing the time complexity of the pipeline.

- Frontend can be enhanced with chat history, typing indicators, and UI themes.

## 🤝 Contributing

Feel free to fork the repo, submit issues, or create pull requests. Contributions are welcome!

## 🖊️ Author & Developer

#### Muhammad Usman Khan
