https://github.com/shaheennabi/multi-lingual-ai-assistant-with-gtts-and-gemini-pro
Multi-lingual AI Assistant with gTTS & Gemini Pro: an end-to-end AI assistant using gTTS for multi-lingual text-to-speech and the Gemini Pro API for smart responses. Experience seamless voice interaction in various languages with continuous updates and improvements!
- Host: GitHub
- URL: https://github.com/shaheennabi/multi-lingual-ai-assistant-with-gtts-and-gemini-pro
- Owner: shaheennabi
- License: mit
- Created: 2024-11-12T16:08:11.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-12-31T13:58:44.000Z (5 months ago)
- Last Synced: 2024-12-31T14:39:33.873Z (5 months ago)
- Topics: ai, assistant, end-to-end-project, google-generative-ai, gtts, multilingual, speech-recognition, streamlit
- Language: Python
- Homepage:
- Size: 1.36 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# **Multi-lingual AI Assistant with gTTS and Gemini Pro**

* Caution: This is a mini project.

Welcome to the **Multi-lingual AI Assistant**, a voice-driven assistant powered by **Gemini Pro** and **gTTS**! It puts Google's Gemini Pro model at your fingertips, enabling **seamless voice interactions** across multiple languages. Speak your mind, and let the AI do the rest!

Whether you want to ask a question, get a recommendation, or just chat, this assistant is ready to help in **multiple languages**. It takes **voice input**, processes it using **Gemini Pro**, and responds with **text-to-speech** using **gTTS**. Plus, you can **download the speech output** for offline access and share it anytime!

This isn't just a simple assistant: it's an experience!
---
## **Key Features**

- **Multi-Language Support**: Communicate in **multiple languages** with Gemini Pro's robust capabilities, whether you're speaking English, Spanish, French, or many others! The assistant speaks your language.
- **Voice Input**: No typing needed! Use the microphone to speak to your assistant, and it will convert your speech into text using **Speech Recognition**.
- **Text-to-Speech with gTTS**: The assistant converts its generated responses back into speech using **gTTS** (Google Text-to-Speech), a Python library that wraps Google Translate's text-to-speech service. Hear the assistant's voice in your preferred language.
- **Downloadable Speech Output**: After interacting with the assistant, get your generated speech as an **audio file** for offline use!
- **Streamlit UI**: An **easy-to-use web interface** built with **Streamlit** that brings everything together. Interact with the assistant effortlessly.

---
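To make the multi-language feature above concrete, here is a minimal sketch of how a language chosen in the UI could be mapped to the two codes the pipeline needs: a locale such as `en-US` for speech recognition and a short code such as `en` for gTTS. The table, the `codes_for` helper, and the three languages shown are illustrative assumptions, not code taken from this repository.

```python
# Hypothetical lookup table (not from this repository):
# UI label -> (speech-recognition locale, gTTS language code).
LANGUAGE_CODES = {
    "English": ("en-US", "en"),
    "Spanish": ("es-ES", "es"),
    "French": ("fr-FR", "fr"),
}


def codes_for(language):
    """Return (recognition_locale, tts_code) for a UI language label."""
    try:
        return LANGUAGE_CODES[language]
    except KeyError:
        supported = ", ".join(sorted(LANGUAGE_CODES))
        raise ValueError(
            f"Unsupported language {language!r}; choose one of: {supported}"
        ) from None
```

Keeping both codes in one place means adding a language is a one-line change, and an unsupported choice fails early with a readable message instead of deep inside a network call.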
## **Installation Guide**
### Step 1: Create Your Conda Environment
Let's get your environment set up and ready to go! Open your terminal and run:
```bash
conda create --name multilingual-assistant python=3.9
```
Activate the env:
```bash
conda activate multilingual-assistant
```

### Step 2: Install Dependencies
Now, install all the required dependencies using the following command:
```bash
pip install -r requirements.txt
```

Make sure you've got everything you need to make the magic happen!
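One way to confirm the install worked is to check that each dependency can be located before launching the app. The sketch below is not part of this repository; the import names (`gtts`, `google.generativeai`, `streamlit`, `speech_recognition`) are the usual ones for the packages this README lists, but the actual contents of `requirements.txt` may differ.

```python
import importlib.util

# Import names assumed for the dependencies listed in this README;
# adjust them to match what requirements.txt really installs.
REQUIRED_MODULES = ["gtts", "google.generativeai", "streamlit", "speech_recognition"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError is raised when a parent package (e.g. `google`) is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

Run it inside the activated conda environment; an empty result suggests `pip install -r requirements.txt` completed as expected.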
**Dependencies**:
- **gTTS** (Google Text-to-Speech): Converts the assistant's responses into speech.
- **Gemini Pro**: The language model that generates the responses, reached through Google's generative AI Python SDK.
- **Streamlit**: For building the web interface.
- **Speech Recognition**: To convert your voice into text.

---
## **How to Use**
### Step 1: Set Up API Keys for Gemini Pro
To interact with **Gemini Pro**, you'll need to set up API access. Head to **Google Cloud**, create a project, and enable **Gemini Pro**. Store your API key securely and configure it in your environment.
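As a minimal sketch of "configure it in your environment", the snippet below reads the key from an environment variable and fails with a clear message if it is missing. The variable name `GOOGLE_API_KEY`, the helper names, and the `gemini-pro` model identifier are assumptions for illustration; this repository's own code may load the key differently (for example from a `.env` file).

```python
import os

# Assumed variable name; use whatever name your own setup defines.
API_KEY_VAR = "GOOGLE_API_KEY"


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {API_KEY_VAR} environment variable before starting the app."
        )
    return key


def build_model(model_name="gemini-pro"):
    """Configure the SDK and return a model handle.

    Requires the google-generativeai package; the model name is an assumption.
    """
    import google.generativeai as genai  # lazy import: load_api_key works without it

    genai.configure(api_key=load_api_key())
    return genai.GenerativeModel(model_name)
```

Reading the key at startup keeps it out of the source tree, and a missing key is reported before any request is sent.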
### Step 2: Launch the Streamlit Application
Now, it's time to see the magic in action. Run the following command:
```bash
streamlit run app.py
```

This will start the Streamlit app and open the web interface in your browser.
### Step 3: Interact with the Assistant
1. **Record Your Voice**: Click the **Record** button to start speaking.
2. **AI Processing**: The assistant will listen to your speech, convert it to text, and send it to **Gemini Pro** for processing.
3. **Listen to the Response**: The assistant will convert the AI-generated text back into speech using **gTTS** and play it back to you.
4. **Download the Speech**: After hearing the assistant's response, click the download button to save the speech for offline use.

---
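The four interaction steps above can be sketched as a single function that chains speech recognition, the language model, and gTTS. This is an illustrative outline under stated assumptions, not the code in `app.py` or `src/helper.py`: the function names are hypothetical, `audio` is assumed to be a `speech_recognition.AudioData` object, and `generate` is any callable that turns a prompt into reply text.

```python
import io


def transcribe(audio, locale="en-US"):
    """Speech -> text using the SpeechRecognition package's Google Web Speech backend."""
    import speech_recognition as sr  # lazy import: only needed for real audio

    return sr.Recognizer().recognize_google(audio, language=locale)


def synthesize(text, lang="en"):
    """Text -> MP3 bytes using gTTS."""
    from gtts import gTTS  # lazy import: only needed for real synthesis

    buffer = io.BytesIO()
    gTTS(text=text, lang=lang).write_to_fp(buffer)
    return buffer.getvalue()


def run_turn(audio, generate, transcribe_fn=transcribe, synthesize_fn=synthesize):
    """Run one voice turn and return (transcript, reply_text, reply_mp3_bytes)."""
    transcript = transcribe_fn(audio)   # step 2: speech to text
    reply = generate(transcript)        # step 2: text to model reply
    speech = synthesize_fn(reply)       # step 3: reply to audio
    return transcript, reply, speech
```

With a Gemini model handle, `generate` could be something like `lambda prompt: model.generate_content(prompt).text`. In a Streamlit page, the returned bytes could then be played with `st.audio` and offered through `st.download_button` to cover step 4. Passing the stages in as callables also lets you exercise the flow with stubs, without a microphone or network access.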
## **Project Structure**

Here's a look at the project structure:

```
Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro/
│
├── app.py              # Streamlit UI for interaction
├── requirements.txt    # All the necessary dependencies
├── src
│   └── helper.py
└── README.md           # Project documentation (You're looking at it right now!)
```

---
## **Technologies Used**

- **Gemini Pro**: Google's state-of-the-art language model for intelligent AI responses.
- **gTTS (Google Text-to-Speech)**: Converting text to natural-sounding speech using Google's powerful TTS engine.
- **Streamlit**: A super-fast, easy-to-use library for creating web apps with a focus on machine learning.
- **Speech Recognition**: Capturing voice input and converting it to text.
- **Python 3.9**: The Python version keeping everything running smoothly.

---
## **License**
This project is licensed under the **MIT License**. Check the [LICENSE](LICENSE) file for more details.
---
## **Acknowledgments**

A big thank you to the following technologies that made this project possible:

- **Google Gemini Pro**
- **gTTS**
- **Streamlit**
- **Speech Recognition**

---
## **Let's Talk!**

Ready to try it out? Clone the repository, install the dependencies, and fire up your assistant! Let's create something amazing together.
---
## **Stars are Always Welcome!**

If you love the project, **star** it and show some love! Also, feel free to contribute and make this assistant even smarter.