Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/shaheennabi/multi-lingual-ai-assistant-with-gtts-and-gemini-pro
Multi-lingual AI Assistant with gTTS & Gemini Pro: an end-to-end AI assistant using gTTS for multi-lingual text-to-speech and the Gemini Pro API for smart responses. Experience seamless voice interaction in various languages with continuous updates and improvements!
ai assistant end-to-end-project google-generative-ai gtts multilingual speech-recognition streamlit
Last synced: 21 days ago
- Host: GitHub
- URL: https://github.com/shaheennabi/multi-lingual-ai-assistant-with-gtts-and-gemini-pro
- Owner: shaheennabi
- License: mit
- Created: 2024-11-12T16:08:11.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-12-31T13:58:44.000Z (about 1 month ago)
- Last Synced: 2024-12-31T14:39:33.873Z (about 1 month ago)
- Topics: ai, assistant, end-to-end-project, google-generative-ai, gtts, multilingual, speech-recognition, streamlit
- Language: Python
- Homepage:
- Size: 1.36 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# **Multi-lingual AI Assistant with gTTS and Gemini Pro**

* Caution: This is a mini project.

Welcome to the **Multi-lingual AI Assistant**, a voice-driven assistant powered by **Gemini Pro** and **gTTS**! It puts Google's Gemini Pro model at your fingertips, enabling **seamless voice interactions** across multiple languages. Speak your mind, and let the AI do the rest!

Whether you want to ask a question, get a recommendation, or just chat, this assistant is ready to help in **multiple languages**. It takes **voice input**, processes it using **Gemini Pro**, and responds with **text-to-speech** using **gTTS**. Plus, you can **download the speech output** for offline access and share it anytime!

This isn't just a simple assistant; it's an experience!
---
## **Key Features**

- **Multi-Language Support**: Communicate in **multiple languages**, including English, Spanish, French, and many others, backed by Gemini Pro's multilingual capabilities. The assistant speaks your language.
- **Voice Input**: No typing needed! Use the microphone to speak to your assistant, and it will convert your speech into text using **Speech Recognition**.
- **Text-to-Speech with gTTS**: The assistant converts its generated responses back into speech using the **gTTS** (Google Text-to-Speech) library. Hear the assistant's voice in your preferred language.
- **Downloadable Speech Output**: After interacting with the assistant, get the generated speech as an **audio file** for offline use!
- **Streamlit UI**: An **easy-to-use web interface** built with **Streamlit** brings everything together, so you can interact with the assistant effortlessly.

---
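
The text-to-speech and download features above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `LANGUAGE_CODES` table and the helper names `resolve_language` and `synthesize_mp3` are hypothetical, and the real call to gTTS needs the `gTTS` package plus network access.

```python
import io

# Illustrative subset of language codes (an assumption); gTTS accepts many more.
LANGUAGE_CODES = {"english": "en", "spanish": "es", "french": "fr"}


def resolve_language(name: str, default: str = "en") -> str:
    """Map a human-readable language name to a language code, with a fallback."""
    return LANGUAGE_CODES.get(name.strip().lower(), default)


def synthesize_mp3(text: str, language: str = "English") -> bytes:
    """Return MP3 bytes for `text`; requires the gTTS package and network access."""
    from gtts import gTTS  # imported lazily so resolve_language works without it

    buffer = io.BytesIO()
    gTTS(text=text, lang=resolve_language(language)).write_to_fp(buffer)
    return buffer.getvalue()
```

In a Streamlit app, the returned bytes could then be handed to `st.audio(...)` for playback and to `st.download_button(...)` to offer the file for download.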
## **Installation Guide**
### Step 1: Create Your Conda Environment
Let's get your environment set up and ready to go! Open your terminal and run:
```bash
conda create --name multilingual-assistant python=3.9
```
Activate the env:
```bash
conda activate multilingual-assistant
```

### Step 2: Install Dependencies
Now, install all the required dependencies using the following command:
```bash
pip install -r requirements.txt
```

Make sure you've got everything you need to make the magic happen!

**Dependencies**:
- **gTTS** (Google Text-to-Speech): Converts the assistant's responses into speech.
- **Gemini Pro**: The language model behind all the intelligence.
- **Streamlit**: For building the stunning web interface.
- **Speech Recognition**: To convert your voice into text.

---
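
One quick way to confirm the install worked is to check that each dependency can be found on the import path. The sketch below uses only the standard library. The import names in `REQUIRED_MODULES` are assumptions about which packages `requirements.txt` pulls in, and the helper names are hypothetical.

```python
import importlib.util

# Import names assumed for the four dependencies listed above.
REQUIRED_MODULES = {
    "gTTS": "gtts",
    "Gemini Pro SDK": "google.generativeai",
    "Streamlit": "streamlit",
    "Speech Recognition": "speech_recognition",
}


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be located on the import path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. `google`) raises instead of returning None.
        return False


def missing_dependencies(required=REQUIRED_MODULES) -> list:
    """List the display names of dependencies that cannot be located."""
    return [label for label, module in required.items() if not is_installed(module)]
```

Calling `missing_dependencies()` inside the activated Conda environment returns an empty list when everything is in place, or the names that still need installing.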
## **How to Use**
### Step 1: Set Up API Keys for Gemini Pro
To interact with **Gemini Pro**, you'll need to set up API access. Head to **Google Cloud**, create a project, and enable **Gemini Pro**. Store your API key securely and configure it in your environment.
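
A common way to "configure it in your environment" is to read the key from an environment variable at startup. The sketch below assumes a variable named `GOOGLE_API_KEY` and the model name `gemini-pro`. Both are assumptions, as are the helper names, so check `app.py` and `src/helper.py` for what the project actually expects.

```python
import os


def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Read the API key from the environment, failing loudly if it is absent."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before starting the app.")
    return key


def configure_gemini(env_var: str = "GOOGLE_API_KEY"):
    """Configure the SDK and return a model handle; requires google-generativeai."""
    import google.generativeai as genai  # lazy import: only needed at call time

    genai.configure(api_key=load_api_key(env_var))
    return genai.GenerativeModel("gemini-pro")  # model name is an assumption
```

Keeping the key out of the source tree this way also keeps it out of version control.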
### Step 2: Launch the Streamlit Application
Now, it's time to see the magic in action. Run the following command:
```bash
streamlit run app.py
```

This will start the Streamlit app and open the web interface in your browser.
### Step 3: Interact with the Assistant
1. **Record Your Voice**: Click the **Record** button to start speaking.
2. **AI Processing**: The assistant will listen to your speech, convert it to text, and send it to **Gemini Pro** for processing.
3. **Listen to the Response**: The assistant will convert the AI-generated text back into speech using **gTTS** and play it back to you.
4. **Download the Speech**: After hearing the assistant's response, click the download button to save the speech for offline use.

---
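
The four steps above amount to a small pipeline: listen, generate, speak. The sketch below shows one way that flow could be wired together, with each stage passed in as a plain function so it can be swapped or tested on its own. `Turn`, `run_turn`, and `listen_from_microphone` are hypothetical names, not taken from the repository, and the microphone stage assumes the SpeechRecognition package with PyAudio installed.

```python
from typing import Callable, NamedTuple


class Turn(NamedTuple):
    """One round trip: what was heard, what was answered, and the audio."""
    transcript: str
    reply: str
    audio: bytes


def run_turn(
    listen: Callable[[], str],
    generate: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> Turn:
    """Chain the steps: recorded speech -> text -> model reply -> speech audio."""
    transcript = listen().strip()
    if not transcript:
        raise ValueError("No speech was recognized; nothing to send to the model.")
    reply = generate(transcript)
    return Turn(transcript, reply, speak(reply))


def listen_from_microphone(language: str = "en-US") -> str:
    """Hypothetical `listen` stage built on the SpeechRecognition package."""
    import speech_recognition as sr  # lazy import; the microphone also needs PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    return recognizer.recognize_google(audio, language=language)
```

Passing stand-in functions, for example `run_turn(lambda: "hello", str.upper, str.encode)`, exercises the flow without a microphone, an API key, or network access.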
## **Project Structure**

Here's a look at the project structure:

```
Multi-lingual-AI-Assistant-with-gTTS-and-Gemini-Pro/
│
├── app.py                # Streamlit UI for interaction
├── requirements.txt      # All the necessary dependencies
├── src/
│   └── helper.py
└── README.md             # Project documentation (You're looking at it right now!)
```

---
## **Technologies Used**

- **Gemini Pro**: Google's language model for intelligent AI responses.
- **gTTS (Google Text-to-Speech)**: A Python library that converts text to speech through Google Translate's text-to-speech service.
- **Streamlit**: A fast, easy-to-use library for creating web apps, with a focus on machine learning.
- **Speech Recognition**: Capturing voice input and converting it to text.
- **Python 3.9**: The Python version the Conda environment is created with.

---
## **License**
This project is licensed under the **MIT License**. Check the [LICENSE](LICENSE) file for more details.
---
## **Acknowledgments**

A big thank you to the following technologies that made this project possible:
- **Google Gemini Pro**
- **gTTS**
- **Streamlit**
- **Speech Recognition**

---
## **Let's Talk!**

Ready to try it out? Clone the repository, install the dependencies, and fire up your assistant! Let's create something amazing together.
---
## **Stars are Always Welcome!**

If you love the project, **star** it and show some love! Also, feel free to contribute and make this assistant even smarter.