https://github.com/yanivhaliwa/gemini-tts-conversation-generator
ai-powered-tools google-gemini linux-tools text-to-speech voice-synthesis
- Host: GitHub
- URL: https://github.com/yanivhaliwa/gemini-tts-conversation-generator
- Owner: YanivHaliwa
- Created: 2025-05-23T20:18:17.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2025-07-11T18:26:03.000Z (4 months ago)
- Last Synced: 2025-07-11T20:32:26.225Z (4 months ago)
- Topics: ai-powered-tools, google-gemini, linux-tools, text-to-speech, voice-synthesis
- Language: Python
- Size: 8.79 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
# Gemini TTS Conversation Generator
A Python tool that uses Google's Gemini API to generate natural-sounding audio conversations from text scripts. This tool automatically identifies speakers in your script, assigns appropriate voices to each character, and generates audio with realistic multi-speaker dialogue.
## Features
- **Automatic Speaker Detection**: Identifies speaker names directly from your script
- **Voice Assignment**: Intelligently assigns suitable voices to each speaker based on context
- **Multiple Speakers**: Handles conversations between two distinct speakers
- **High-Quality TTS**: Uses Google's Gemini 2.5 Flash Preview TTS for natural-sounding audio
- **Gender Balance**: Automatically selects male and female voices to match character context
- **Error Handling**: Robust error handling for voice selection and speaker detection
## Requirements
- Python 3.9+ (the minimum version supported by the `google-genai` package)
- Google Gen AI Python SDK (`google-genai`)
- Google API key with Gemini API access
## Installation
```bash
# Clone the repository
git clone https://github.com/YanivHaliwa/gemini-tts-conversation-generator.git
cd gemini-tts-conversation-generator
# Install dependencies
pip install google-genai
```
## Usage
1. Set your Gemini API key as an environment variable:
```bash
export GEMINI_API_KEY="your_api_key_here"
```
2. Create a script file containing the dialogue. Each line should start with the speaker's name followed by a colon:
```
John: Hello there, how are you today?
Sarah: I'm doing well, thanks for asking! How about you?
John: Can't complain. The weather is beautiful today.
```
3. Run the script (choose one method):
**Basic Usage:**
```bash
python gemini_2_speakers_tts.py your_script.txt
```
**With Custom Output Filename:**
```bash
python gemini_2_speakers_tts.py your_script.txt -o custom_output_name
```
**Command-line Options:**
- First argument: Path to input script file (required)
- `-o, --output`: Custom output filename (optional, auto-generated from speaker names if not provided)
4. The script will:
- Extract speaker names from your dialogue (e.g., "John" and "Sarah")
- Automatically generate a filename based on speakers (e.g., "john_sarah.wav") if none specified
- Select appropriate voices for each speaker
- Generate natural-sounding audio with proper speaker differentiation
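The first steps above (reading the command-line arguments, detecting speakers, and deriving a default filename) can be sketched with the standard library alone. This is an illustration, not the tool's actual code: the names `extract_speakers`, `default_filename`, and `parse_args` and the regular expression are assumptions, and the real script may detect speakers differently (for example, by asking Gemini).

```python
import argparse
import re
from typing import List, Optional

# Assumed line format: "Name: text". The pattern is illustrative only and
# would misread a dialogue line that begins with some other "word:" prefix.
SPEAKER_LINE = re.compile(r"^\s*([A-Za-z][\w .'-]*?)\s*:\s*\S")


def extract_speakers(script: str) -> List[str]:
    """Return unique speaker names in order of first appearance."""
    speakers: List[str] = []
    for line in script.splitlines():
        match = SPEAKER_LINE.match(line)
        if match and match.group(1) not in speakers:
            speakers.append(match.group(1))
    return speakers


def default_filename(speakers: List[str]) -> str:
    """Build a name such as 'john_sarah.wav' from the speaker names."""
    return "_".join(name.lower().replace(" ", "_") for name in speakers) + ".wav"


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Mirror the documented CLI: one positional script path plus -o/--output."""
    parser = argparse.ArgumentParser(description="Two-speaker TTS (sketch)")
    parser.add_argument("script", help="path to the input script file")
    parser.add_argument("-o", "--output", help="custom output filename")
    return parser.parse_args(argv)


sample = """John: Hello there, how are you today?
Sarah: I'm doing well, thanks for asking! How about you?
John: Can't complain. The weather is beautiful today."""

names = extract_speakers(sample)
print(names, default_filename(names))
```

With the sample dialogue from step 2, this prints `['John', 'Sarah'] john_sarah.wav`.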
## Available Voices
The script uses Gemini's prebuilt TTS voices, which cover a wide range of styles. A partial list:
| Voice Name | Style | Gender |
| -------------------- | ----------- | ------ |
| Zephyr | Bright | Female |
| Puck | Upbeat | Male |
| Charon | Informative | Male |
| Kore | Firm | Female |
| Fenrir | Excitable | Male |
| Leda | Youthful | Female |
| *and many more...* | | |
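The voice names in this table are what a request to the TTS model refers to. The sketch below shows one way a two-speaker request might be built with the `google-genai` SDK, following the pattern in Google's speech-generation documentation. It is not taken from this repository. The helper names `synthesize` and `save_wav` are hypothetical. The model ID `gemini-2.5-flash-preview-tts` is an assumption based on the feature list and may change while the model is in preview. The WAV parameters (mono, 16-bit, 24 kHz) follow Google's published example.

```python
import os
import wave
from typing import Dict


def save_wav(path: str, pcm: bytes, rate: int = 24000) -> None:
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 16-bit samples
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm)


def synthesize(script: str, voices: Dict[str, str]) -> bytes:
    """Request two-speaker audio; `voices` maps speaker name to voice name."""
    # Imported lazily so save_wav stays usable without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    speaker_configs = [
        types.SpeakerVoiceConfig(
            speaker=speaker,
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name=voice)
            ),
        )
        for speaker, voice in voices.items()
    ]
    response = client.models.generate_content(
        model="gemini-2.5-flash-preview-tts",  # assumed model ID
        contents=script,
        config=types.GenerateContentConfig(
            response_modalities=["AUDIO"],
            speech_config=types.SpeechConfig(
                multi_speaker_voice_config=types.MultiSpeakerVoiceConfig(
                    speaker_voice_configs=speaker_configs
                )
            ),
        ),
    )
    # The audio comes back as raw PCM bytes in the first response part.
    return response.candidates[0].content.parts[0].inline_data.data
```

A call such as `save_wav("alice_bob.wav", synthesize(script_text, {"Alice": "Leda", "Bob": "Puck"}))` would then write the conversation to disk, provided `GEMINI_API_KEY` is set and the speaker names match those used in the script.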
## Example
Input script (`conversation.txt`):
```
Alice: Hi Bob, have you heard about the new AI tools from Google?
Bob: Yes, the Gemini models are quite impressive! Have you tried them?
Alice: I'm using one right now to generate this conversation.
Bob: That's amazing! The voices sound so natural.
```
**Command:**
```bash
python gemini_2_speakers_tts.py conversation.txt
```
**Output:**
```
🎧 Welcome to the Gemini Audio Generator CLI!
📁 Generating smart filename based on speakers: alice_bob
🔄 Extracting speakers from script...
👥 Detected speakers: Alice and Bob
🔄 Selecting best voices for each speaker...
🎙️ Selected voices:
Alice: Leda
Bob: Puck
🎤 Generating conversation audio...
✅ File saved to: alice_bob.wav
```
Check out the `examples` folder for more sample conversation scripts you can use with this tool. They demonstrate different conversation styles and emotional tones that work well with voice synthesis.
## License
MIT License
## Author
Created by [Yaniv Haliwa](https://github.com/YanivHaliwa) for security testing and educational purposes.