https://github.com/creature-112/gemini-live
This project enables real-time streaming of audio (and optionally video or screen captures) from your local device to Google Gemini using the Live API. It allows you to interact with Gemini through both text and voice, supporting conversational AI responses.
- Host: GitHub
- URL: https://github.com/creature-112/gemini-live
- Owner: Creature-112
- Created: 2025-04-22T03:06:40.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-06-26T13:26:32.000Z (7 months ago)
- Last Synced: 2025-06-26T14:30:05.828Z (7 months ago)
- Topics: compose-wasm, compose-web, composer-library, flask, function-calling, gemini-client, google-ai, google-api, llm, llm-apps, llm-chat-interface, python-generative-ai, realtime-video-processor, voice-to-code
- Language: Python
- Size: 5.86 KB
- Stars: 5
- Watchers: 1
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
# 🎥 Gemini Live: Real-Time Streaming to Google Gemini

Welcome to the **Gemini Live** repository! This project streams audio, and optionally video or screen captures, from your local device to Google Gemini in real time using the Live API. You can interact with Gemini by text or voice and receive conversational AI responses.
## 🚀 Features
- **Real-Time Audio Streaming**: Stream audio directly from your device to Google Gemini.
- **Video and Screen Capture Support**: Optionally include video or screen captures in your streams.
- **Conversational AI**: Interact with Gemini using both text and voice input.
- **Easy Setup**: Simple installation and setup process for quick access.
- **Cross-Platform**: Works on various operating systems, including Windows, macOS, and Linux.
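Real-time audio streaming generally means cutting captured audio into short fixed-duration chunks and sending each one as its own message. The sketch below illustrates only that chunking step, using the standard library. It assumes 16-bit mono PCM at 16 kHz, and the JSON envelope built by `to_message` is a hypothetical shape for illustration, not the official Live API schema; check the Live API reference for the real message format.

```python
import base64
import json
from typing import Iterator

# Assumed input format: 16-bit mono PCM at 16 kHz (verify against the Live API docs).
SAMPLE_RATE_HZ = 16_000
SAMPLE_WIDTH_BYTES = 2
CHANNELS = 1


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes of PCM audio that cover chunk_ms milliseconds."""
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS * chunk_ms // 1000


def iter_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of a PCM buffer.

    The last slice may be shorter than the others.
    """
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


def to_message(chunk: bytes) -> str:
    """Wrap one chunk in a hypothetical JSON envelope (not the official schema)."""
    return json.dumps({
        "mime_type": f"audio/pcm;rate={SAMPLE_RATE_HZ}",
        "data": base64.b64encode(chunk).decode("ascii"),
    })


if __name__ == "__main__":
    one_second_of_silence = bytes(chunk_size_bytes(1000))
    messages = [to_message(c) for c in iter_chunks(one_second_of_silence)]
    print(len(messages), "messages,", chunk_size_bytes(100), "bytes of audio each")
```

Shorter chunks lower latency but increase the number of messages per second; 100 ms is an arbitrary starting point here, not a value taken from this project.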
## 📦 Installation
To get started with Gemini Live, download the latest release from the [Releases](https://github.com/Creature-112/gemini-live/releases) section. Choose the package for your operating system and follow the installation instructions.
### Prerequisites
- Python 3.7 or higher
- Required libraries (see below)
### Required Libraries
Install the following libraries:
```bash
pip install requests websocket-client opencv-python pyaudio
```
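Two of these packages are imported under a different name than the one passed to `pip` (`websocket-client` is imported as `websocket`, and `opencv-python` as `cv2`), which makes a failed install easy to misread. As a minimal sketch, a standard-library check like the following reports which packages are still missing before you run the application. The helper name `missing_packages` is illustrative and not part of this project.

```python
import importlib.util
from typing import Dict, List

# Maps each pip package name to the module name used in `import` statements.
REQUIRED_MODULES: Dict[str, str] = {
    "requests": "requests",
    "websocket-client": "websocket",
    "opencv-python": "cv2",
    "pyaudio": "pyaudio",
}


def missing_packages(required: Dict[str, str] = REQUIRED_MODULES) -> List[str]:
    """Return the pip names of packages whose module cannot be located."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```

The check only locates the modules without importing them, so it does not touch audio devices or cameras.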
## 🔧 Usage
1. **Start the Application**: Run the main script to initiate the streaming process.
```bash
python main.py
```
2. **Configure Settings**: Edit the configuration file (`config.json`) to set your preferences, including audio and video settings.
3. **Begin Streaming**: Once configured, you can start streaming to Google Gemini.
4. **Interact with Gemini**: Use voice commands or text inputs to engage with the AI.
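The configuration step can be pictured as loading `config.json` and overlaying it on a set of defaults, so that a partial file still yields complete settings. The sketch below shows one way to do that with the standard library. Every key in `DEFAULTS` is a hypothetical placeholder, since this README does not document the actual schema of `config.json`.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict

# Hypothetical defaults; the real keys in config.json may differ.
DEFAULTS: Dict[str, Any] = {
    "audio": {"sample_rate": 16000, "channels": 1},
    "video": {"enabled": False, "source": "camera"},
}


def merge(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively overlay overrides on a deep copy of defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path: str = "config.json") -> Dict[str, Any]:
    """Load settings from path, falling back to DEFAULTS if the file is absent."""
    config_path = Path(path)
    if not config_path.exists():
        return merge(DEFAULTS, {})
    with config_path.open(encoding="utf-8") as handle:
        return merge(DEFAULTS, json.load(handle))


if __name__ == "__main__":
    print(json.dumps(load_config(), indent=2))
```

Merging nested sections key by key means a file that sets only `video.enabled` leaves the remaining video and audio settings at their defaults.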
## 📈 Project Structure
```
gemini-live/
│
├── main.py # Main application script
├── config.json # Configuration file for settings
├── requirements.txt # Required Python libraries
├── README.md # Project documentation
└── assets/ # Additional assets (images, etc.)
```
## 🌐 Topics
This project covers various topics related to AI and real-time processing:
- gemini
- gemini-2-0-flash
- gemini-2-0-flash-live
- gemini-ai
- gemini-api
- gemini-flash
- google-genai
- google-generative-ai
- live
- live-video-processing
- python
- python-generative-ai
- realtime
- realtime-video-processor
- video-analysis
## 📖 Documentation
For detailed documentation on using Gemini Live, refer to the [Wiki](https://github.com/Creature-112/gemini-live/wiki). It covers advanced features and troubleshooting tips.
## 🤝 Contributing
We welcome contributions! If you want to help improve Gemini Live, please follow these steps:
1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Make your changes and commit them.
4. Push your branch to your forked repository.
5. Create a pull request.
## 📄 License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## 💬 Support
If you encounter any issues or have questions, feel free to open an issue in the repository. You can also check the [Releases](https://github.com/Creature-112/gemini-live/releases) section for updates and bug fixes.
## 📅 Roadmap
- **Q1 2024**: Implement additional features for enhanced video processing.
- **Q2 2024**: Expand support for more audio formats.
- **Q3 2024**: Integrate machine learning capabilities for improved AI interactions.
## 🎉 Acknowledgments
- Thanks to the Google Gemini team for providing the Live API.
- Special thanks to all contributors and users who provide feedback.
For more information, visit the [Releases](https://github.com/Creature-112/gemini-live/releases) section.
## 🛠️ Tools Used
- Python
- OpenCV
- WebSocket
- PyAudio
## 📣 Community
Join our community on Discord or follow us on Twitter for updates and discussions. Your feedback is valuable and helps us improve.
---
Thank you for your interest in Gemini Live! We look forward to seeing how you use this project to enhance your interactions with Google Gemini. Happy streaming!