https://github.com/abhrankan-chakrabarti/geminifusion
A versatile web application that leverages advanced AI models, including Gemini Flash, DALL-E 3, and Stable Diffusion XL, to provide three main features: Chatbot Interaction, Image Captioning, and Text-to-Image Generation.
ai-chatbot ai-integration dall-e-3 gemini-pro gemini-pro-vision generative-ai image-captioning multimodal-processing stable-diffusion-xl text-and-vision
- Host: GitHub
- URL: https://github.com/abhrankan-chakrabarti/geminifusion
- Owner: Abhrankan-Chakrabarti
- License: MIT
- Created: 2024-08-17T15:13:09.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-04T15:56:26.000Z (6 months ago)
- Last Synced: 2025-06-17T07:44:42.842Z (5 months ago)
- Topics: ai-chatbot, ai-integration, dall-e-3, gemini-pro, gemini-pro-vision, generative-ai, image-captioning, multimodal-processing, stable-diffusion-xl, text-and-vision
- Language: Python
- Homepage: https://abhrankan.streamlit.app/
- Size: 43 KB
- Stars: 4
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# VisionaryAI (formerly GeminiFusion)
**VisionaryAI** is a versatile web application that leverages advanced AI models, including Gemini Flash, DALL-E 3, and Stable Diffusion XL, to provide three main features: Chatbot Interaction, Image Captioning, and Text-to-Image Generation.
## Features
- **ChatBot:** Engage in real-time conversations with the AI, powered by the Gemini Flash model.
- **Image Captioning:** Generate descriptive captions for your images using the Gemini Flash model.
- **Text to Image:** Generate images using either DALL-E 3 or Stable Diffusion XL.
## Installation
1. **Clone the repository:**
```bash
git clone https://github.com/Abhrankan-Chakrabarti/GeminiFusion.git
cd GeminiFusion
```
2. **Create a virtual environment (optional but recommended):**
```bash
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
```
3. **Install dependencies:**
```bash
pip install -r requirements.txt
```
4. **Set up environment variables:**
- Create a `.env` file in the root directory.
- Add your Google API key:
```
api_key=YOUR_GOOGLE_API_KEY
```
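As a minimal sketch of how the `api_key` entry from step 4 could be read at startup, the snippet below parses a `.env`-style file using only the standard library. The helper names `load_env_file` and `get_api_key` are hypothetical and are not taken from this repository; the app itself may load the key differently (for example through a package such as `python-dotenv`).

```python
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an "=" separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Return the `api_key` entry, or raise a clear error if it is missing."""
    api_key = load_env_file(path).get("api_key")
    if not api_key:
        raise RuntimeError(f"api_key is not set in {path}")
    return api_key
```

Calling `get_api_key()` from the project root fails early with a readable error when the `.env` file or the `api_key` entry is missing, which is usually easier to diagnose than an authentication failure later on.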
## Usage
1. **Run the application:**
```bash
streamlit run app.py
```
2. **Use the features:**
- **ChatBot:** Navigate to the ChatBot section to start a conversation with the AI.
- **Image Captioning:** Upload an image and enter a prompt to generate a caption.
- **Text to Image:** Enter a text prompt to generate images using either DALL-E 3 or Stable Diffusion XL.
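The choice between DALL-E 3 and Stable Diffusion XL in the Text to Image feature can be pictured as a small dispatch table. The sketch below is illustrative only: `generate_with_dalle3`, `generate_with_sdxl`, `BACKENDS`, and `generate_image` are hypothetical names, and the two backend functions are placeholders that return a string instead of calling any real image service. It does not reflect how `app.py` is actually written.

```python
def generate_with_dalle3(prompt):
    # Hypothetical placeholder: a real version would call a DALL-E 3 service.
    return f"[DALL-E 3 image for: {prompt}]"


def generate_with_sdxl(prompt):
    # Hypothetical placeholder: a real version would call a Stable Diffusion XL service.
    return f"[Stable Diffusion XL image for: {prompt}]"


# Map the model name shown to the user onto the function that handles it.
BACKENDS = {
    "DALL-E 3": generate_with_dalle3,
    "Stable Diffusion XL": generate_with_sdxl,
}


def generate_image(prompt, model="DALL-E 3"):
    """Validate the input, then route the prompt to the selected backend."""
    prompt = prompt.strip()
    if not prompt:
        raise ValueError("prompt must not be empty")
    if model not in BACKENDS:
        raise ValueError(f"unknown model {model!r}; choose one of {sorted(BACKENDS)}")
    return BACKENDS[model](prompt)
```

Keeping the backends in a dictionary means another model can be added with one extra entry, without touching the validation logic.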
## Technology Stack
- **Python**
- **Streamlit**
- **Google Gemini Flash**
- **DALL-E 3**
- **Stable Diffusion XL**
## Contributing
We welcome contributions! Please see our [contribution guidelines](CONTRIBUTING.md) for more information.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.