https://github.com/alpha-131/fictive-frames
Fictive Frames is a text-to-image synthesis model that combines variational autoencoders, transformers, and a U-Net architecture. It encodes textual descriptions into a latent space and generates images from them, with an emphasis on parameter efficiency. The approach aims to streamline content creation across industries.
diffusion-models transformers vae
- Host: GitHub
- URL: https://github.com/alpha-131/fictive-frames
- Owner: Alpha-131
- Created: 2024-01-29T18:56:30.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2024-01-29T19:03:39.000Z (about 2 years ago)
- Last Synced: 2025-01-18T23:53:32.081Z (about 1 year ago)
- Topics: diffusion-models, transformers, vae
- Homepage:
- Size: 5.86 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Fictive Frames: Text-to-Image Generation
* NOTE: This project is still a work in progress. We will update the codebase soon.
## Overview
This repository hosts the code for our IPD project, Fictive Frames, focused on text-to-image generation. Our goal is to develop an efficient model that translates textual descriptions into visually compelling images.
## Key Features
- **Innovative Approach**: We combine variational autoencoders (VAE), text embeddings, transformers, and a U-Net architecture for image generation.
- **Simplicity & Efficiency**: Emphasis on minimizing parameters while maintaining image quality for faster training and inference.
- **Streamlined Solution**: Designed to facilitate content creation by automating image production from natural language inputs.
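
The codebase is not published yet, so the following is only a toy sketch of how the components listed above usually fit together in a latent-diffusion-style pipeline. It is not the project's implementation. Every name and constant here (`embed_text`, `predict_noise`, `decode_latent`, `generate`, `LATENT_DIM`, `EMBED_DIM`, `NUM_STEPS`) is hypothetical, and each function is a simple stand-in for a trained network. It assumes the text encoder conditions a U-Net that denoises a latent, which a VAE decoder then maps to pixels.

```python
import math
import random

# Hypothetical sizes, chosen only to keep the sketch small.
LATENT_DIM = 8   # a real VAE latent would be a 2-D feature map
EMBED_DIM = 8    # kept equal to LATENT_DIM so the toy denoiser can compare them
NUM_STEPS = 10   # number of denoising iterations


def embed_text(prompt: str) -> list[float]:
    """Stand-in for a transformer text encoder: prompt -> unit-length vector."""
    vec = [0.0] * EMBED_DIM
    for i, ch in enumerate(prompt.lower()):
        vec[i % EMBED_DIM] += ord(ch) / 1000.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def predict_noise(latent: list[float], text_embedding: list[float]) -> list[float]:
    """Stand-in for a U-Net: estimates the noise in the latent given the text.

    Toy rule (an assumption, not a trained model): the text embedding is
    treated as the clean latent, so the noise is whatever separates them.
    """
    return [z - e for z, e in zip(latent, text_embedding)]


def decode_latent(latent: list[float]) -> list[float]:
    """Stand-in for a VAE decoder: squashes latent values into [0, 1] 'pixels'."""
    return [1.0 / (1.0 + math.exp(-z)) for z in latent]


def generate(prompt: str, seed: int = 0) -> list[float]:
    """Run the toy pipeline: encode text, denoise a random latent, decode."""
    rng = random.Random(seed)
    text_embedding = embed_text(prompt)
    latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    for _ in range(NUM_STEPS):
        noise = predict_noise(latent, text_embedding)
        # Remove half of the estimated noise at each step.
        latent = [z - 0.5 * n for z, n in zip(latent, noise)]
    return decode_latent(latent)


if __name__ == "__main__":
    pixels = generate("a red bird on a branch")
    print([round(p, 3) for p in pixels])
```

The denoising loop runs entirely in the small latent vector, and only the final decode step produces pixel values. In a real model, working in a compact latent like this is one common way to reduce compute compared with denoising at full image resolution.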
## Project Objectives
1. **Comprehensive Understanding**: Survey existing models in text-to-image generation.
2. **Dataset Compilation**: Gather a diverse dataset of textual descriptions and corresponding images.
3. **Model Development**: Propose an innovative model integrating computer vision and NLP techniques.
4. **Performance Validation**: Validate and optimize the model's performance.
5. **Evaluation Metrics**: Establish unbiased metrics for rigorous assessment.
6. **User Studies**: Conduct user studies to gauge effectiveness and user experience.
## Motivation
Driven by the increasing demand for multimedia content, our project aims to streamline the image creation process, reducing time and resources while enhancing engagement and creativity.
## Contributors
- Varun Pillai (https://github.com/Alpha-131)
- Rachit Patni (https://github.com/rachit901109)
- Manav Sangoi (https://github.com/ManavSangoi)
- Hastansh Pandit (https://github.com/Hastansh12)
Feel free to explore the codebase and contribute to advancing text-to-image generation in artificial intelligence! 🚀