Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rafaykhattak/captionify
With Captionify, users can upload an image or enter an image URL to generate a descriptive caption that accurately describes the contents of the image.
deep-neural-networks encoder-decoder-model gpt-2 transfer-learning transformer vision-transformer
- Host: GitHub
- URL: https://github.com/rafaykhattak/captionify
- Owner: RafayKhattak
- Created: 2023-05-04T12:07:07.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-04T14:17:11.000Z (almost 2 years ago)
- Last Synced: 2024-11-13T02:32:20.734Z (3 months ago)
- Topics: deep-neural-networks, encoder-decoder-model, gpt-2, transfer-learning, transformer, vision-transformer
- Language: Python
- Homepage: https://rafaykhattak-captionify-app-btvva7.streamlit.app/
- Size: 16.6 KB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
Awesome Lists containing this project
README
# Captionify
Captionify is a web application that generates a descriptive caption for an image using an encoder-decoder architecture. The application uses a pre-trained Transformer-based vision model (ViT) as an encoder and a pre-trained language model (GPT2) as a decoder to generate highly accurate captions for uploaded images or image URLs.
![Screenshot (437)](https://user-images.githubusercontent.com/90026724/236234966-71693df3-f3a7-45f2-8a90-109c85315d6e.png)
## Usage
To use Captionify, upload an image or enter an image URL in the web interface. The pre-trained models then generate a caption describing the contents of the image.
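Under the hood, both input paths reduce to the same step: producing a PIL image to hand to the captioning model. Below is a minimal sketch of that step using the requests and Pillow dependencies listed later in this README; the `load_image` helper is illustrative and not code taken from this repository.
```
# Illustrative helper (not repository code): turn either an uploaded file
# or an image URL into a PIL image ready for captioning.
from io import BytesIO

import requests
from PIL import Image

def load_image(uploaded_file=None, image_url=None):
    """Return an RGB PIL image from a file-like upload or an image URL."""
    if uploaded_file is not None:
        return Image.open(uploaded_file).convert("RGB")
    response = requests.get(image_url, timeout=10)
    response.raise_for_status()
    return Image.open(BytesIO(response.content)).convert("RGB")
```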
## Getting Started
To install Captionify, simply clone this repository and install the necessary dependencies using pip:
```
git clone https://github.com/rafaykhattak/captionify.git
cd captionify
```
Install the required dependencies using the following command:
```
pip install -r requirements.txt
```
Then run the app.py file using the following command:
```
streamlit run app.py
```
This will launch the application on your local machine. You can then upload an image or enter an image URL to generate a descriptive caption.
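For orientation, a Streamlit entry point for this kind of app typically looks like the sketch below. The widget layout and the nlpconnect/vit-gpt2-image-captioning checkpoint are assumptions for illustration, not necessarily what app.py actually contains.
```
# Assumed app skeleton (not app.py verbatim): upload or URL in, caption out.
import streamlit as st
from PIL import Image
from transformers import pipeline

@st.cache_resource
def get_captioner():
    # A public ViT encoder / GPT2 decoder checkpoint; the repository may use another.
    return pipeline("image-to-text", model="nlpconnect/vit-gpt2-image-captioning")

st.title("Captionify")
uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
url = st.text_input("...or enter an image URL")

if uploaded or url:
    # The image-to-text pipeline accepts either a PIL image or a URL string.
    image = Image.open(uploaded).convert("RGB") if uploaded else url
    st.image(image)
    st.write(get_captioner()(image)[0]["generated_text"])
```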
## Architecture
Captionify uses an encoder-decoder architecture to generate captions for images. The encoder is a pre-trained Transformer-based vision model (ViT) that encodes the input image into a sequence of feature vectors. The decoder is a pre-trained language model (GPT2) that generates a descriptive caption for the image based on the encoded features.

![vit_architecture](https://user-images.githubusercontent.com/90026724/236233200-745dae6a-569f-4558-9a12-3a56b0b8a872.jpg)
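In the Hugging Face Transformers library, this ViT encoder / GPT2 decoder pairing is commonly expressed with the VisionEncoderDecoderModel class. The sketch below shows the encode-then-generate step; the checkpoint name, the `generate_caption` helper, and the example image path are assumptions for illustration rather than this repository's exact code.
```
# Sketch of the encoder-decoder captioning step described above.
# The checkpoint is a public ViT+GPT2 model and is assumed, not confirmed,
# to match the one this repository loads.
import torch
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

checkpoint = "nlpconnect/vit-gpt2-image-captioning"
model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
processor = ViTImageProcessor.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def generate_caption(image: Image.Image) -> str:
    # ViT encoder: convert the image into a sequence of patch embeddings.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    # GPT2 decoder: autoregressively generate the caption from those embeddings.
    with torch.no_grad():
        output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_caption(Image.open("example.jpg").convert("RGB")))  # hypothetical path
```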
## Dependencies
- streamlit
- requests
- Pillow
- transformers
- torch
## References
- This project is based on the Encoder-Decoder architecture and uses pre-trained models from the Hugging Face Transformers library.
- The application was developed using Streamlit, an open-source app framework for Machine Learning and Data Science projects.