https://github.com/dk3775/g26-img-captioning
- Host: GitHub
- URL: https://github.com/dk3775/g26-img-captioning
- Owner: dk3775
- Created: 2025-02-03T12:28:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-23T11:44:40.000Z (about 1 year ago)
- Last Synced: 2025-04-23T12:35:18.739Z (about 1 year ago)
- Language: TypeScript
- Homepage: https://g26-img-captioning.vercel.app
- Size: 4.32 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Image Captioning using PyTorch
A deep learning project that automatically generates natural language descriptions for images using PyTorch
[About](#about) · [Features](#features) · [Tech Stack](#tech-stack) · [Setup](#setup) · [Team](#team)
## About
This project was developed as our B.Tech final-year project. It implements an image captioning system that combines computer vision and natural language processing to generate descriptive captions for images automatically.
## Features
- Automatic caption generation for input images
- CNN-based image feature extraction
- LSTM-based caption generation
- Support for multiple image formats
- Pre-trained models for quick inference
- User-friendly interface for testing
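
To make the LSTM-based caption generation step above more concrete, here is a minimal, framework-free sketch of greedy decoding: at each step the decoder scores candidate next words and the highest-scoring one is appended until an end token appears. This is an illustration under stated assumptions, not the repository's code. The names `greedy_decode`, `toy_step`, `START`, and `END` are hypothetical, and `toy_step` only stands in for a trained LSTM whose initial state would normally be derived from the CNN image features.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical special tokens; the project's actual vocabulary may differ.
START, END = "<start>", "<end>"

# A step function maps (previous token, decoder state) to
# (score for each candidate next token, new decoder state).
StepFn = Callable[[str, Any], Tuple[Dict[str, float], Any]]


def greedy_decode(step_fn: StepFn, init_state: Any, max_len: int = 20) -> List[str]:
    """Pick the highest-scoring token at each step until END or max_len."""
    tokens: List[str] = []
    prev, state = START, init_state
    for _ in range(max_len):
        scores, state = step_fn(prev, state)
        prev = max(scores, key=scores.get)
        if prev == END:
            break
        tokens.append(prev)
    return tokens


def toy_step(prev: str, state: int) -> Tuple[Dict[str, float], int]:
    """Stand-in for a trained LSTM: replays a fixed caption, one word per step."""
    script = ["a", "dog", "on", "grass", END]
    target = script[min(state, len(script) - 1)]
    scores = {word: 0.0 for word in script}
    scores[target] = 1.0
    return scores, state + 1


# In a real model, init_state would come from the CNN image encoder.
caption = " ".join(greedy_decode(toy_step, init_state=0))
print(caption)
```

Greedy decoding is only one possible choice; beam search is a common alternative, and this README does not say which strategy the project uses.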
## Tech Stack
- PyTorch - Deep learning framework
- Python 3.x
- CUDA (for GPU support)
- ResNet/VGG (for image feature extraction)
- LSTM (for caption generation)
- Additional dependencies: see `requirements.txt`
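
Since CUDA is listed for GPU support, a quick check of which device PyTorch would use can be helpful before running anything heavy. The sketch below is an assumption about how one might do this, not part of the repository. `pick_device` is a hypothetical helper that falls back to the CPU when PyTorch is not installed yet or no GPU is visible.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed yet (e.g. before installing requirements).
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")
```

The returned string can be passed to `torch.device(...)` or to a module's `.to(...)` call when PyTorch is available.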
## Setup
1. Clone the repository
```bash
git clone https://github.com/dk3775/g26-img-captioning.git
cd g26-img-captioning
```
2. Create a virtual environment
```bash
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
```
3. Install dependencies
```bash
pip install -r requirements.txt
```
4. Download pre-trained models
```bash
python scripts/download_models.py
```
5. Run the application
```bash
python main.py
```
## Team
- Dhyey Kathiriya
- Riya Mehta
- Kush Nadpara
Under the guidance of:
Prof. Kunal Garud
B.Tech. CSE
ICT Ganpat University
## License
This project is licensed under the MIT License - see the LICENSE file for details.