{"id":25616687,"url":"https://github.com/prachipatel15/ai-image-captioning","last_synced_at":"2026-04-15T11:36:58.425Z","repository":{"id":278696111,"uuid":"936477838","full_name":"PrachiPatel15/AI-Image-Captioning","owner":"PrachiPatel15","description":"An AI-powered image captioning app built with Streamlit, using ViT-GPT2 for caption generation and YOLOv8 for object detection. The app provides enhanced captions by integrating detected objects into the generated text.","archived":false,"fork":false,"pushed_at":"2025-02-21T06:44:32.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-21T07:30:53.803Z","etag":null,"topics":["computer-vision","image-processing","streamlit","transformers","vit-gpt2","yolov8"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PrachiPatel15.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-21T06:43:14.000Z","updated_at":"2025-02-21T07:00:04.000Z","dependencies_parsed_at":"2025-02-21T07:30:57.832Z","dependency_job_id":"67c220e8-d422-42f6-a518-93d9ece0da74","html_url":"https://github.com/PrachiPatel15/AI-Image-Captioning","commit_stats":null,"previous_names":["prachipatel15/ai-image-captioning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrachiPatel15%2FAI-Image-Captioning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrachiPatel15%2FAI-Imag
e-Captioning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrachiPatel15%2FAI-Image-Captioning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PrachiPatel15%2FAI-Image-Captioning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PrachiPatel15","download_url":"https://codeload.github.com/PrachiPatel15/AI-Image-Captioning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240122814,"owners_count":19751178,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","image-processing","streamlit","transformers","vit-gpt2","yolov8"],"created_at":"2025-02-22T04:17:59.584Z","updated_at":"2026-04-15T11:36:58.371Z","avatar_url":"https://github.com/PrachiPatel15.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AI-Image-Captioning\nAn **AI-powered image captioning app** built with **Streamlit**, using **ViT-GPT2** for caption generation and **YOLOv8** for object detection. 
The app enhances captions by integrating detected objects into the generated text.\n\n## 🔥 Features\n- **AI-powered image captioning** using **ViT-GPT2**.\n- **Object detection** with **YOLOv8** to enhance captions.\n- **Dark-themed UI** with **Streamlit**.\n- **Interactive settings** for enabling/disabling object detection.\n- **Optimized inference** with GPU acceleration (CUDA support).\n\n## 🚀 Demo\n### 1️⃣ **Upload an Image**\n![Upload Screenshot](https://github.com/PrachiPatel15/AI-Image-Captioning/blob/main/assest/upload_image.png)\n\n### 2️⃣ **Enable Object Detection and Generate Captions**\n![Detection Screenshot](https://github.com/PrachiPatel15/AI-Image-Captioning/blob/main/assest/object_detection_tick.png)\n\n### 3️⃣ **View Enhanced Caption and Detected Objects**\n![Results Screenshot](https://github.com/PrachiPatel15/AI-Image-Captioning/blob/main/assest/obj_with_caption.png)\n\n## 📂 Installation \u0026 Setup\n### 1️⃣ **Clone the Repository**\n```bash\ngit clone https://github.com/PrachiPatel15/AI-Image-Captioning.git\ncd AI-Image-Captioning\n```\n\n### 2️⃣ **Create a Virtual Environment (Optional but Recommended)**\n```bash\npython -m venv venv\nsource venv/bin/activate   # On macOS/Linux\nvenv\\Scripts\\activate     # On Windows\n```\n\n### 3️⃣ **Install Dependencies**\n```bash\npip install -r requirements.txt\n```\n\n### 4️⃣ **Run the Application**\n```bash\nstreamlit run app.py\n```\n\n## 🧠 Models Used\n### **1️⃣ ViT-GPT2** (Image Captioning)\n- **Pretrained Model**: `nlpconnect/vit-gpt2-image-captioning`\n- **Task**: Generates textual descriptions for input images.\n\n### **2️⃣ YOLOv8** (Object Detection)\n- **Pretrained Model**: `yolov8n.pt`\n- **Task**: Detects objects in the image to enhance captions.\n\n## ⚙️ Project Structure\n```bash\nAI-Image-Captioning/\n│── app.py                  # Main Streamlit application\n│── requirements.txt        # Required dependencies\n│── README.md               # Documentation\n│── assest/                 # Store 
images/screenshots\n```\n\n## 🛠️ Usage Instructions\n1. **Upload an image** in the app.\n2. **Choose whether to enable object detection**.\n3. **Click 'Analyze Image'** to generate a caption.\n4. **View enhanced captions** and object detection results.\n\n## 💡 Future Improvements\n- [ ] Add multilingual captioning support.\n- [ ] Optimize object detection performance.\n- [ ] Implement additional caption refinement techniques.\n\n## 🤝 Contributing\nContributions are welcome! Feel free to **fork** this repository and create a **pull request** with your improvements.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprachipatel15%2Fai-image-captioning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprachipatel15%2Fai-image-captioning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprachipatel15%2Fai-image-captioning/lists"}