https://github.com/fofr/cog-batch-image-captioning

Caption images for LoRA training

ai anthropic captioning claude cog gemini openai replicate


Host: GitHub
URL: https://github.com/fofr/cog-batch-image-captioning
Owner: fofr
License: mit
Created: 2024-08-13T09:33:08.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-08-13T12:52:23.000Z (almost 2 years ago)
Last Synced: 2025-03-24T11:56:55.922Z (over 1 year ago)
Topics: ai, anthropic, captioning, claude, cog, gemini, openai, replicate
Language: Python
Homepage: https://replicate.com/fofr/batch-image-captioning
Size: 17.6 KB
Stars: 8
Watchers: 1
Forks: 5
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE


# cog-batch-image-captioning

A Cog model for batch image captioning using AI models from OpenAI, Anthropic, and Google:

https://replicate.com/fofr/batch-image-captioning

## Features

- Process multiple images from a ZIP archive
  - Supports PNG, JPG, JPEG, and WebP
- Optional image resizing for more cost-effective captioning
- Customizable caption prefixes and suffixes
- Support for multiple AI models:
  - OpenAI: GPT-4 and variants
  - Anthropic: Claude-3.5, Claude-3 variants
  - Google: Gemini-1.5 variants
- Flexible system and message prompts
- Error handling and retry mechanism
- Output is a ZIP file containing caption files named to match the image filenames, plus a CSV summary
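
The ZIP-in, ZIP-out flow described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code this repository uses: `caption_zip`, `caption_with_retry`, the `caption_fn` callable (a stand-in for whichever provider API is called), the `.txt` caption extension, and the `captions.csv` filename are all assumptions. Image resizing is left out to keep the sketch standard-library only.

```python
import csv
import io
import time
import zipfile
from pathlib import PurePosixPath
from typing import Callable

# Extensions listed in the feature list above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def caption_with_retry(
    caption_fn: Callable[[bytes], str],
    image_bytes: bytes,
    retries: int = 3,
    retry_delay: float = 1.0,
) -> str:
    """Call caption_fn, retrying on any exception up to `retries` attempts."""
    last_error = None
    for attempt in range(retries):
        try:
            return caption_fn(image_bytes)
        except Exception as error:  # hypothetical: a real client would catch narrower errors
            last_error = error
            if attempt < retries - 1:
                time.sleep(retry_delay)
    raise RuntimeError(f"captioning failed after {retries} attempts") from last_error


def caption_zip(
    zip_bytes: bytes,
    caption_fn: Callable[[bytes], str],
    prefix: str = "",
    suffix: str = "",
    retries: int = 3,
    retry_delay: float = 1.0,
) -> bytes:
    """Caption every supported image in a ZIP and return a ZIP of captions plus a CSV."""
    output = io.BytesIO()
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, zipfile.ZipFile(output, "w") as dst:
        for name in sorted(src.namelist()):
            path = PurePosixPath(name)
            # Skip directories and anything that is not a supported image type.
            if name.endswith("/") or path.suffix.lower() not in SUPPORTED_EXTENSIONS:
                continue
            caption = caption_with_retry(caption_fn, src.read(name), retries, retry_delay)
            caption = f"{prefix}{caption}{suffix}"
            # The caption file shares the image's name, with a .txt extension (assumed).
            dst.writestr(str(path.with_suffix(".txt")), caption)
            rows.append((name, caption))

        # CSV summary of every image and its caption.
        csv_buffer = io.StringIO()
        writer = csv.writer(csv_buffer)
        writer.writerow(["filename", "caption"])
        writer.writerows(rows)
        dst.writestr("captions.csv", csv_buffer.getvalue())
    return output.getvalue()
```

In this sketch, `caption_fn` takes the raw image bytes and returns a caption string, so any of the supported providers could be wrapped behind it without changing the ZIP handling.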
