https://github.com/fofr/cog-batch-image-captioning
Caption images for lora training
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/fofr/cog-batch-image-captioning
- Owner: fofr
- License: mit
- Created: 2024-08-13T09:33:08.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-13T12:52:23.000Z (5 months ago)
- Last Synced: 2024-08-14T13:41:53.964Z (5 months ago)
- Topics: ai, anthropic, captioning, claude, cog, gemini, openai, replicate
- Language: Python
- Homepage: https://replicate.com/fofr/batch-image-captioning
- Size: 17.6 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# cog-batch-image-captioning
A cog model for batch image captioning using AI models from OpenAI, Anthropic, and Google:
https://replicate.com/fofr/batch-image-captioning
## Features
- Process multiple images from a ZIP archive
  - Supports png, jpg, jpeg, and webp
- Optional image resizing for more cost-effective captioning
- Customizable caption prefixes and suffixes
- Support for multiple AI models:
  - OpenAI: GPT-4 and variants
  - Anthropic: Claude-3.5, Claude-3 variants
  - Google: Gemini-1.5 variants
- Flexible system and message prompts
- Error handling and retry mechanism
- Output as a ZIP file containing caption files named to match their images, plus a CSV summary
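
The batch flow described by the features above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the repository's actual code: `caption_zip`, `with_retries`, the `caption_image` callable, and the `captions.csv` filename are all hypothetical names chosen for the example. Image resizing and the real OpenAI, Anthropic, and Google API calls are left out, so any function that takes image bytes and returns a string can stand in for the model.

```python
import csv
import io
import time
import zipfile
from pathlib import PurePosixPath
from typing import Callable

# Extensions listed in the feature list above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def with_retries(fn: Callable[[], str], max_attempts: int = 3, delay: float = 0.0) -> str:
    """Call fn until it succeeds or max_attempts is reached (hypothetical helper)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay)
    raise RuntimeError(f"failed after {max_attempts} attempts") from last_error


def caption_zip(
    zip_bytes: bytes,
    caption_image: Callable[[bytes], str],  # assumed stand-in for a model call
    prefix: str = "",
    suffix: str = "",
    max_attempts: int = 3,
) -> bytes:
    """Caption every supported image in a ZIP and return a new ZIP of results."""
    out = io.BytesIO()
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, zipfile.ZipFile(out, "w") as dst:
        for name in sorted(src.namelist()):
            path = PurePosixPath(name)
            # Skip directories and unsupported file types.
            if name.endswith("/") or path.suffix.lower() not in SUPPORTED_EXTENSIONS:
                continue
            data = src.read(name)
            caption = with_retries(lambda: caption_image(data), max_attempts)
            caption = f"{prefix}{caption}{suffix}"
            # One .txt per image, sharing the image's base filename.
            dst.writestr(f"{path.stem}.txt", caption)
            rows.append((path.name, caption))
        # CSV summary of every image and its caption.
        buf = io.StringIO(newline="")
        writer = csv.writer(buf)
        writer.writerow(["filename", "caption"])
        writer.writerows(rows)
        dst.writestr("captions.csv", buf.getvalue())
    return out.getvalue()
```

In this sketch the prefix and suffix are joined to the caption without extra spacing, so a trigger word for LoRA training would be passed as something like `prefix="TOK, "`. Two images with the same base name in different folders would collide in the output; a fuller version would need to handle that.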