https://github.com/santiagosamuel3455/ComfyUI-GeminiImageToPrompt

Image description prompt system

# ComfyUI-GeminiImageToPrompt
Advanced Prompt Generation System for Audiovisual Content
3 Integrated Nodes to Optimize Multimodal Content Creation

## 1. Gemini Text to Cinematic Prompt Node (Google API)

Function: Transforms basic textual descriptions into high-quality cinematic prompts using Google's Gemini model.

Capabilities:

- Enriches narrative, stylistic, and technical details (e.g., lighting, camera angles, atmosphere).
- Ideal for generating videos or images with a cinematic focus.

Application: Use it as a starting point for projects that require professional prompts without writing from scratch.
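
The text-to-cinematic step described above can be sketched roughly as follows. This is a minimal illustration, not the node's actual implementation: `CinematicBrief`, `build_instruction`, `generate_cinematic_prompt`, and the model name `gemini-1.5-flash` are all assumptions made for the example. The network call uses the `google-generativeai` package and is kept in a separate function so the instruction-building part can be tried offline.

```python
from dataclasses import dataclass


@dataclass
class CinematicBrief:
    """Hypothetical container for the details the node is described as enriching."""
    description: str
    lighting: str = "unspecified"
    camera: str = "unspecified"
    atmosphere: str = "unspecified"


def build_instruction(brief: CinematicBrief) -> str:
    """Compose the instruction text that would be sent to the model."""
    if not brief.description.strip():
        raise ValueError("description must not be empty")
    return "\n".join([
        "Rewrite the description below as a single cinematic prompt.",
        "Enrich narrative, stylistic, and technical details.",
        f"Lighting: {brief.lighting}",
        f"Camera angle: {brief.camera}",
        f"Atmosphere: {brief.atmosphere}",
        f"Description: {brief.description.strip()}",
    ])


def generate_cinematic_prompt(brief: CinematicBrief, api_key: str,
                              model_name: str = "gemini-1.5-flash") -> str:
    """Send the instruction to Gemini. Needs network access and a valid API key."""
    # Imported here so the rest of the sketch runs without the package installed.
    import google.generativeai as genai  # pip install google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)  # model name is an assumption
    response = model.generate_content(build_instruction(brief))
    return response.text.strip()
```

Calling `build_instruction(CinematicBrief("a lighthouse at dusk", lighting="golden hour"))` shows the text that would be sent, without spending any API quota.
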

## 2. Gemini Image to Prompt Node (Multimodal Analysis)

Function: Analyzes input images to automatically generate descriptive prompts, optimized for video conversion.

Capabilities:

- Extracts key visual elements (colors, objects, composition, artistic style).
- Translates the analysis into technical instructions for image-to-video flows.

Application: Avoid manual prompt writing by working with reference images or existing assets.
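
For readers unfamiliar with how an image-analysis node plugs into ComfyUI, the sketch below shows the general shape of a custom node that takes an `IMAGE` input and returns a `STRING` prompt. It follows ComfyUI's usual custom-node conventions (`INPUT_TYPES`, `RETURN_TYPES`, `FUNCTION`, `NODE_CLASS_MAPPINGS`), but the class name `ImageToPromptSketch`, its input names, the `analyzer` hook, and the model name are hypothetical and are not taken from this repository.

```python
class ImageToPromptSketch:
    """Hypothetical node for illustration; not the repository's actual class."""

    CATEGORY = "prompt/sketch"
    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("prompt",)
    FUNCTION = "describe"

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "image": ("IMAGE",),
                "api_key": ("STRING", {"default": ""}),
                "focus": ("STRING", {
                    "default": "colors, objects, composition, artistic style",
                    "multiline": True,
                }),
            }
        }

    def describe(self, image, api_key, focus, analyzer=None):
        # `analyzer` is an assumed hook that lets the model call be swapped out.
        analyzer = analyzer or self._call_gemini
        instruction = (
            "Describe this image as a prompt for image-to-video generation. "
            f"Pay attention to: {focus}."
        )
        # ComfyUI expects node outputs as a tuple matching RETURN_TYPES.
        return (analyzer(image, instruction, api_key).strip(),)

    @staticmethod
    def _call_gemini(image, instruction, api_key):
        # Local imports keep the module loadable without these packages.
        import numpy as np
        from PIL import Image
        import google.generativeai as genai

        # ComfyUI passes IMAGE as a float tensor [batch, height, width, channels] in 0..1.
        pixels = (image[0].cpu().numpy() * 255.0).clip(0, 255).astype(np.uint8)
        genai.configure(api_key=api_key)
        model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
        response = model.generate_content([instruction, Image.fromarray(pixels)])
        return response.text


NODE_CLASS_MAPPINGS = {"ImageToPromptSketch": ImageToPromptSketch}
NODE_DISPLAY_NAME_MAPPINGS = {"ImageToPromptSketch": "Image to Prompt (sketch)"}
```

Passing a stand-in function as `analyzer` makes it possible to check the node wiring without an API key; inside ComfyUI the argument is simply left unset.
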

## 3. DeepSeek R1 Node + KlingAI Text/Image to Video (Free Version)

Function: Generates refined prompts for text-to-video or image-to-video content, using KlingAI with a free account.

Advantages:

- No credit consumption: Ideal for users on a budget.
- Support for multimodal input (text and image) for hybrid workflows.

Application: Create detailed scripts for videos using short descriptions or images, maintaining quality and visual consistency.

## Recommended Workflow

1. Text to Cinematic Prompt: Use Gemini to define a rich narrative.
2. Image to Technical Prompt: Analyze visual references with Gemini.
3. Final Video Generation: Export optimized prompts to KlingAI/DeepSeek R1 for video production.
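
The three workflow steps above amount to a small pipeline, which can be sketched as below. The function `run_workflow` and its three callable parameters are hypothetical names chosen for the example; in ComfyUI the same chaining is done by wiring node outputs to node inputs in the graph rather than in code.

```python
from typing import Any, Callable, Optional


def run_workflow(
    description: str,
    text_to_cinematic: Callable[[str], str],
    image_to_prompt: Callable[[Any], str],
    refine_for_video: Callable[[str], str],
    image: Optional[Any] = None,
) -> str:
    """Chain the three stages; each stage is supplied by the caller."""
    # Step 1: enrich the plain description into a cinematic prompt.
    parts = [text_to_cinematic(description)]
    # Step 2: optionally add a prompt derived from a reference image.
    if image is not None:
        parts.append(image_to_prompt(image))
    # Step 3: hand the combined text to the final refinement stage.
    return refine_for_video("\n".join(parts))
```

Keeping each stage as a plain callable means any one of them can be replaced, for example with a stub during testing, without touching the others.
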

Key Benefit: Combines Gemini's strengths (creativity and analytics) with KlingAI's accessibility, reducing manual effort and operational costs.