https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt

Simple Image To Prompt

Host: GitHub
URL: https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt
Owner: zentrocdot
Created: 2025-02-10T15:05:08.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-02-10T15:25:21.000Z (11 months ago)
Last Synced: 2025-02-10T16:23:09.665Z (11 months ago)
Language: Python
Size: 24.4 KB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-comfyui - **ComfyUI-Simple_Image_To_Prompt**

> [!IMPORTANT]
> 🚧 This documentation is still under construction.
> Parts of the node are still under development. There may therefore be
> minor differences between the node itself and the documentation for
> the node. The documentation is also not yet complete.

# Preface

This node is one result of my investigation into what one
can do with Moondream. One important use of Moondream is the
Image To Prompt feature.

# Motivation

Image To Prompt can be used in different ways.
One can get ideas on how to change or improve a prompt from the suggestions
of Image To Prompt. It is also possible to get information about
the art and style of the image. This can be helpful for finding the right
keywords for further work.

# Introductory Words

The node runs on the CPU, not on the GPU. The first
way, which is the one proposed here, works with Moondream directly. The
second way requires Hugging Face.

In the first versions of the node, the model cannot
be selected. At the moment there are four models available. In one of
the following versions I will add support for all four models.

In the future I need something like a download node,
which should offer the possibility to download a model and to monitor the
download progress in the node instead of in the terminal window.

# Prerequisites

Download the Moondream model

https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

and unzip it into

a folder called moondream, which has to be created in the directory tree

ComfyUI/models

This directory holds all other models, from checkpoints
and LoRAs to upscaler models, so it is the best place for a model.
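
The download and unzip steps above can also be scripted. The following is a minimal sketch that uses only the Python standard library. The function names `gunzip_into` and `fetch_model` and the `comfyui_root` parameter are hypothetical helpers for illustration and are not part of the node.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# URL taken from the Prerequisites section of this README.
MODEL_URL = (
    "https://huggingface.co/vikhyatk/moondream2/resolve/"
    "9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz"
)


def gunzip_into(archive: Path, target_dir: Path) -> Path:
    """Decompress a .gz file into target_dir and return the new file path."""
    target_dir.mkdir(parents=True, exist_ok=True)
    out_path = target_dir / archive.name.removesuffix(".gz")  # Python 3.9+
    with gzip.open(archive, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def fetch_model(comfyui_root: Path) -> Path:
    """Download the archive and unpack it into ComfyUI/models/moondream."""
    target_dir = comfyui_root / "models" / "moondream"
    target_dir.mkdir(parents=True, exist_ok=True)
    archive = target_dir / MODEL_URL.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(MODEL_URL, archive)
    model_path = gunzip_into(archive, target_dir)
    archive.unlink()  # the compressed archive is no longer needed
    return model_path
```

Calling `fetch_model(Path("ComfyUI"))` from the directory that contains the ComfyUI folder should leave the unpacked model in `ComfyUI/models/moondream`.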

# Node Preview

Figure 1 shows a preview of the node. The image of interest
is loaded via the connector on the left side. The
connectors on the right side output

* Answer (to a question or Prompt)
* Short caption
* Normal caption

Figure 1: Node preview

The current version allows one to ask a custom question instead of
the fixed one from the implemented source code:

What does the image show?

As I have done in the example workflow, one should output all three
to get the best answer or caption in each case.

One can ask whatever one likes, e.g.

+ What do we see in the image?
+ What is the art style of the image?
+ What is the story behind the image?

and so on.
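
The three outputs of the node can be sketched in plain Python. This is a minimal sketch and not the source code of the node. It assumes a model object that offers `encode_image`, `query` and `caption` as described for the Moondream Python client [1]. The function name `describe_image` and the `length` argument values are assumptions made for illustration.

```python
def describe_image(model, image, question="What does the image show?"):
    """Return answer, short caption and normal caption for one image.

    `model` is assumed to behave like the Moondream Python client [1]:
    query() returns {"answer": ...} and caption() returns {"caption": ...}.
    """
    encoded = model.encode_image(image)  # encode once, reuse for all calls
    return {
        "answer": model.query(encoded, question)["answer"],
        "short_caption": model.caption(encoded, length="short")["caption"],
        "normal_caption": model.caption(encoded, length="normal")["caption"],
    }


# Possible usage with the client from [1] (assumed, not verified here):
#   import moondream as md
#   from PIL import Image
#   model = md.vl(model="ComfyUI/models/moondream/moondream-2b-int8.mf")
#   result = describe_image(model, Image.open("picture.png"),
#                           "What is the art style of the image?")
```

Encoding the image only once keeps the three calls cheap compared to encoding it for every output.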

# Workflow Preview

Figure 2 shows the simple case of a workflow. From my point of view, it
is not helpful to automatically create new images from the given
answer or caption.

Figure 2: Example workflow preview

Read the next section to see why I do not propose automatic image generation.

## What the Workflow/Node Does

Each time one runs the workflow, Moondream generates
a new answer. No two answers will be the same. It therefore makes sense to run the
workflow several times until one gets an answer that one likes more than the
others.

# Installation

## Model Directory

To be compatible with ComfyUI, the following directory should be created in the ComfyUI/models directory (see Prerequisites):

moondream

In this directory the Moondream models should be placed.

## Node Installation

Use the ComfyUI Manager for the installation.
Search for my nick 'zentrocdot' or search for 'ComfyUI-Simple_Image_To_Prompt'.

Alternatively, one can install the node from within the directory custom_nodes with

```
git clone https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt
```

# Limitations

I was asked for different functionality. The original node produces
a new answer and a new caption on every run. I had to program this behavior deliberately.
Somebody asked whether I could make the output fixed. The result is two nodes: the original
one and the new one I am proposing.

The 'No Update' node does not change its output as long as nothing changes, from the image
through the settings to the question.
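
The 'No Update' behavior can be illustrated with a small cache that is keyed by a fingerprint of all inputs. This is a minimal sketch of the general idea and not the implementation of the node. The names `fingerprint` and `CachedAnswer` are hypothetical, and the image is assumed to be available as raw bytes.

```python
import hashlib
import json


def fingerprint(image_bytes: bytes, settings: dict, question: str) -> str:
    """Hash every input that should trigger a new answer when it changes."""
    h = hashlib.sha256()
    h.update(image_bytes)
    h.update(b"\0")  # separator so adjacent fields cannot run together
    h.update(json.dumps(settings, sort_keys=True).encode("utf-8"))
    h.update(b"\0")
    h.update(question.encode("utf-8"))
    return h.hexdigest()


class CachedAnswer:
    """Re-run the expensive call only when the fingerprint changes."""

    def __init__(self, compute):
        self._compute = compute  # callable(image_bytes, settings, question)
        self._key = None
        self._value = None

    def get(self, image_bytes, settings, question):
        key = fingerprint(image_bytes, settings, question)
        if key != self._key:
            self._value = self._compute(image_bytes, settings, question)
            self._key = key
        return self._value
```

With such a cache, a second run with the same image, settings and question returns the stored answer, while any change to one of the three inputs produces a new one.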

I tested both nodes and have to say that one cannot switch from one node to the other without restarting
ComfyUI. The reason is that I cannot unload a loaded model.

# Troubleshooting

## Error Message

If one gets an error message like this

ImportError: tokenizers>=0.21,<0.22 is required for a normal functioning of this module, but found tokenizers==0.20.3.

one can fix this error message easily.

## Error Fixing

After installing this node, one has to run

```pip install -U transformers```

and the error message is gone.
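
To check in advance whether the installed tokenizers version would trigger this error, one can compare it against the range from the error message. This is a minimal sketch using only the standard library. The helper names are hypothetical, and the range `>=0.21,<0.22` is taken from the error message above, so it may differ for other transformers versions.

```python
import re
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '0.20.3' into (0, 20, 3), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def tokenizers_ok(version: str) -> bool:
    """True if the version satisfies tokenizers>=0.21,<0.22."""
    return (0, 21) <= parse_version(version)[:2] < (0, 22)


def installed_tokenizers_version():
    """Return the installed tokenizers version, or None if it is missing."""
    try:
        return metadata.version("tokenizers")
    except metadata.PackageNotFoundError:
        return None
```

If `installed_tokenizers_version()` returns a version for which `tokenizers_ok` is false, the pip command above is the fix to try.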

## General Solution

With the changed requirements.txt,
this error should no longer occur.

# Open Issue

The one and only open issue is how to unload a loaded
Moondream model. Memory stays locked after loading a model, independent of whether the
workflow with the node is open or closed.

Not being able to unload the model from memory is a serious
problem if you want to use Moondream in this way.

In the latest version of the node, I am testing a new approach
to memory management. It looks very promising for the memory problem.
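
A common pattern for this kind of problem is to keep the model behind a single holder object and to drop the reference explicitly. The following is a minimal sketch of that pattern and not the approach tested in the node. The class name `ModelHolder` is hypothetical. Whether memory allocated by the underlying inference runtime is actually returned to the operating system is an open assumption here.

```python
import gc


class ModelHolder:
    """Keeps at most one loaded model and allows dropping it explicitly."""

    def __init__(self, loader):
        self._loader = loader  # callable that returns a loaded model
        self._model = None

    def get(self):
        """Load the model on first use and reuse it afterwards."""
        if self._model is None:
            self._model = self._loader()
        return self._model

    def unload(self):
        """Drop the only reference held here and run the garbage collector."""
        self._model = None
        gc.collect()
```

This only works if no other part of the program keeps its own reference to the model, which is why every caller should go through `get()` instead of storing the model itself.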

# To-Do

Improvement of this documentation.

The open issue that I have not found a way to unload
a loaded model makes many more test runs necessary.

The algorithm I found for Image To Prompt using Moondream
works well for the moment. Some other approaches I tried before did not. It has
to be tested whether the current approach works well under different conditions all
the time.

# Conclusion

I am on the way to finishing the work on this node. The
node does what it should do. For my work, the node in its current state
is sufficient.

# Remark

If one needs an improvement of this node, feel free to make
a donation. Tell me what you need. I will take a look at whether it is possible.

# References

[1] https://github.com/vikhyat/moondream/tree/main/clients/python

[2] https://moondream.ai/playground

[3] https://moondream.ai/

[4] https://github.com/vikhyat/moondream

[5] https://www.copus.io/work/474668694161482d83005265199b4995?spaceId=zentrocdotsposts

## Donation

If you like what I present here, or if it helps you,
or if it is useful, you are welcome to donate a small contribution. Or
as you might say: Every TRON counts! Many thanks in advance! :smiley:

${\textnormal{\color{navy}Tron}}$

```
TQamF8Q3z63sVFWiXgn2pzpWyhkQJhRtW7
```

${\textnormal{\color{navy}Doge}}$

```
DMh7EXf7XbibFFsqaAetdQQ77Zb5TVCXiX
```

${\textnormal{\color{navy}Bitcoin}}$

```
12JsKesep3yuDpmrcXCxXu7EQJkRaAvsc5
```

${\textnormal{\color{navy}Ethereum}}$

```
0x31042e2F3AE241093e0387b41C6910B11d94f7ec
```
