https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt
Simple Image To Prompt
- Host: GitHub
- URL: https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt
- Owner: zentrocdot
- Created: 2025-02-10T15:05:08.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2025-02-10T15:25:21.000Z (2 months ago)
- Last Synced: 2025-02-10T16:23:09.665Z (2 months ago)
- Language: Python
- Size: 24.4 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-comfyui - **ComfyUI-Simple_Image_To_Prompt**
README
> [!IMPORTANT]
> 🚧 This documentation is still under construction.
> Parts of the node are still under development. There may therefore be
> minor differences between the node itself and the documentation for
> the node. The documentation is also not yet complete.

# Preface
This node is one result of my investigation into what one
can do with Moondream. One important use of Moondream is the
Image To Prompt feature.

# Motivation
Image To Prompt can be used in different ways.
Its suggestions can give one an idea of how to change or improve a prompt.
It is also possible to get information about
the art and style of the image. This can be helpful for finding the right
keywords for further work.

# Introductory Words
The node uses the CPU and not the GPU. The first
way, the recommended one, uses Moondream directly. The
second way requires Hugging Face.

In the first versions of the node the model cannot
be selected. At the moment there are four models available. In one of
the following versions I will add support for all four models.

In the future I need something like a download node,
which should offer the possibility to download a model and to monitor the progress
of the download in the node and not in the terminal window.

# Prerequisites
Download the Moondream model

https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

and unzip it into a folder called moondream, which has to be created in the directory

ComfyUI/models

This directory holds all other models, from checkpoints
and LoRAs to upscaler models, so it is the best place for the model.

# Node Preview
Figure 1 shows a preview of the node. The image of interest is loaded
via the connector on the left side. The
connectors on the right side output:

* Answer (to a question or prompt)
* Short caption
* Normal caption
![]()
Figure 1: Node preview
The current version allows one to ask a custom question instead of
the fixed one hard-coded in the source code:

What does the image show?

As in the example workflow, one should output all three
results to get the best answer or caption for each case.

One can ask whatever one likes, e.g.

+ What do we see in the image?
+ What is the art style of the image?
+ What is the story behind the image?

and so on.
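The three outputs and the custom question can be sketched outside of ComfyUI as follows. This is a minimal sketch and not the source code of the node. It assumes a version of the `moondream` Python client [1] that can load a local model file with `md.vl(model=...)` and that offers `encode_image`, `query` and `caption` with a `length` argument. The helper names `load_model` and `describe_image` and all file paths are hypothetical.

```python
from pathlib import Path

# Fixed question that is used when no custom question is given.
DEFAULT_QUESTION = "What does the image show?"


def load_model(model_path):
    """Load a local Moondream model file (assumed client API, see text)."""
    import moondream as md  # imported lazily, only needed for real inference
    return md.vl(model=str(Path(model_path)))


def describe_image(model, image, question=DEFAULT_QUESTION):
    """Return the answer, the short caption and the normal caption."""
    encoded = model.encode_image(image)  # encode once, reuse for all calls
    return {
        "answer": model.query(encoded, question)["answer"],
        "short_caption": model.caption(encoded, length="short")["caption"],
        "normal_caption": model.caption(encoded, length="normal")["caption"],
    }


# Possible usage (paths are assumptions, Pillow is needed to open the image):
#   from PIL import Image
#   model = load_model("ComfyUI/models/moondream/moondream-2b-int8.mf")
#   image = Image.open("example.png")
#   result = describe_image(model, image, "What is the art style of the image?")
#   print(result["answer"])
```

The image is encoded only once and then reused, so asking a further question does not repeat the encoding step.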
# Workflow Preview
Figure 2 shows a simple workflow. From my point of view it
is not helpful to automatically create new images from the given
answer or caption.
![]()
Figure 2: Example workflow preview
Read the next section to see why I do not recommend automatic image generation.

# What the Workflow/Node Does

Each time the workflow runs, Moondream generates
a new answer. No two answers will be the same. It therefore makes sense to run the
workflow several times until one gets an answer that one likes better than the
others.

# Installation
## Model Directory
To be compatible with ComfyUI, a directory named

moondream

should be created inside the ComfyUI models directory (ComfyUI/models, see Prerequisites). The Moondream models should be placed in this directory.
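Creating the directory and unpacking the model can also be scripted. The following is a minimal sketch using only the Python standard library. It assumes that the model directory is ComfyUI/models/moondream, as described in the Prerequisites, and it uses the download URL given there. The function names `model_dir`, `unpack_gz` and `download_model` are hypothetical and not part of the node.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# Download URL taken from the Prerequisites section.
MODEL_URL = (
    "https://huggingface.co/vikhyatk/moondream2/resolve/"
    "9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz"
)


def model_dir(comfyui_root):
    """Create <comfyui_root>/models/moondream if missing and return its path."""
    target = Path(comfyui_root) / "models" / "moondream"
    target.mkdir(parents=True, exist_ok=True)
    return target


def unpack_gz(archive, target_dir):
    """Decompress a .gz file into target_dir and return the unpacked path."""
    archive = Path(archive)
    unpacked = Path(target_dir) / archive.stem  # "x.mf.gz" becomes "x.mf"
    with gzip.open(archive, "rb") as src, open(unpacked, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return unpacked


def download_model(comfyui_root, url=MODEL_URL):
    """Download the archive into the model directory and unpack it there."""
    target = model_dir(comfyui_root)
    archive = target / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, archive)  # large file, no progress display
    return unpack_gz(archive, target)
```

Calling `download_model("/path/to/ComfyUI")` would then leave the unpacked model file next to the downloaded archive. The path is a placeholder for the local ComfyUI installation.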
## Node Installation
Use the ComfyUI Manager for the installation.
Search for my nick 'zentrocdot' or search for 'ComfyUI-Simple_Image_To_Prompt'.

Alternatively, one can install the node from within the directory

custom_nodes

by

```
git clone https://github.com/zentrocdot/ComfyUI-Simple_Image_To_Prompt
```

# Troubleshooting
## Error Message
If one gets an error message like this
ImportError: tokenizers>=0.21,<0.22 is required for a normal functioning of this module, but found tokenizers==0.20.3.
one can fix this error message easily.
## Error Fixing
After installing this node, one has to run
```pip install -U transformers```
and the error message is gone.
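Whether the installed version is affected can be checked before starting ComfyUI. The following is a minimal sketch using only the Python standard library. The version bounds are copied from the error message above. The function names are hypothetical, and pre-release suffixes such as `rc1` are ignored for simplicity.

```python
import re
from importlib import metadata

# Bounds copied from the error message: tokenizers>=0.21,<0.22
LOWER = (0, 21)
UPPER = (0, 22)


def parse_version(version):
    """Turn '0.20.3' into (0, 20, 3); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def tokenizers_ok(version):
    """Return True if the version lies in the range LOWER <= version < UPPER."""
    return LOWER <= parse_version(version) < UPPER


def installed_tokenizers_version():
    """Return the installed tokenizers version, or None if it is not installed."""
    try:
        return metadata.version("tokenizers")
    except metadata.PackageNotFoundError:
        return None
```

With the version from the error message, `tokenizers_ok("0.20.3")` returns `False`, which is the case in which the upgrade command is needed.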
## General Solution
Changing the

requirements.txt

of the node should prevent this error from occurring.

# Open Issue
The one and only open issue is how to unload a loaded
Moondream model. Memory stays locked after loading a model, regardless of whether the
workflow with the node is open or closed.

Not being able to unload the model from memory is a serious
problem if you want to use Moondream in this way.

In the latest version of the node, I am testing a new approach
to memory management. It looks very promising for solving the memory problem.

# To-Do
Improvement of this documentation.

The open issue that I have not found a way to unload
a loaded model makes many more test runs necessary.

The algorithm I found for Image To Prompt using Moondream
works well for the moment. Some other approaches I tried before did not. It has
to be tested whether the current approach works well under different conditions all
the time.

# Conclusion
I am close to finishing the work on this node. The
node does what it should do. For my work the node in its current state
is sufficient.

# Remark
If one needs an improvement of this node, feel free to make
a donation. Tell me what you need. I will take a look if it is possible.

# References
[1] https://github.com/vikhyat/moondream/tree/main/clients/python
[2] https://moondream.ai/playground
[3] https://moondream.ai/
[4] https://github.com/vikhyat/moondream
[5] https://www.copus.io/work/474668694161482d83005265199b4995?spaceId=zentrocdotsposts
## Donation
If you like what I present here, or if it helps you,
or if it is useful, you are welcome to donate a small contribution. Or
as you might say: Every TRON counts! Many thanks in advance! :smiley:

${\textnormal{\color{navy}Tron}}$

```
TQamF8Q3z63sVFWiXgn2pzpWyhkQJhRtW7
```

${\textnormal{\color{navy}Doge}}$

```
DMh7EXf7XbibFFsqaAetdQQ77Zb5TVCXiX
```

${\textnormal{\color{navy}Bitcoin}}$

```
12JsKesep3yuDpmrcXCxXu7EQJkRaAvsc5
```

${\textnormal{\color{navy}Ethereum}}$

```
0x31042e2F3AE241093e0387b41C6910B11d94f7ec
```