https://github.com/kyegomez/multimodal-tot

Multi-Modal Tree of thoughts for DALLE-3 like auto self improvement
artificial-intelligence gpt4 multi-modal multi-modality multi-modality-data

Host: GitHub
URL: https://github.com/kyegomez/multimodal-tot
Owner: kyegomez
License: mit
Created: 2023-09-21T03:15:08.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2024-11-11T21:03:23.000Z (over 1 year ago)
Last Synced: 2025-04-19T20:16:58.679Z (over 1 year ago)
Topics: artificial-intelligence, gpt4, multi-modal, multi-modality, multi-modality-data
Language: Python
Homepage: https://discord.gg/GYbXvDGevY
Size: 81.2 MB
Stars: 16
Watchers: 4
Forks: 2
Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

README

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# MultiModal Tree of Thoughts

Multimodal Tree of Thoughts uses the GPT-4 language model and the Stable Diffusion model to generate a multimodal output, scores that output on a metric from 0.0 to 1.0, and then runs a search algorithm (DFS and BFS) over the candidates to return the best output.

Example task: Generate an image of a swarm of bees -> image generator -> GPT4V scores the image from 0.0 to 1.0 -> DFS/BFS -> return the best output

- GPT4Vision will score the image from 0.0 to 1.0 based on how well it accomplishes the task

- DFS/BFS will search for the best output based on the evaluation from GPT4Vision

- The output will be a multimodal output that is a combination of the image and the text

- The output will be evaluated by GPT4Vision

- The prompt sent to the image generator will be refined using the output of GPT4Vision and the search
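
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: `search`, `Candidate`, and the `fake_*` functions are hypothetical names, and the toy stubs stand in for the real image generator and the GPT4Vision evaluator so the example runs without any API keys.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Candidate:
    prompt: str
    image: str    # placeholder for real image data
    score: float  # evaluator score, clamped to [0.0, 1.0]

def search(
    task: str,
    generate: Callable[[str], str],
    evaluate: Callable[[str, str], float],
    refine: Callable[[str, float, int], List[str]],
    max_depth: int = 3,
    branching: int = 2,
    strategy: str = "bfs",
) -> Optional[Candidate]:
    """Explore refined prompts breadth-first or depth-first and keep the best-scoring candidate."""
    frontier = deque([(task, 0)])  # (prompt, depth) pairs still to try
    best: Optional[Candidate] = None
    while frontier:
        # BFS takes the oldest entry (queue); DFS takes the newest (stack).
        prompt, depth = frontier.popleft() if strategy == "bfs" else frontier.pop()
        image = generate(prompt)
        score = max(0.0, min(1.0, evaluate(task, image)))
        if best is None or score > best.score:
            best = Candidate(prompt, image, score)
        if depth < max_depth:
            for new_prompt in refine(prompt, score, branching):
                frontier.append((new_prompt, depth + 1))
    return best

# --- Toy stand-ins (assumptions for illustration only) ---
DETAILS = ["highly detailed", "golden hour lighting", "macro photography"]

def fake_generate(prompt: str) -> str:
    # A real system would call an image model here.
    return f"<image for: {prompt}>"

def fake_evaluate(task: str, image: str) -> float:
    # A real system would ask a vision model for a 0.0 to 1.0 score.
    hits = sum(detail in image for detail in DETAILS)
    return 0.25 + 0.25 * hits

def fake_refine(prompt: str, score: float, k: int) -> List[str]:
    # Propose up to k new prompts, each adding one detail phrase not yet used.
    unused = [d for d in DETAILS if d not in prompt]
    return [f"{prompt}, {d}" for d in unused[:k]]

if __name__ == "__main__":
    task = "Generate an image of a swarm of bees"
    for strategy in ("bfs", "dfs"):
        best = search(task, fake_generate, fake_evaluate, fake_refine, strategy=strategy)
        print(strategy, best.score, best.prompt)
```

Passing the generator, evaluator, and refiner as callables keeps the search independent of any particular model API, so the stubs can be swapped for real calls without touching the traversal logic.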

# Usage

`streamlit run app.py`

# License

MIT
