{"id":50870238,"url":"https://github.com/jmanhype/prompt-forge","last_synced_at":"2026-06-15T04:33:49.631Z","repository":{"id":364363118,"uuid":"1267584835","full_name":"jmanhype/prompt-forge","owner":"jmanhype","description":"Type what you want → get the perfect image. AI image generation that auto-refines prompts until the output matches your description.","archived":false,"fork":false,"pushed_at":"2026-06-13T16:50:35.000Z","size":611,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-15T04:33:47.970Z","etag":null,"topics":["ai-art","clip","comfyui","florence2","ideogram","lora"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmanhype.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-06-12T17:14:20.000Z","updated_at":"2026-06-13T16:50:39.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/jmanhype/prompt-forge","commit_stats":null,"previous_names":["jmanhype/prompt-forge"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jmanhype/prompt-forge","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmanhype%2Fprompt-forge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmanhype%2Fprompt-forge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmanhype%2Fprompt-forge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmanhype%2Fprompt-forge/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jmanhype","download_url":"https://codeload.github.com/jmanhype/prompt-forge/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmanhype%2Fprompt-forge/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34348291,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-art","clip","comfyui","florence2","ideogram","lora"],"created_at":"2026-06-15T04:33:49.549Z","updated_at":"2026-06-15T04:33:49.623Z","avatar_url":"https://github.com/jmanhype.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Prompt Forge\n\n**Type what you want → get the perfect image**\n\nPrompt Forge is an AI image generation tool that automatically refines prompts until the output matches your description. Instead of manually tweaking prompts, you describe what you want and let the system converge on the best result.\n\n![Demo](content/demo.gif)\n\n## Quick Start\n\n### Option 1: One-Command Install (Docker)\n\n```bash\ngit clone https://github.com/jmanhype/prompt-forge\ncd prompt-forge\ndocker-compose up\n```\n\nThen open http://localhost:7861\n\n**Requirements:**\n- Docker \u0026 Docker Compose\n- NVIDIA GPU with CUDA support (or CPU-only mode)\n- 24GB+ VRAM recommended (runs on 16GB with reduced quality)\n\n### Option 2: Manual Install\n\n```bash\ngit clone https://github.com/jmanhype/prompt-forge\ncd prompt-forge\n\n# Install dependencies\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\npip install -r requirements.txt\n\n# Start the server\npython -m uvicorn server.main:app --host 0.0.0.0 --port 7861\n```\n\n**Requirements:**\n- Python 3.10+\n- ComfyUI running on localhost:8188 (see [ComfyUI setup](#comfyui-setup))\n- Florence-2 model (auto-downloads on first run)\n\n## How It Works\n\n1. **Describe your image** — Type a natural language description or upload a reference image\n2. **Automatic generation** — The system generates an image using Ideogram 4 + your LoRAs\n3. **AI scoring** — Florence-2 analyzes the output and compares it to your description using CLIP\n4. **Smart mutation** — If the score is below threshold, the system enriches the prompt and tries again\n5. **Convergence** — Repeats until the image matches your description (or hits max iterations)\n\n### Example\n\n**Input:** \"ektachrome vintage 1960s Kodak film photograph of a golden retriever sitting in an empty concrete testing facility\"\n\n**Output:** High-quality image with Ektachrome LoRA automatically injected, vintage film aesthetic, correct subject placement.\n\n**Iterations:** 1 (converged immediately because the prompt was detailed)\n\n**Vague prompts** like \"a dog\" will trigger multiple iterations as the system adds detail (lighting, camera angle, environment, etc.) until the output is compelling.\n\n## Features\n\n### Core\n- **Closed-loop generation** — Describe → Generate → Score → Mutate → Repeat\n- **Florence-2 analysis** — Captioning, object detection, OCR for reference images\n- **CLIP scoring** — Calibrated piecewise normalization (65 measurements across 5 images, 13 prompts)\n- **Smart mutation** — Enrichment-based (adds film grain, camera details, setting context) instead of random word swaps\n- **Auto LoRA detection** — Scans ComfyUI for available LoRAs and injects matching ones (e.g., \"ektachrome\" → Ektachrome Style LoRA v1)\n\n### UI\n- **Dark terminal aesthetic** — Minimal, focused interface\n- **Real-time updates** — WebSocket or HTTP polling fallback\n- **Composition library** — Save successful generations to SQLite for reuse\n- **Image upload** — Drop a reference image and let Florence-2 analyze it\n\n## ComfyUI Setup\n\nPrompt Forge requires ComfyUI with Ideogram 4 support.\n\n### Install ComfyUI\n\n```bash\ngit clone https://github.com/comfyanonymous/ComfyUI\ncd ComfyUI\npython -m venv venv\nsource venv/bin/activate\npip install -r requirements.txt\n```\n\n### Download Models\n\nDownload these to `ComfyUI/models/`:\n\n- **Checkpoints:** `checkpoints/ideogram4_fp8_scaled.safetensors`\n- **CLIP:** `clip/qwen3vl_8b_fp8_scaled.safetensors`\n- **VAE:** `vae/flux2-vae.safetensors`\n- **LoRAs:** `loras/ektachrome_style_v1.safetensors` (optional)\n\n### Start ComfyUI\n\n```bash\npython main.py --listen 0.0.0.0 --port 8188\n```\n\n## API\n\n### Analyze Image\n\n```bash\ncurl -X POST http://localhost:7861/api/analyze \\\n  -F \"text=ektachrome vintage photograph\"\n```\n\n### Start Forge Session\n\n```bash\ncurl -X POST http://localhost:7861/api/forge \\\n  -F \"description=ektachrome vintage 1960s photograph of a dog\" \\\n  -F \"max_iterations=3\"\n```\n\nReturns: `{\"session_id\": \"abc123\", \"ws_url\": \"/ws/forge/abc123\"}`\n\n### Poll for Results\n\n```bash\ncurl http://localhost:7861/api/forge/abc123\n```\n\nReturns: Full session history with iterations, scores, diagnoses, and generated images.\n\n### WebSocket (Real-time)\n\n```javascript\nconst ws = new WebSocket('ws://localhost:7861/ws/forge/abc123')\nws.onmessage = (e) =\u003e {\n  const { type, data } = JSON.parse(e.data)\n  if (type === 'iteration') {\n    console.log(`Iteration ${data.number}: score=${data.score.overall}`)\n  }\n}\n```\n\n## Configuration\n\nCreate a `.env` file in the project root:\n\n```env\n# ComfyUI\nCOMFYUI_URL=http://localhost:8188\n\n# Models\nFLORENCE_MODEL=microsoft/Florence-2-base-ft\nDEFAULT_CHECKPOINT=ideogram4_fp8_scaled.safetensors\nDEFAULT_VAE=flux2-vae.safetensors\n\n# Forge\nMAX_ITERATIONS=5\nCONVERGENCE_THRESHOLD=0.55\n\n# Database\nDB_PATH=data/library/compositions.db\n```\n\n## Architecture\n\n```\nserver/\n├── main.py              # FastAPI endpoints\n├── config.py            # Environment config\n├── analyzer/            # Florence-2 scene analysis\n├── compiler/            # ComfyUI workflow compilation\n├── connector/           # ComfyUI API client\n├── scorer/              # CLIP-based scoring\n├── mutator/             # Rule-based prompt enrichment\n├── forge/               # Main convergence loop\n├── store/               # SQLite composition library\n└── lora/                # LoRA auto-detection\n\nfrontend/\n├── index.html           # Single-page app\n├── css/forge.css        # Dark terminal theme\n└── js/                  # Vanilla JS (no build step)\n```\n\n## How It Compares\n\n| Feature | Prompt Forge | image-to-prompt | Stable Diffusion WebUI |\n|---------|--------------|-----------------|------------------------|\n| **Input** | Text description or reference image | Reference image only | Text prompt |\n| **Output** | Converged image | JSON prompt | Single image |\n| **Automation** | Full loop (describe → generate → score → mutate) | One-shot analysis | Manual prompting |\n| **Scoring** | CLIP + Florence-2 | None | None |\n| **LoRA support** | Auto-detection + injection | N/A | Manual selection |\n| **ComfyUI required** | Yes | No | No |\n\n## Development\n\n```bash\n# Run tests\npytest tests/ -v\n\n# Start with auto-reload\nuvicorn server.main:app --host 0.0.0.0 --port 7861 --reload\n\n# Format code\nblack server/ tests/\nisort server/ tests/\n```\n\n## Roadmap\n\n- [ ] DINOv2 per-region scoring (tell which part of the image is wrong)\n- [ ] Multi-LoRA support (combine styles)\n- [ ] Ideogram 4 native JSON prompting (style_description + composition fields)\n- [ ] Browser-based WebSocket (currently using HTTP polling fallback)\n- [ ] Image upload → forge loop (reference-based generation)\n- [ ] Composition library search/retrieval\n- [ ] Pinokio 1-click installer\n- [ ] CPU-only mode (no GPU required)\n\n## License\n\nMIT\n\n## Acknowledgments\n\n- [Florence-2](https://huggingface.co/microsoft/Florence-2-base-ft) — Microsoft's vision-language model\n- [CLIP](https://github.com/mlfoundations/open_clip) — OpenAI's contrastive image-text model\n- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) — Node-based Stable Diffusion GUI\n- [Ideogram 4](https://ideogram.ai/) — AI image generation platform\n- [Ektachrome LoRA](https://huggingface.co/jmanhype/Ektachrome-LoRA-v1-Ideogram-v4) — Vintage film aesthetic\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmanhype%2Fprompt-forge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmanhype%2Fprompt-forge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmanhype%2Fprompt-forge/lists"}