{"id":50412635,"url":"https://github.com/TIGER-AI-Lab/ImagenWorld","last_synced_at":"2026-06-16T20:00:36.118Z","repository":{"id":317647606,"uuid":"1065921673","full_name":"TIGER-AI-Lab/ImagenWorld","owner":"TIGER-AI-Lab","description":"Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks [ICLR 2026]","archived":false,"fork":false,"pushed_at":"2026-04-02T03:00:29.000Z","size":18955,"stargazers_count":31,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-04-02T15:36:05.968Z","etag":null,"topics":["genai","generation","image"],"latest_commit_sha":null,"homepage":"https://tiger-ai-lab.github.io/ImagenWorld/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TIGER-AI-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-28T17:40:49.000Z","updated_at":"2026-04-02T02:28:50.000Z","dependencies_parsed_at":"2025-10-02T07:17:47.546Z","dependency_job_id":"2031353f-cf24-4fab-9901-d0ae45acba8d","html_url":"https://github.com/TIGER-AI-Lab/ImagenWorld","commit_stats":null,"previous_names":["tiger-ai-lab/imagenworld"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TIGER-AI-Lab/ImagenWorld","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FImagenWorld","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FImagenWorld/ta
gs","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FImagenWorld/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FImagenWorld/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TIGER-AI-Lab","download_url":"https://codeload.github.com/TIGER-AI-Lab/ImagenWorld/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TIGER-AI-Lab%2FImagenWorld/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34421326,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["genai","generation","image"],"created_at":"2026-05-31T04:30:35.937Z","updated_at":"2026-06-16T20:00:36.112Z","avatar_url":"https://github.com/TIGER-AI-Lab.png","language":"Python","funding_links":[],"categories":["Code Availability Summary"],"sub_categories":["With Official Code ✅"],"readme":"# 🖼️ ImagenWorld \n[![arXiv](https://img.shields.io/badge/arXiv-2310.01596-b31b1b.svg)](https://arxiv.org/abs/2603.27862)\n\nImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks\n\n[ICLR Paper](https://openreview.net/forum?id=bld9g6jFh9)\n\n\n\u003cp 
align=\"center\"\u003e\n\u003cimg src=\"https://github.com/TIGER-AI-Lab/ImagenWorld/blob/gh-pages/static/images/psudo_banner.png\" width=\"40%\"\u003e\n\u003c/p\u003e\n\n\n**ImagenWorld** is a large-scale, human-centric benchmark designed to stress-test image generation models in real-world scenarios.  \n- **Broad coverage across 6 domains:** Artworks, Photorealistic Images, Information Graphics, Textual Graphics, Computer Graphics, and Screenshots.\n- **Rich supervision:** ~3.6K condition sets and ~20K fine-grained human annotations enable comprehensive, reproducible evaluation.\n- **Explainable evaluation pipeline:** We decompose generated outputs via object/segment extraction to identify entities (objects, fine-grained regions), supporting both scalar ratings and object-/segment-level failure tags.\n- **Diverse model suite:** We evaluate **14 models** in total — **4 unified** (GPT-Image-1, Gemini 2.0 Flash, BAGEL, OmniGen2) and **10 task-specific** baselines (SDXL, Flux.1-Krea-dev, Flux.1-Kontext-dev, Qwen-Image, Infinity, Janus Pro, UNO, Step1X-Edit, IC-Edit, InstructPix2Pix).\n\n\u003cdiv align=\"center\"\u003e\n \u003ca href = \"https://tiger-ai-lab.github.io/ImagenWorld/\"\u003e[🌐 Project Page]\u003c/a\u003e \u003ca href = \"https://openreview.net/forum?id=bld9g6jFh9\"\u003e[📄 Paper]\u003c/a\u003e \u003ca href = \"https://huggingface.co/datasets/TIGER-Lab/ImagenWorld\"\u003e[💾 Datasets]\u003c/a\u003e \u003ca href = \"https://huggingface.co/spaces/TIGER-Lab/ImagenWorld-Visualizer\"\u003e[🏛️ ImagenWorld-Visualizer]\u003c/a\u003e\n\u003c/div\u003e\n\n## 📰 News\n* 2026 Jan 25: Accepted to ICLR 2026!\n* 2025 Oct 16: ComfyUI Blog on [https://blog.comfy.org/p/introducing-imagenworld](https://blog.comfy.org/p/introducing-imagenworld)\n* 2025 Oct 13: Preprint released on GitHub.\n\n## 📖 Introduction\n\nThis repository contains the code for the paper [ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World 
Tasks]().\nIn this paper, we introduce **ImagenWorld**, a large-scale, human-centric benchmark designed to stress-test image generation models in real-world scenarios. Unlike prior evaluations that focus on isolated tasks or narrow domains, ImagenWorld is organized into six domains: Artworks, Photorealistic Images, Information Graphics, Textual Graphics, Computer Graphics, and Screenshots, and six tasks: Text-to-Image Generation (TIG), Single-Reference Image Generation (SRIG), Multi-Reference Image Generation (MRIG), Text-to-Image Editing (TIE), Single-Reference Image Editing (SRIE), and Multi-Reference Image Editing (MRIE). The benchmark includes 3.6K condition sets and 20K fine-grained human annotations, providing a comprehensive testbed for generative models. To support explainable evaluation, ImagenWorld applies object- and segment-level extraction to generated outputs, identifying entities such as objects and fine-grained regions. This structured decomposition enables human annotators to provide not only scalar ratings but also detailed tags of object-level and segment-level failures.\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/TIGER-AI-Lab/ImagenWorld/blob/gh-pages/static/images/overview.PNG\" alt=\"Teaser\" width=\"70%\"/\u003e\n\u003c/p\u003e\n\n## 🚀 Quick Start — Inference\n\n**Tasks:** `TIG` (Text→Image Generation), `TIE` (Text→Image Editing), `SRIG`, `SRIE`, `MRIG`, `MRIE`  \n**Datasets:** assumes `ImagenWorld/\u003cTASK\u003e/...` layout (adjust `--task_path` as needed)\n\n---\n\n### Open-Source Models\n\n**Directory:** `inference/open-source/`  \n**Entrypoint:** `main.py`  \n**Model registry:** `inference/open-source/config.py`  \n**Batch helper:** `open_models.sh`  \n\nAll open-source and closed-source runners follow a unified CLI:\n```bash\npython main.py   --task \u003cTASK\u003e   --model \u003cMODEL\u003e   --task_path \u003cDATASET_PATH\u003e   --limit \u003cN\u003e --verbose\n```\n\n#### 🔹 Example: TIG (Text→Image 
Generation) with UNO\n```bash\ncd inference/open-source\n\npython main.py   --task TIG   --model UNO   --task_path /path/to/ImagenWorld/TIG   --limit 5 --verbose\n```\n\n**Explanation**\n- Loads the **UNO** open-source generator from the registry (`config.py`)  \n- Runs the **TIG** (Text→Image Generation) task using samples from `/path/to/ImagenWorld/TIG`  \n- Saves results to `model_outputs/model_name.png` \n- Prints per-sample logs if `--verbose` is enabled  \n\n---\n\n### Closed-Source Models\n\n**Directory:** `inference/closed-source/`  \n**Entrypoint:** `main.py`  \n**Model registry:** `inference/closed-source/config.py`  \n**Batch helper:** `closed_models.sh`  \n\nAvailable closed-source APIs and outputs:\n- `GPT-Image-1` → saves `gpt-image-1.png`  \n- `Gemini2Flash` → saves `gemini.png`\n\n#### 🔧 Setup Environment\nSet your API keys before running:\n```bash\nexport OPENAI_API_KEY=\"sk-...\"     # for GPT-Image-1\nexport GEMINI_API_KEY=\"...\"        # for Gemini 2.5 Flash Image Preview\n```\n\n#### 🔹 Example: TIE (Text→Image Editing) with Gemini 2.5 Flash\n```bash\ncd inference/closed-source\n\npython main.py   --task TIE   --model Gemini2Flash   --task_path /path/to/ImagenWorld/TIE   --limit 5 --verbose\n```\n\n**Explanation**\n- Loads the selected **closed-source API model** (via OpenAI or Gemini)  \n- Runs the specified task on samples from `/path/to/ImagenWorld/\u003cTASK\u003e`  \n- Stores generated images (e.g., `gpt-image-1.png`, `gemini.png`)  \n\n---\n\n### Batch Execution (Optional)\n\nEach inference type includes a shell helper for multi-task runs:\n\n```bash\n# open-source batch\ncd inference/open-source\nbash open_models.sh\n\n# closed-source batch\ncd inference/closed-source\nbash closed_models.sh\n```\n\nIn both scripts:\n- Set `BASE_PATH` → dataset root (e.g., `/path/to/ImagenWorld`)  \n- Define `TASK_MODELS` to map each task to a model  \n- Set API keys for closed-source models \n\n## Citation\n\nIf you find our work useful for your 
research, please consider citing our paper:\n\n```bibtex\n@inproceedings{\nsani2026imagenworld,\ntitle={ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks},\nauthor={Samin Mahdizadeh Sani and Max Ku and Nima Jamali and Matina Mahdizadeh Sani and Paria Khoshtab and Wei-Chieh Sun and Parnian Fazel and Zhi Rui Tam and Thomas Chong and Edisy Kin Wai Chan and Donald Wai Tong Tsang and Chiao-Wei Hsu and Lam Ting Wai and Ho Yin Sam Ng and Chiafeng Chu and Chak-Wing Mak and Keming Wu and Hiu Tung Wong and Yik Chun Ho and Chi Ruan and Zhuofeng Li and I-Sheng Fang and Shih-Ying Yeh and Ho Kei Cheng and Ping Nie and Wenhu Chen},\nbooktitle={The Fourteenth International Conference on Learning Representations},\nyear={2026},\nurl={https://openreview.net/forum?id=bld9g6jFh9}\n}\n\n@misc{sani2026imagenworldstresstestingimagegeneration,\n      title={ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks}, \n      author={Samin Mahdizadeh Sani and Max Ku and Nima Jamali and Matina Mahdizadeh Sani and Paria Khoshtab and Wei-Chieh Sun and Parnian Fazel and Zhi Rui Tam and Thomas Chong and Edisy Kin Wai Chan and Donald Wai Tong Tsang and Chiao-Wei Hsu and Ting Wai Lam and Ho Yin Sam Ng and Chiafeng Chu and Chak-Wing Mak and Keming Wu and Hiu Tung Wong and Yik Chun Ho and Chi Ruan and Zhuofeng Li and I-Sheng Fang and Shih-Ying Yeh and Ho Kei Cheng and Ping Nie and Wenhu Chen},\n      year={2026},\n      eprint={2603.27862},\n      archivePrefix={arXiv},\n      primaryClass={cs.GR},\n      url={https://arxiv.org/abs/2603.27862}, \n}\n```\n\n\u003c!--\n```bibtex\n@misc{imagenworld2025,\n  title        = {ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks},\n  author       = {Samin Mahdizadeh Sani and Max Ku and Nima Jamali and Matina Mahdizadeh Sani and Paria Khoshtab and Wei-Chieh Sun and 
Parnian Fazel and Zhi Rui Tam and Thomas Chong and Edisy Kin Wai Chan and Donald Wai Tong Tsang and Chiao-Wei Hsu and Ting Wai Lam and Ho Yin Sam Ng and Chiafeng Chu and Chak-Wing Mak and Keming Wu and Hiu Tung Wong and Yik Chun Ho and Chi Ruan and Zhuofeng Li and I-Sheng Fang and Shih-Ying Yeh and Ho Kei Cheng and Ping Nie and Wenhu Chen},\n  year         = {2025},\n  doi          = {10.5281/zenodo.17344183},\n  url          = {https://zenodo.org/records/17344183},\n  projectpage  = {https://tiger-ai-lab.github.io/ImagenWorld/},\n  blogpost     = {https://blog.comfy.org/p/introducing-imagenworld},\n  note         = {Community-driven dataset and benchmark release, Temporarily archived on Zenodo while arXiv submission is under moderation review.},\n}\n```\n\n```bibtex\n @article{imagenworld2026,\ntitle={ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks},\nurl={http://dx.doi.org/10.36227/techrxiv.176800878.82723313/v1},\nDOI={10.36227/techrxiv.176800878.82723313/v1},\npublisher={Institute of Electrical and Electronics Engineers (IEEE)},\nauthor={Sani, Samin Mahdizadeh and Ku, Max and Jamali, Nima and Sani, Matina Mahdizadeh and Khoshtab, Paria and Sun, Wei-Chieh and Fazel, Parnian and Tam, Zhi Rui and Chong, Thomas and Chan, Edisy Kin Wai and Tsang, Donald Wai Tong and Hsu, Chiao-Wei and Lam, Ting Wai and Ng, Ho Yin Sam and Chu, Chiafeng and Mak, Chak-Wing and Wu, Keming and Wong, Hiu Tung and Ho, Yik Chun and Ruan, Chi and Li, Zhuofeng and Fang, I-Sheng and Yeh, Shih-Ying and Cheng, Ho Kei and Nie, Ping and Chen, Wenhu},\nyear={2026},\nmonth=jan }\n```\n--\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTIGER-AI-Lab%2FImagenWorld","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FTIGER-AI-Lab%2FImagenWorld","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTIGER-AI-Lab%2FImagenWorld/lists"}