https://github.com/zhangjiewu/awesome-t2i-eval
A curated list of papers and resources for text-to-image evaluation.
- Host: GitHub
- URL: https://github.com/zhangjiewu/awesome-t2i-eval
- Owner: zhangjiewu
- License: MIT
- Created: 2023-09-01T07:45:10.000Z
- Default Branch: main
- Last Pushed: 2023-09-06T08:05:17.000Z
- Last Synced: 2024-05-20T23:24:06.379Z
- Size: 12.7 KB
- Stars: 26
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# awesome-t2i-eval
[Repo](https://github.com/zhangjiewu/awesome-t2i-eval)
[License: MIT](https://opensource.org/licenses/MIT)
[Awesome Badges](https://github.com/chetanraj/awesome-github-badges)

This repository contains a collection of resources and papers on ***Text-to-Image evaluation***.
## Table of Contents
- [Papers](#papers)
- [Metrics](#metrics)

## Papers
+ [What You See is What You Read? Improving Text-Image Alignment Evaluation](https://arxiv.org/abs/2305.10400) (Jul., 2023)
[Code](https://github.com/yonatanbitton/wysiwyr)
[Paper](https://arxiv.org/abs/2305.10400)
[Project Page](https://wysiwyr-itm.github.io/)

+ [Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation](https://arxiv.org/abs/2307.09416) (Jul., 2023)
[Paper](https://arxiv.org/abs/2307.09416)

+ [Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback](https://arxiv.org/abs/2307.04749) (Jul., 2023)
[Code](https://github.com/1jsingh/Divide-Evaluate-and-Refine)
[Paper](https://arxiv.org/abs/2307.04749)
[Project Page](https://1jsingh.github.io/divide-evaluate-and-refine)

+ [T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation](https://arxiv.org/abs/2307.06350) (Jul., 2023)
[Code](https://github.com/Karine-Huang/T2I-CompBench)
[Paper](https://arxiv.org/abs/2307.06350)
[Project Page](https://karine-h.github.io/T2I-CompBench/)

+ [Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis](https://arxiv.org/abs/2306.09341) (Jun., 2023)
[Code](https://github.com/tgxs002/HPSv2)
[Paper](https://arxiv.org/abs/2306.09341)

+ [Visual Programming for Text-to-Image Generation and Evaluation](https://arxiv.org/abs/2305.15328) (May, 2023)
[Code](https://github.com/aszala/VPEval)
[Paper](https://arxiv.org/abs/2305.15328)
[Project Page](https://vp-t2i.github.io/)

+ [LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation](https://arxiv.org/abs/2305.11116) (May, 2023)
[Code](https://github.com/YujieLu10/LLMScore)
[Paper](https://arxiv.org/abs/2305.11116)

+ [X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models](https://arxiv.org/abs/2305.10843) (May, 2023)
[Code](https://github.com/Schuture/Benchmarking-Awesome-Diffusion-Models)
[Paper](https://arxiv.org/abs/2305.10843)

+ [Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation](https://arxiv.org/abs/2305.01569) (May, 2023)
[Code](https://github.com/yuvalkirstain/PickScore)
[Paper](https://arxiv.org/abs/2305.01569)

+ [ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation](https://arxiv.org/abs/2304.05977) (Apr., 2023)
[Code](https://github.com/THUDM/ImageReward)
[Paper](https://arxiv.org/abs/2304.05977)

+ [Better Aligning Text-to-Image Models with Human Preference](https://arxiv.org/abs/2303.14420) (Mar., 2023)
[Code](https://github.com/tgxs002/align_sd)
[Paper](https://arxiv.org/abs/2303.14420)
[Project Page](https://tgxs002.github.io/align_sd_web/)

+ [TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering](https://arxiv.org/abs/2303.11897) (Mar., 2023)
[Code](https://github.com/Yushi-Hu/tifa)
[Paper](https://arxiv.org/abs/2303.11897)
[Project Page](https://tifa-benchmark.github.io/)

+ [DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers](https://arxiv.org/abs/2202.04053) (Feb., 2022)
[Code](https://github.com/j-min/DallEval)
[Paper](https://arxiv.org/abs/2202.04053)

## Metrics
+ [IS (Inception Score)](https://arxiv.org/abs/1606.03498)
[Code](https://github.com/openai/improved-gan)
[Paper](https://arxiv.org/abs/1606.03498)
- **Summary**: IS evaluates the quality and diversity of generated images using the class predictions of a pre-trained Inception network.
- **Implementation**: [PyTorch](https://github.com/sbarratt/inception-score-pytorch)

+ [FID (Fréchet Inception Distance)](https://arxiv.org/abs/1706.08500)
[Code](https://github.com/bioinf-jku/TTUR)
[Paper](https://arxiv.org/abs/1706.08500)
- **Summary**: FID measures the quality of generated images by comparing the distribution of generated images to real images in the feature space of a pre-trained Inception network.
- **Implementation**: [PyTorch](https://github.com/mseitzer/pytorch-fid)

+ [CLIP Score](https://arxiv.org/abs/2103.00020)
[Code](https://github.com/openai/CLIP)
[Paper](https://arxiv.org/abs/2103.00020)
- **Summary**: CLIP Score measures the consistency between a text prompt and the generated image via the similarity of their CLIP embeddings.

+ [BLIP Score](https://arxiv.org/abs/2201.12086)
[Code](https://github.com/salesforce/LAVIS)
[Paper](https://arxiv.org/abs/2201.12086)
- **Summary**: BLIP Score measures text-image consistency in the same spirit as CLIP Score, but uses the BLIP vision-language model instead of CLIP.

---
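The metrics above reduce to short formulas. The sketch below is a minimal, pure-Python illustration of those formulas, not a substitute for the reference implementations linked above: Inception Score from a set of predicted class distributions, FID in the simplified one-dimensional Gaussian case (the real metric uses multivariate Inception-feature statistics), and a CLIP/BLIP-style score as plain cosine similarity between embeddings.

```python
import math


def inception_score(pred_probs):
    """IS = exp(mean_i KL(p(y|x_i) || p(y))), where p(y) is the mean prediction.

    pred_probs: list of per-image class-probability lists (each sums to 1).
    Higher is better; the upper bound is the number of classes.
    """
    n = len(pred_probs)
    k = len(pred_probs[0])
    # Marginal distribution p(y): average of the per-image conditionals.
    marginal = [sum(p[c] for p in pred_probs) / n for c in range(k)]
    # Mean KL divergence between each conditional and the marginal.
    mean_kl = sum(
        sum(p[c] * math.log(p[c] / marginal[c]) for c in range(k) if p[c] > 0)
        for p in pred_probs
    ) / n
    return math.exp(mean_kl)


def fid_1d(mu1, var1, mu2, var2):
    """FID between two 1-D Gaussians. The full metric is the multivariate form
    ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)) on Inception features.
    Lower is better; identical distributions give 0."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2.0 * math.sqrt(var1 * var2)


def clip_style_score(img_emb, txt_emb):
    """Cosine similarity between an image embedding and a text embedding,
    the core of CLIP-/BLIP-style text-image consistency scores."""
    dot = sum(a * b for a, b in zip(img_emb, txt_emb))
    norm_i = math.sqrt(sum(a * a for a in img_emb))
    norm_t = math.sqrt(sum(b * b for b in txt_emb))
    return dot / (norm_i * norm_t)


# Confident, diverse predictions give a high IS (here: 2 classes -> IS ~ 2).
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
# Identical feature distributions give FID = 0.
print(fid_1d(0.0, 1.0, 0.0, 1.0))
# Aligned embeddings give a score of 1.
print(clip_style_score([1.0, 0.0], [1.0, 0.0]))
```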
Feel free to **Fork** this repository and contribute via **Pull Requests**. If you have any questions, please open an **Issue**.