# awesome-t2i-eval
[![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/zhangjiewu/awesome-t2i-eval)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![Made With Love](https://img.shields.io/badge/Made%20With-Love-red.svg)](https://github.com/chetanraj/awesome-github-badges)

This repository contains a collection of resources and papers on ***Text-to-Image evaluation***.

## Table of Contents
- [Papers](#papers)
- [Metrics](#metrics)

## Papers

+ [What You See is What You Read? Improving Text-Image Alignment Evaluation](https://arxiv.org/abs/2305.10400) (Jul., 2023)
[![Star](https://img.shields.io/github/stars/yonatanbitton/wysiwyr.svg?style=social&label=Star)](https://github.com/yonatanbitton/wysiwyr)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.10400)
[![Website](https://img.shields.io/badge/Website-9cf)](https://wysiwyr-itm.github.io/)

+ [Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation](https://arxiv.org/abs/2307.09416) (Jul., 2023)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2307.09416)

+ [Divide, Evaluate, and Refine: Evaluating and Improving Text-to-Image Alignment with Iterative VQA Feedback](https://arxiv.org/abs/2307.04749) (Jul., 2023)
[![Star](https://img.shields.io/github/stars/1jsingh/Divide-Evaluate-and-Refine.svg?style=social&label=Star)](https://github.com/1jsingh/Divide-Evaluate-and-Refine)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2307.04749)
[![Website](https://img.shields.io/badge/Website-9cf)](https://1jsingh.github.io/divide-evaluate-and-refine)

+ [T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation](https://arxiv.org/abs/2307.06350) (Jul., 2023)
[![Star](https://img.shields.io/github/stars/Karine-Huang/T2I-CompBench.svg?style=social&label=Star)](https://github.com/Karine-Huang/T2I-CompBench)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2307.06350)
[![Website](https://img.shields.io/badge/Website-9cf)](https://karine-h.github.io/T2I-CompBench/)

+ [Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis](https://arxiv.org/abs/2306.09341) (Jun., 2023)
[![Star](https://img.shields.io/github/stars/tgxs002/HPSv2.svg?style=social&label=Star)](https://github.com/tgxs002/HPSv2)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2306.09341)

+ [Visual Programming for Text-to-Image Generation and Evaluation](https://arxiv.org/abs/2305.15328) (May, 2023)
[![Star](https://img.shields.io/github/stars/aszala/VPEval.svg?style=social&label=Star)](https://github.com/aszala/VPEval)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.15328)
[![Website](https://img.shields.io/badge/Website-9cf)](https://vp-t2i.github.io/)

+ [LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation](https://arxiv.org/abs/2305.11116) (May, 2023)
[![Star](https://img.shields.io/github/stars/YujieLu10/LLMScore.svg?style=social&label=Star)](https://github.com/YujieLu10/LLMScore)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.11116)

+ [X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models](https://arxiv.org/abs/2305.10843) (May, 2023)
[![Star](https://img.shields.io/github/stars/Schuture/Benchmarking-Awesome-Diffusion-Models.svg?style=social&label=Star)](https://github.com/Schuture/Benchmarking-Awesome-Diffusion-Models)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.10843)

+ [Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation](https://arxiv.org/abs/2305.01569) (May, 2023)
[![Star](https://img.shields.io/github/stars/yuvalkirstain/PickScore.svg?style=social&label=Star)](https://github.com/yuvalkirstain/PickScore)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.01569)

+ [ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation](https://arxiv.org/abs/2304.05977) (Apr., 2023)
[![Star](https://img.shields.io/github/stars/THUDM/ImageReward.svg?style=social&label=Star)](https://github.com/THUDM/ImageReward)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2304.05977)

+ [Better Aligning Text-to-Image Models with Human Preference](https://arxiv.org/abs/2303.14420) (Mar., 2023)
[![Star](https://img.shields.io/github/stars/tgxs002/align_sd.svg?style=social&label=Star)](https://github.com/tgxs002/align_sd)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2303.14420)
[![Website](https://img.shields.io/badge/Website-9cf)](https://tgxs002.github.io/align_sd_web/)

+ [TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering](https://arxiv.org/abs/2303.11897) (Mar., 2023)
[![Star](https://img.shields.io/github/stars/Yushi-Hu/tifa.svg?style=social&label=Star)](https://github.com/Yushi-Hu/tifa)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2303.11897)
[![Website](https://img.shields.io/badge/Website-9cf)](https://tifa-benchmark.github.io/)

+ [DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers](https://arxiv.org/abs/2202.04053) (Feb., 2022)
[![Star](https://img.shields.io/github/stars/j-min/DallEval.svg?style=social&label=Star)](https://github.com/j-min/DallEval)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2202.04053)

## Metrics

+ [IS (Inception Score)](https://arxiv.org/abs/1606.03498)
[![Star](https://img.shields.io/github/stars/openai/improved-gan?style=social&label=Star)](https://github.com/openai/improved-gan)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/1606.03498)
  - **Summary**: IS evaluates the quality and diversity of generated images from the class predictions of a pre-trained Inception network; confident per-image predictions combined with a diverse overall label distribution yield a high score.
  - **Implementation**: [PyTorch](https://github.com/sbarratt/inception-score-pytorch) [![Star](https://img.shields.io/github/stars/sbarratt/inception-score-pytorch?style=social&label=Star)](https://github.com/sbarratt/inception-score-pytorch)
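
A minimal sketch of the score itself, assuming the `(N, 1000)` softmax outputs of a pre-trained Inception-v3 over N generated images have already been collected into `probs` (a hypothetical name; the linked implementations handle the feature extraction):

```python
import numpy as np

def inception_score(probs: np.ndarray, splits: int = 10, eps: float = 1e-16) -> float:
    """IS = exp(E_x[KL(p(y|x) || p(y))]), averaged over splits."""
    scores = []
    for part in np.array_split(probs, splits):
        p_y = part.mean(axis=0, keepdims=True)        # marginal label distribution p(y)
        kl = part * (np.log(part + eps) - np.log(p_y + eps))
        scores.append(np.exp(kl.sum(axis=1).mean()))  # one split's score
    return float(np.mean(scores))
```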

+ [FID (Fréchet Inception Distance)](https://arxiv.org/abs/1706.08500)
[![Star](https://img.shields.io/github/stars/bioinf-jku/TTUR?style=social&label=Star)](https://github.com/bioinf-jku/TTUR)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/1706.08500)
- **Summary**: FID measures the quality of generated images by comparing the distribution of generated images to real images in the feature space of a pre-trained Inception network.
  - **Implementation**: [PyTorch](https://github.com/mseitzer/pytorch-fid) [![Star](https://img.shields.io/github/stars/mseitzer/pytorch-fid?style=social&label=Star)](https://github.com/mseitzer/pytorch-fid)
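
A minimal sketch of the distance computation, assuming `feats_real` and `feats_fake` (hypothetical names) hold `(N, 2048)` Inception-v3 pool3 activations for real and generated images:

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """FID = ||mu_r - mu_f||^2 + Tr(C_r + C_f - 2 * sqrt(C_r @ C_f))."""
    mu_r, cov_r = feats_real.mean(axis=0), np.cov(feats_real, rowvar=False)
    mu_f, cov_f = feats_fake.mean(axis=0), np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)   # matrix square root of the covariance product
    if np.iscomplexobj(covmean):            # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```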

+ [CLIP Score](https://arxiv.org/abs/2103.00020)
[![Star](https://img.shields.io/github/stars/openai/CLIP?style=social&label=Star)](https://github.com/openai/CLIP)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2103.00020)
  - **Summary**: CLIP Score measures text-image consistency as the cosine similarity between the CLIP embeddings of the prompt and the generated image.
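
A minimal sketch using the openai/CLIP package linked above (the image path and prompt are placeholders). Some conventions additionally rescale the raw cosine, e.g. by 2.5 with clipping at 0, but the core quantity is the embedding similarity:

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_score(image_path: str, prompt: str) -> float:
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    text = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        img_emb = model.encode_image(image)
        txt_emb = model.encode_text(text)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)   # L2-normalize
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    return (img_emb * txt_emb).sum().item()                  # cosine similarity

print(clip_score("image.png", "a photo of a cat on a sofa"))
```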

+ [BLIP Score](https://arxiv.org/abs/2201.12086)
[![Star](https://img.shields.io/github/stars/salesforce/LAVIS?style=social&label=Star)](https://github.com/salesforce/LAVIS)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2201.12086)
  - **Summary**: BLIP Score measures text-image consistency like CLIP Score, but uses the BLIP vision-language model, which offers an image-text matching (ITM) head in addition to embedding similarity.
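
A minimal sketch using the LAVIS library linked above, scoring via BLIP's ITM head; the model name and processor keys follow LAVIS's `load_model_and_preprocess` convention, and the image path and caption are placeholders:

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
model, vis_processors, text_processors = load_model_and_preprocess(
    name="blip_image_text_matching", model_type="base", is_eval=True, device=device
)

def blip_itm_score(image_path: str, caption: str) -> float:
    image = vis_processors["eval"](Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    text = text_processors["eval"](caption)
    with torch.no_grad():
        itm_logits = model({"image": image, "text_input": text}, match_head="itm")
    # Column 1 of the 2-way ITM head is the probability that image and text match.
    return torch.softmax(itm_logits, dim=1)[:, 1].item()

print(blip_itm_score("image.png", "a photo of a cat on a sofa"))
```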

---

Feel free to **Fork** this repository and contribute via **Pull Requests**. If you have any questions, please open an **Issue**.