{"id":15039425,"url":"https://github.com/zai-org/ImageReward","last_synced_at":"2025-12-29T23:00:40.725Z","repository":{"id":152865851,"uuid":"622147274","full_name":"THUDM/ImageReward","owner":"THUDM","description":"[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation","archived":false,"fork":false,"pushed_at":"2025-01-24T13:40:22.000Z","size":4381,"stargazers_count":1369,"open_issues_count":53,"forks_count":71,"subscribers_count":14,"default_branch":"main","last_synced_at":"2025-04-10T19:49:18.148Z","etag":null,"topics":["diffusion-models","generative-model","human-preferences","rlhf"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/THUDM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-01T09:04:17.000Z","updated_at":"2025-04-10T13:42:39.000Z","dependencies_parsed_at":"2025-01-09T01:29:59.209Z","dependency_job_id":"488d6d6b-11f4-4526-b775-5629b33d7f6a","html_url":"https://github.com/THUDM/ImageReward","commit_stats":{"total_commits":31,"total_committers":10,"mean_commits":3.1,"dds":0.5806451612903225,"last_synced_commit":"2ca71bac4ed86b922fe53ddaec3109fe94d45fd3"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FImageReward","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FImageReward/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2
FImageReward/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/THUDM%2FImageReward/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/THUDM","download_url":"https://codeload.github.com/THUDM/ImageReward/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251516953,"owners_count":21601912,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion-models","generative-model","human-preferences","rlhf"],"created_at":"2024-09-24T20:42:46.208Z","updated_at":"2025-12-29T23:00:40.094Z","avatar_url":"https://github.com/THUDM.png","language":"Python","readme":"# ImageReward\n\n\u003cp align=\"center\"\u003e\n   📃 \u003ca href=\"https://arxiv.org/abs/2304.05977\" target=\"_blank\"\u003ePaper\u003c/a\u003e • 🖼 \u003ca href=\"https://huggingface.co/datasets/THUDM/ImageRewardDB\" target=\"_blank\"\u003eDataset\u003c/a\u003e • 🌐 \u003ca href=\"https://zhuanlan.zhihu.com/p/639494251\" target=\"_blank\"\u003eChinese Blog\u003c/a\u003e • 🤗 \u003ca href=\"https://huggingface.co/THUDM/ImageReward\" target=\"_blank\"\u003eHF Repo\u003c/a\u003e • 🐦 \u003ca href=\"https://twitter.com/thukeg\" target=\"_blank\"\u003eTwitter\u003c/a\u003e \u003cbr\u003e\n\u003c/p\u003e\n\n🔥🔥 **News!** ```2024/12/31```: We released our **next-generation model, [VisionReward](https://github.com/THUDM/VisionReward)**, a fine-grained, multi-dimensional reward model for stable RLHF in visual generation (text-to-image / text-to-video)!\n\n🔥 **News!** ```2023/9/22```: The paper of ImageReward 
was accepted by NeurIPS 2023!\n\n**ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation**\n\nImageReward is the first general-purpose text-to-image human preference reward model (RM). It is trained on a total of **137k pairs of expert comparisons** and outperforms existing text-image scoring methods, such as CLIP (by 38.6%), Aesthetic (by 39.6%), and BLIP (by 31.6%), in terms of understanding human preference in text-to-image synthesis.\n\nAdditionally, we introduce Reward Feedback Learning (ReFL) for directly optimizing a text-to-image diffusion model using ImageReward. ReFL-tuned Stable Diffusion wins against the untuned version in 58.4% of comparisons in human evaluation.\n\nBoth ImageReward and ReFL are now packaged into the Python `image-reward` package!\n\n[![PyPI](https://img.shields.io/pypi/v/image-reward)](https://pypi.org/project/image-reward/) [![Downloads](https://static.pepy.tech/badge/image-reward)](https://pepy.tech/project/image-reward)\n\nTry the `image-reward` package with only 3 lines of code for ImageReward scoring!\n\n```python\n# pip install image-reward\nimport ImageReward as RM\nmodel = RM.load(\"ImageReward-v1.0\")\n\nrewards = model.score(\"\u003cprompt\u003e\", [\"\u003cimg1_obj_or_path\u003e\", \"\u003cimg2_obj_or_path\u003e\", ...])\n```\n\nTry the `image-reward` package with only 4 lines of code for ReFL fine-tuning!\n\n```python\n# pip install image-reward\n# pip install diffusers==0.16.0 accelerate==0.16.0 datasets==2.11.0\nfrom ImageReward import ReFL\nargs = ReFL.parse_args()\ntrainer = ReFL.Trainer(\"CompVis/stable-diffusion-v1-4\", \"data/refl_data.json\", args=args)\ntrainer.train(args=args)\n```\n\nIf you find `ImageReward`'s open-source effort useful, please 🌟 us to encourage our future development!\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"figures/ImageReward.jpg\" width=\"700px\"\u003e\n\u003c/p\u003e\n\n- [ImageReward](#imagereward)\n  - [Quick Start](#quick-start)\n    - [Install Dependency](#install-dependency)\n    - 
[Example Use](#example-use)\n  - [ReFL](#refl)\n    - [Install Dependency](#install-dependency-1)\n    - [Example Use](#example-use-1)\n  - [Demos of ImageReward and ReFL](#demos-of-imagereward-and-refl)\n  - [Training code for ImageReward](#training-code-for-imagereward)\n  - [Integration into Stable Diffusion Web UI](#integration-into-stable-diffusion-web-ui)\n    - [Features](#features)\n      - [Score generated images and append to image information](#score-generated-images-and-append-to-image-information)\n        - [Usage](#usage)\n        - [Demo video](#demo-video)\n      - [Automatically filter out images with low scores](#automatically-filter-out-images-with-low-scores)\n        - [Usage](#usage-1)\n        - [Demo video](#demo-video-1)\n      - [View the scores of images that have been scored](#view-the-scores-of-images-that-have-been-scored)\n        - [Usage](#usage-2)\n        - [Example](#example)\n      - [Other Features](#other-features)\n        - [Memory Management](#memory-management)\n    - [FAQ](#faq)\n  - [Reproduce Experiments in Table 1](#reproduce-experiments-in-table-1)\n  - [Reproduce Experiments in Table 3](#reproduce-experiments-in-table-3)\n  - [Citation](#citation)\n\n## Quick Start\n\n### Install Dependency\n\nWe have integrated the whole repository into a single Python package, `image-reward`. Follow the commands below to prepare the environment:\n\n```shell\n# Clone the ImageReward repository (containing data for testing)\ngit clone https://github.com/THUDM/ImageReward.git\ncd ImageReward\n\n# Install the integrated package `image-reward`\npip install image-reward\n```\n\n### Example Use\n\nWe provide example images in the [`assets/images`](assets/images) directory of this repo. 
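\n\nThe `inference_rank` call used in this section returns a ranking alongside the raw per-image rewards. Judging from the sample output, the ranking assigns each image its 1-indexed position when the images are sorted by descending reward; the pure-Python sketch below is only our illustration of that relationship (the helper name and the exact rank semantics are assumptions, not part of the package API):\n\n```python\ndef ranking_from_rewards(rewards):\n    # rewards: one scalar reward per image, e.g. [[0.58], [0.27], ...]\n    flat = [r[0] if isinstance(r, list) else r for r in rewards]\n    # order[k] = index of the image with the (k+1)-th highest reward\n    order = sorted(range(len(flat)), key=lambda i: flat[i], reverse=True)\n    ranking = [0] * len(flat)\n    for rank, idx in enumerate(order, start=1):\n        ranking[idx] = rank  # image idx is the rank-th best\n    return ranking\n\nprint(ranking_from_rewards([[0.58], [0.27], [-1.41], [-2.03]]))  # [1, 2, 3, 4]\n```\n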
The example prompt is:\n\n```text\na painting of an ocean with clouds and birds, day time, low depth field effect\n```\n\nUse the following code to get the human preference scores from ImageReward:\n\n```python\nimport os\nimport torch\nimport ImageReward as RM\n\nif __name__ == \"__main__\":\n    prompt = \"a painting of an ocean with clouds and birds, day time, low depth field effect\"\n    img_prefix = \"assets/images\"\n    generations = [f\"{pic_id}.webp\" for pic_id in range(1, 5)]\n    img_list = [os.path.join(img_prefix, img) for img in generations]\n    model = RM.load(\"ImageReward-v1.0\")\n    with torch.no_grad():\n        ranking, rewards = model.inference_rank(prompt, img_list)\n        # Print the result\n        print(\"\\nPreference predictions:\\n\")\n        print(f\"ranking = {ranking}\")\n        print(f\"rewards = {rewards}\")\n        for index in range(len(img_list)):\n            score = model.score(prompt, img_list[index])\n            print(f\"{generations[index]:\u003e16s}: {score:.2f}\")\n\n```\n\nThe output should look like the following (the exact numbers may differ slightly depending on the compute device):\n\n```\nPreference predictions:\n\nranking = [1, 2, 3, 4]\nrewards = [[0.5811622738838196], [0.2745276093482971], [-1.4131819009780884], [-2.029569625854492]]\n          1.webp: 0.58\n          2.webp: 0.27\n          3.webp: -1.41\n          4.webp: -2.03\n```\n\n\n## ReFL\n\n### Install Dependency\n```shell\npip install diffusers==0.16.0 accelerate==0.16.0 datasets==2.11.0\n```\n\n### Example Use\n\nWe provide an example dataset for ReFL at [`data/refl_data.json`](data/refl_data.json) in this repo. Run ReFL as follows:\n\n```shell\nbash scripts/train_refl.sh\n```\n\n## Demos of ImageReward and ReFL\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"figures/Demo.jpg\" width=\"700px\"\u003e\n\u003c/p\u003e\n\n\n## Training code for ImageReward\n\n1. 
Download data: 🖼 \u003ca href=\"https://huggingface.co/datasets/THUDM/ImageRewardDB\" target=\"_blank\"\u003eDataset\u003c/a\u003e.\n\n2. Make dataset.\n```shell\ncd train\npython src/make_dataset.py\n```\n\n3. Set training config: [`train/src/config/config.yaml`](train/src/config/config.yaml)\n\n4. One command to train.\n```shell\nbash scripts/train_one_node.sh\n```\n\n\n## Integration into [Stable Diffusion Web UI](https://github.com/AUTOMATIC1111/stable-diffusion-webui)\n\nWe have developed a **custom script** to integrate ImageReward into SD Web UI for a convenient experience.\n\nThe script is located at [`sdwebui/image_reward.py`](sdwebui/image_reward.py) in this repository.\n\nThe **usage** of the script is described as follows:\n\n1. **Install**: put the custom script into the [`stable-diffusion-webui/scripts/`](https://github.com/AUTOMATIC1111/stable-diffusion-webui/tree/master/scripts) directory\n2. **Reload**: restart the service, or click the **\"Reload custom script\"** button at the bottom of the settings tab of SD Web UI. (If the button can't be found, try clicking the **\"Show all pages\"** button at the bottom of the left sidebar.)\n3. **Select**: go back to the **\"txt2img\"/\"img2img\"** tab, and select **\"ImageReward - generate human preference scores\"** from the \"**Script\"** dropdown menu in the lower left corner.\n4. **Run**: the specific usage varies depending on the functional requirements, as described in the **\"Features\"** section below.\n\n### Features\n\n#### Score generated images and append to image information\n\n##### Usage\n\n1. **Do not** check the \"Filter out images with low scores\" checkbox.\n2. Click the **\"Generate\"** button to generate images.\n3. 
Check the ImageReward score at the **bottom** of the image information **below the gallery**.\n\n##### Demo video\n\nhttps://github.com/THUDM/ImageReward/assets/98524878/9d8a036d-1583-4978-aac7-4b758edf9b89\n\n#### Automatically filter out images with low scores\n\n##### Usage\n\n1. Check the **\"Filter out images with low scores\"** checkbox.\n2. Enter the score lower limit in **\"Lower score limit\"**. (ImageReward roughly follows the standard normal distribution, with a mean of 0 and a variance of 1.)\n3. Click the **\"Generate\"** button to generate images.\n4. Images with scores below the lower limit will be automatically filtered out and **will not appear in the gallery**.\n5. Check the ImageReward score at the **bottom** of the image information **below the gallery**.\n\n##### Demo video\n\nhttps://github.com/THUDM/ImageReward/assets/98524878/b9f01629-87d6-4c92-9990-fe065711b9c6\n\n#### View the scores of images that have been scored\n\n##### Usage\n\n1. Upload the scored image file in the **\"PNG Info\"** tab.\n2. Check the image information on the right; the score of the image is at the **bottom**.\n\n##### Example\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://user-images.githubusercontent.com/98524878/233829640-12190bff-f62b-4160-b05d-29624fa83677.jpg\" width=\"700px\"\u003e\n\u003c/p\u003e\n\n#### Other Features\n\n##### Memory Management\n\n- The ImageReward model will not be loaded until the **first script run**.\n- **\"Reload UI\"** neither reloads nor unloads the model; it **reuses** the currently loaded model (if one exists).\n- An **\"Unload Model\"** button is provided to manually unload the currently loaded model.\n\n### FAQ\n\n#### How to adjust the Python environment used by the SD Web UI (e.g. 
reinstall a package)?\n\nNote that **SD Web UI has two ways to set up its Python environment**:\n\n- If you **launch with `python launch.py`**, Web UI will use the Python environment **found in your `PATH` (in Linux, you can check its exact path with `which python`)**.\n- If you **launch with a script like `webui-user.bat`**, Web UI creates a new **venv environment** in the directory `stable-diffusion-webui\\venv`.\n    - Generally, you need a few extra steps to activate this environment. For example, in Windows, enter the `stable-diffusion-webui\\venv\\Scripts` directory and run `activate` or `activate.bat` (if you are using **cmd**) or `activate.ps1` (if you are using **PowerShell**) from there.\n    - If you see **the prompt `(venv)` appear at the far left of the command line**, you have successfully activated the venv created by the SD Web UI.\n\nAfter activating the right Python environment, proceed with whatever you need to do (e.g. reinstalling a package) as usual.\n\n## Reproduce Experiments in Table 1\n\n\u003cp align=\"center\"\u003e\n    \u003cimg alt=\"Table_1_in_paper\" src=\"figures/Table_1_in_paper.png\" width=\"700px\"\u003e\n\u003c/p\u003e\n\n**Note:** The experimental results are produced in an environment that satisfies:\n\n- (NVIDIA) Driver Version: 515.86.01\n- CUDA Version: 11.7\n- `torch` Version: 1.12.1+cu113\n\nAccording to our own reproduction experience, reproducing this experiment in other environments may cause the last decimal place to fluctuate, typically within a range of ±0.1.\n\nRun the following script to automatically download the data and baseline models and run the experiments:\n\n```bash\nbash ./scripts/test-benchmark.sh\n```\n\nThen you can check the results in **`benchmark/results/` or the terminal**.\n\nIf you want to check the raw data files individually:\n\n- Test prompts and corresponding human rankings for images are located in [`benchmark/benchmark-prompts.json`](benchmark/benchmark-prompts.json).\n- Generated outputs for each prompt (originally from 
[DiffusionDB](https://github.com/poloclub/diffusiondb)) can be downloaded from [Hugging Face](https://huggingface.co/THUDM/ImageReward/tree/main/generations) or [Tsinghua Cloud](https://cloud.tsinghua.edu.cn/d/8048c335cb464220b663/).\n    - Each `\u003cmodel_name\u003e.zip` contains a directory of the same name, holding 1,000 images in total: 10 images generated for each of 100 prompts.\n    - Each `\u003cmodel_name\u003e.zip` should be decompressed into `benchmark/generations/` as a directory `\u003cmodel_name\u003e` containing the images.\n\n## Reproduce Experiments in Table 3\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"figures/Table_3_in_paper.png\" width=\"700px\"\u003e\n\u003c/p\u003e\n\nRun the following script to automatically download the data and baseline models and run the experiments:\n\n```bash\nbash ./scripts/test.sh\n```\n\nIf you want to check the raw data files individually:\n\n* Test prompts and corresponding human rankings for images are located in [`data/test.json`](data/test.json).\n* Generated outputs for each prompt (originally from [DiffusionDB](https://github.com/poloclub/diffusiondb)) can be downloaded from [Hugging Face](https://huggingface.co/THUDM/ImageReward/blob/main/test_images.zip) or [Tsinghua Cloud](https://cloud.tsinghua.edu.cn/f/9bd245027652422499f4/?dl=1). 
It should be decompressed to `data/test_images`.\n\n## Citation\n\n```\n@inproceedings{xu2023imagereward,\n  title={ImageReward: learning and evaluating human preferences for text-to-image generation},\n  author={Xu, Jiazheng and Liu, Xiao and Wu, Yuchen and Tong, Yuxuan and Li, Qinkai and Ding, Ming and Tang, Jie and Dong, Yuxiao},\n  booktitle={Proceedings of the 37th International Conference on Neural Information Processing Systems},\n  pages={15903--15935},\n  year={2023}\n}\n```\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzai-org%2FImageReward","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzai-org%2FImageReward","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzai-org%2FImageReward/lists"}