Unified Multi-modal IAA Baseline and Benchmark
https://github.com/kwaivgi/uniaa
- Host: GitHub
- URL: https://github.com/kwaivgi/uniaa
- Owner: KwaiVGI
- Created: 2024-03-15T08:50:53.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-27T10:29:06.000Z (about 1 year ago)
- Last Synced: 2025-03-17T06:06:07.713Z (7 months ago)
- Topics: benchmark, dataset, image-aesthetic-assessment, llava, mllm
- Language: Python
- Homepage:
- Size: 9.12 MB
- Stars: 74
- Watchers: 5
- Forks: 5
- Open Issues: 4
Metadata Files:
- Readme: README.md
README
# UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark
The Unified Multi-modal Image Aesthetic Assessment Framework, containing a baseline (a) and a benchmark (b). The aesthetic perception performance of UNIAA-LLaVA and other MLLMs is shown in (c).
The IAA Datasets Conversion Paradigm for UNIAA-LLaVA.
The UNIAA-Bench overview. (a) UNIAA-QA contains 5354 Image-Question-Answer samples and (b) UNIAA-Describe contains 501 Image-Description samples. (c) For open-source MLLMs, Logits can be extracted to calculate the score.
## Release
- [9/25] 🔥 Our [UNIAA](https://huggingface.co/datasets/zkzhou/UNIAA) data is released! The corresponding fine-tuning and evaluation code can be found in this GitHub repository.
- [4/15] 🔥 We built the UNIAA project page!
## Performance
### Aesthetic Perception Performance
### Aesthetic Description Performance
### Aesthetic Assessment Performance
#### Zero-shot
#### Supervised learning on AVA and TAD66K
## Training on the UNIAA Data
#### Step 1: Download Images and JSON files
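The data is hosted on the Hugging Face Hub (see the Release note above). A minimal download sketch using `huggingface_hub`, assuming the whole `zkzhou/UNIAA` dataset repository is fetched into a local `UNIAA/` folder; the exact file layout inside the repository may differ:

```python
from huggingface_hub import snapshot_download

# Fetch the whole UNIAA dataset repository (images + JSON annotation files)
# into a local "UNIAA/" folder. The repo id comes from the Release note above;
# the local directory name is an arbitrary choice.
local_path = snapshot_download(
    repo_id="zkzhou/UNIAA",
    repo_type="dataset",
    local_dir="UNIAA",
)
print("UNIAA data downloaded to:", local_path)
```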
#### Step 2: Training on a Specific MLLM
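UNIAA-LLaVA is fine-tuned from LLaVA, so the training JSON is expected to follow LLaVA's conversation format. The sketch below only sanity-checks one sample against that format; the file name `uniaa_train.json` and the field names are assumptions, so adapt them to the files downloaded in Step 1:

```python
import json

# Hypothetical file name; point this at the training JSON obtained in Step 1.
with open("UNIAA/uniaa_train.json") as f:
    samples = json.load(f)

# A LLaVA-style instruction sample: an image reference plus human/gpt turns.
sample = samples[0]
print("image:", sample.get("image"))
for turn in sample.get("conversations", []):
    print(turn["from"], ":", turn["value"][:80])
```

The actual fine-tuning is launched with the training scripts of the chosen MLLM (for UNIAA-LLaVA, the LLaVA codebase), pointing its data path at this JSON and its image folder at the images from Step 1.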
## Test on UNIAA-Bench
### For Aesthetic Perception
#### Step 1: Download Images and JSON files
#### Step 2: Run the inference code
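The repository ships its own inference code; purely as an illustration, the sketch below runs a stock `llava-hf/llava-1.5-7b-hf` checkpoint from `transformers` on one UNIAA-QA style question. The JSON file name and field names are assumptions:

```python
import json
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # any open-source MLLM under evaluation
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Hypothetical file/field names; adapt to the UNIAA-QA JSON from Step 1.
with open("UNIAA/uniaa_qa.json") as f:
    item = json.load(f)[0]

prompt = f"USER: <image>\n{item['question']}\nASSISTANT:"
image = Image.open(item["image"])  # may need to be joined with the image folder
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device, torch.float16)

output = model.generate(**inputs, max_new_tokens=16)
answer = processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```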
#### Step 3: Calculate the score
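According to the benchmark overview, the repository extracts logits from open-source MLLMs to score the answers; as a simplified stand-in, the sketch below just compares generated answer letters with the ground-truth choices and reports accuracy. The prediction file and its field names are assumptions:

```python
import json

# Hypothetical output of Step 2: one record per question holding the model's
# generated answer and the ground-truth choice letter.
with open("uniaa_qa_predictions.json") as f:
    records = json.load(f)

correct = sum(
    r["prediction"].strip().upper().startswith(r["answer"].strip().upper())
    for r in records
)
print(f"Perception accuracy: {correct / len(records):.4f} over {len(records)} questions")
```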
### For Aesthetic Description
#### Step 1: Download Images and JSON files
#### Step 2: Run the inference code
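For the description track the model generates free-form text instead of a choice letter. A minimal sketch under the same assumptions as the perception example (stock `llava-hf/llava-1.5-7b-hf` checkpoint, hypothetical file and field names):

```python
import json
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # or your own fine-tuned checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Hypothetical file/field names; adapt to the UNIAA-Describe JSON from Step 1.
with open("UNIAA/uniaa_describe.json") as f:
    items = json.load(f)

results = []
for item in items:
    prompt = "USER: <image>\nDescribe the aesthetics of this image in detail.\nASSISTANT:"
    image = Image.open(item["image"])
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device, torch.float16)
    output = model.generate(**inputs, max_new_tokens=256)
    text = processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    results.append({"image": item["image"], "description": text})

with open("uniaa_describe_predictions.json", "w") as f:
    json.dump(results, f, indent=2)
```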
## Citation
If you find UNIAA useful for your research and applications, please cite using this BibTeX:
```bibtex
@misc{zhou2024uniaa,
title={UNIAA: A Unified Multi-modal Image Aesthetic Assessment Baseline and Benchmark},
author={Zhaokun Zhou and Qiulin Wang and Bin Lin and Yiwei Su and Rui Chen and Xin Tao and Amin Zheng and Li Yuan and Pengfei Wan and Di Zhang},
year={2024},
eprint={2404.09619},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
## Contact
If you have any questions, please feel free to email wangqiulin@kuaishou.com and zhouzhaokun@stu.pku.edu.cn.