https://github.com/chencn2020/Seagull
Official implementation for "SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning"
- Host: GitHub
- URL: https://github.com/chencn2020/Seagull
- Owner: chencn2020
- Created: 2024-11-12T03:37:01.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-28T15:21:44.000Z (about 1 month ago)
- Last Synced: 2024-11-28T15:36:54.867Z (about 1 month ago)
- Size: 46.3 MB
- Stars: 27
- Watchers: 2
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- Awesome-Segment-Anything
README
:rocket: :rocket: :rocket: **News:**
- To be updated...
- **Nov. 29, 2024**: We release the [online](#online-demo) and [offline](#offline-demo) demo for SEAGULL.
- **Nov. 25, 2024**: We make SEAGULL-100w publicly available at [Hugging Face](https://huggingface.co/datasets/Zevin2023/SEAGULL-100w) and [Baidu Netdisk](https://pan.baidu.com/s/1PY_EqdwY1FsCVfNpEXrlHA?pwd=i7h1). More details can be found at [Hugging Face](https://huggingface.co/datasets/Zevin2023/SEAGULL-100w).
- **Nov. 12, 2024**: We create this repository.

## TODO List
- [x] Release the SEAGULL-100w dataset.
- [x] Release the online and offline demo.
- [ ] Release the checkpoints and inference codes.
- [ ] Release the training codes.
- [ ] Release the SEAGULL-3k dataset.

## Contents
1. [Introduction](#Introduction)
2. [Try Our Demo](#Try-Our-Demo)
3. [Demonstrate](#Demonstrate)
4. [Acknowledgement](#Acknowledgement)

## Introduction
TL;DR: We propose a novel network (SEAGULL) and construct two datasets (SEAGULL-100w and SEAGULL-3k) to achieve fine-grained IQA for any ROIs.
> Existing Image Quality Assessment (IQA) methods achieve remarkable success in analyzing quality for overall image, but few works explore quality analysis for Regions of Interest (ROIs). The quality analysis of ROIs can provide fine-grained guidance for image quality improvement and is crucial for scenarios focusing on region-level quality. This paper proposes a novel network, SEAGULL, which can SEe and Assess ROIs quality with GUidance from a Large vision-Language model. SEAGULL incorporates a vision-language model (VLM), masks generated by Segment Anything Model (SAM) to specify ROIs, and a meticulously designed Mask-based Feature Extractor (MFE) to extract global and local tokens for specified ROIs, enabling accurate fine-grained IQA for ROIs. Moreover, this paper constructs two ROI-based IQA datasets, SEAGULL-100w and SEAGULL-3k, for training and evaluating ROI-based IQA. SEAGULL-100w comprises about 100w synthetic distortion images with 33 million ROIs for pre-training to improve the model's ability of regional quality perception, and SEAGULL-3k contains about 3k authentic distortion ROIs to enhance the model's ability to perceive real world distortions. After pre-training on SEAGULL-100w and fine-tuning on SEAGULL-3k, SEAGULL shows remarkable performance on fine-grained ROI quality assessment.
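
To make the idea of mask-specified ROIs concrete, here is a minimal sketch of masked average pooling, a generic way to reduce a feature map to one vector for the region a binary mask selects. It is not the paper's Mask-based Feature Extractor (MFE), whose internals are not described in this README. The function name `masked_average_pool` and the toy inputs are assumptions made for illustration.

```python
from typing import List, Sequence


def masked_average_pool(
    features: Sequence[Sequence[Sequence[float]]],
    mask: Sequence[Sequence[int]],
) -> List[float]:
    """Average the feature vectors at positions where the mask is 1.

    features: H x W grid of C-dimensional vectors.
    mask:     H x W grid of 0/1 values marking the ROI (for example, a SAM mask).
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same height")
    total: List[float] = []
    count = 0
    for feat_row, mask_row in zip(features, mask):
        if len(feat_row) != len(mask_row):
            raise ValueError("features and mask must have the same width")
        for vec, inside in zip(feat_row, mask_row):
            if not inside:
                continue  # position lies outside the ROI
            if not total:
                total = [0.0] * len(vec)
            for c, value in enumerate(vec):
                total[c] += value
            count += 1
    if count == 0:
        raise ValueError("mask selects no positions")
    return [s / count for s in total]


# Toy 2 x 2 feature map with 2 channels; the mask picks the left column.
features = [
    [[1.0, 2.0], [10.0, 20.0]],
    [[3.0, 4.0], [30.0, 40.0]],
]
mask = [
    [1, 0],
    [1, 0],
]
print(masked_average_pool(features, mask))  # [2.0, 3.0]
```

Only the two left-column vectors contribute to the result, so the pooled vector describes the masked region rather than the whole image.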
## Try Our Demo
### Online demo
Click to try our demo.

### Offline demo
⚠️ Please make sure the GPU memory of your device is larger than `17GB`.
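
One way to check the memory requirement before launching the demo is to ask `nvidia-smi` for each GPU's total memory. The sketch below assumes an NVIDIA GPU with `nvidia-smi` on the `PATH`, and it reads `17GB` as 17 GiB (17 × 1024 MiB), which is an interpretation rather than something the README specifies. All helper names are hypothetical.

```python
import shutil
import subprocess
from typing import List

REQUIRED_MIB = 17 * 1024  # assumption: "17GB" is read as 17 GiB


def parse_total_memory_mib(smi_output: str) -> List[int]:
    """Parse one MiB value per non-empty line; non-numeric entries count as 0."""
    totals = []
    for line in smi_output.splitlines():
        value = line.strip()
        if not value:
            continue
        totals.append(int(value) if value.isdigit() else 0)
    return totals


def query_total_memory_mib() -> List[int]:
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_total_memory_mib(result.stdout)


def gpus_with_enough_memory(
    totals_mib: List[int], required_mib: int = REQUIRED_MIB
) -> List[int]:
    """Return indices of GPUs whose total memory exceeds the requirement."""
    return [i for i, total in enumerate(totals_mib) if total > required_mib]


if __name__ == "__main__":
    usable = gpus_with_enough_memory(query_total_memory_mib())
    if usable:
        print(f"GPUs with more than 17 GiB of total memory: {usable}")
    else:
        print("No GPU with more than 17 GiB of total memory was detected.")
```

This reports total memory, not free memory, so other processes already using the GPU can still cause the demo to run out of memory.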
1. Create the environment
```
conda create -n seagull python=3.10
conda activate seagull
pip install -e .
```
2. Install [Gradio Extension](https://github.com/chencn2020/gradio-bbox) for drawing boxes on images.
3. Install Segment Anything Model.
```
pip install git+https://github.com/facebookresearch/segment-anything.git
```
4. Download [ViT-B SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth) and [CLIP-convnext](https://huggingface.co/laion/CLIP-convnext_large_d_320.laion2B-s29B-b131K-ft-soup/blob/main/open_clip_pytorch_model.bin), then put them into the ```checkpoints``` folder.
5. Run the demo on your device.
```
python app.py --model Zevin2023/SEAGULL-7B
```
> [!TIP]
> If Hugging Face is not accessible to you, try the following command.

```
HF_ENDPOINT=https://hf-mirror.com python app.py --model Zevin2023/SEAGULL-7B
```
6. You can also download [SEAGULL-7B](https://huggingface.co/Zevin2023/SEAGULL-7B) and put it into the ```checkpoints``` folder.
The folder structure should be:
```
├── checkpoints
    ├── SEAGULL-7B
    │   ├── config.json
    │   ├── pytorch_model-xxxxx-of-xxxxx.bin
    │   └── xxx
    ├── sam_vit_b_01ec64.pth
    └── open_clip_pytorch_model.bin
```
Then run the following command:
```
python app.py --model ./checkpoints/SEAGULL-7B
```
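
Before running the local-checkpoint command, it can help to confirm that the `checkpoints` folder matches the layout shown above. The following is a minimal sketch using only the file names listed in that layout. The function name `missing_checkpoint_files` is hypothetical, and the shard pattern `pytorch_model-*-of-*.bin` is inferred from the placeholder name in the README rather than from the actual model repository.

```python
from pathlib import Path
from typing import List

# Files named in the README's folder structure, relative to the checkpoints folder.
EXPECTED_FILES = [
    "SEAGULL-7B/config.json",
    "sam_vit_b_01ec64.pth",
    "open_clip_pytorch_model.bin",
]
SHARD_PATTERN = "pytorch_model-*-of-*.bin"  # assumption based on the README placeholder


def missing_checkpoint_files(root: Path) -> List[str]:
    """Return the expected entries that are absent under the given folder."""
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # Exact shard names are elided in the README, so match by pattern instead.
    shards = list((root / "SEAGULL-7B").glob(SHARD_PATTERN))
    if not shards:
        missing.append(f"SEAGULL-7B/{SHARD_PATTERN}")
    return missing


if __name__ == "__main__":
    problems = missing_checkpoint_files(Path("checkpoints"))
    if problems:
        print("Missing from ./checkpoints:")
        for name in problems:
            print(f"  {name}")
    else:
        print("All expected checkpoint files were found.")
```

The check only confirms that files exist. It does not verify that a download finished or that the weights are valid.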
## Demonstrate

## Acknowledgement
- [Osprey](https://github.com/CircleRadon/Osprey) and [LLaVA-v1.5](https://github.com/haotian-liu/LLaVA): We build this repository based on them.
- [RAISE](http://loki.disi.unitn.it/RAISE/): The distorted images in SEAGULL-100w are constructed based on this dataset.
- [SAM](https://segment-anything.com/) and [SEEM](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once): The mask-based ROIs are generated using these two awesome works. SAM is also used to get the segmentation result in the demo.
- [TOPIQ](https://github.com/chaofengc/IQA-PyTorch): The quality scores and importance scores for ROIs are generated using this great FR-IQA.

## Stars
## Citation
If our work is useful for your research, we would be grateful if you cited our paper:
```
@misc{chen2024seagull,
title={SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning},
author={Zewen Chen and Juan Wang and Wen Wang and Sunhan Xu and Hang Xiong and Yun Zeng and Jian Guo and Shuxun Wang and Chunfeng Yuan and Bing Li and Weiming Hu},
year={2024},
eprint={2411.10161},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2411.10161},
}
```