# PA-SAM: Prompt Adapter SAM for High-quality Image Segmentation (ICME2024)

Official implementation of ['PA-SAM: Prompt Adapter SAM for High-quality Image Segmentation'](https://arxiv.org/abs/2401.13051).

## Introduction

The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained with over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enhance the segmentation mask quality of the original SAM. By exclusively training the prompt adapter, PA-SAM extracts detailed information from images and optimizes the mask decoder feature at both sparse and dense prompt levels, improving the segmentation performance of SAM to produce high-quality masks. Experimental results demonstrate that our PA-SAM outperforms other SAM-based methods in high-quality, zero-shot, and open-set segmentation.


*Figure: The architecture of the prompt adapter, which achieves adaptive detail enhancement using a consistent representation module (CRM) and token-to-image attention, and implements hard point mining using the Gumbel top-k point sampler.*
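The hard point mining step can be illustrated with a short, hypothetical PyTorch sketch of Gumbel top-k sampling. The function name, tensor shapes, and the straight-through relaxation below are illustrative assumptions, not the code actually used in this repository:

```python
import torch

def gumbel_topk_points(point_logits: torch.Tensor, k: int, tau: float = 1.0):
    # point_logits: (B, N) unnormalized scores over N candidate point locations.
    # Perturbing the logits with Gumbel(0, 1) noise and taking top-k draws a
    # sample of k points without replacement, biased toward high-scoring points.
    gumbel = -torch.log(-torch.log(torch.rand_like(point_logits) + 1e-9) + 1e-9)
    perturbed = (point_logits + gumbel) / tau

    # Hard selection: indices of the k highest perturbed scores.
    topk_idx = perturbed.topk(k, dim=-1).indices              # (B, k)
    hard_mask = torch.zeros_like(point_logits).scatter_(-1, topk_idx, 1.0)

    # Straight-through trick: the forward pass uses the hard 0/1 mask,
    # while gradients flow through the soft relaxation to the logits.
    soft_mask = torch.softmax(perturbed, dim=-1)
    selection = hard_mask - soft_mask.detach() + soft_mask
    return topk_idx, selection
```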


## PA-SAM vs HQ-SAM

| Model | DIS (mIoU) | DIS (mBIoU) | COIFT (mIoU) | COIFT (mBIoU) | HRSOD (mIoU) | HRSOD (mBIoU) | ThinObject (mIoU) | ThinObject (mBIoU) | Average (mIoU) | Average (mBIoU) |
|--------|----------|-----------|------------|-------------|------------|-------------|------------------|-------------------|---------------|----------------|
| HQ-SAM | 78.6 | 70.4 | 94.8 | 90.1 | 93.6 | 86.9 | 89.5 | 79.9 | 89.1 | 81.8 |
| PA-SAM | 81.5 | 73.9 | 95.8 | 92.1 | 94.6 | 88.0 | 92.7 | 84.0 | 91.2 | 84.5 |
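For reference, mIoU averages the mask IoU over a dataset, while mBIoU averages a boundary IoU computed only on a thin band around each mask contour. The simplified Python sketch below shows the idea; the band width and exact boundary definition here are assumptions, and the reported numbers follow the HQ-SAM evaluation protocol:

```python
import numpy as np
from scipy.ndimage import binary_erosion

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    # IoU between two binary masks of shape (H, W).
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum()) / union if union > 0 else 1.0

def boundary_iou(pred: np.ndarray, gt: np.ndarray, width: int = 2) -> float:
    # IoU restricted to a band of `width` pixels inside each mask's contour.
    def band(m):
        m = m.astype(bool)
        return m & ~binary_erosion(m, iterations=width)
    return mask_iou(band(pred), band(gt))
```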




## Setup and Installation

The code can be cloned from the git repository:

```bash
> git clone https://github.com/xzz2/pa-sam.git
```

### Anaconda Environment Setup

The conda environment for the code can be created from the `envname.yml` file provided with the repository.

```bash
> cd pa-sam
> conda env create --file envname.yml
> conda activate pasam
```

## Preparation

Please download the **HQSeg-44K** dataset for high-quality segmentation from the [hugging face link](https://huggingface.co/sam-hq-team/sam-hq-training/tree/main/data), and the pre-trained weights of SAM from [here](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth). Then, unzip the dataset files and organize them as follows:

```
data
|____DIS5K
|____cascade_psp
| |____DUTS-TE
| |____DUTS-TR
| |____ecssd
| |____fss_all
| |____MSRA_10K
|____thin_object_detection
| |____COIFT
| |____HRSOD
| |____ThinObject5K
pretrained_checkpoint
|____sam_vit_l_0b3195.pth
|____sam_vit_l_maskdecoder.pth
```
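If helpful, the SAM ViT-L checkpoint linked above can be fetched into `pretrained_checkpoint/` with a small Python snippet like the one below. This is only a convenience sketch, not part of the repository, and it downloads just the checkpoint whose URL is given above:

```python
import os
import urllib.request

# Fetch the SAM ViT-L checkpoint linked above into pretrained_checkpoint/.
url = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth"
os.makedirs("pretrained_checkpoint", exist_ok=True)
dest = os.path.join("pretrained_checkpoint", os.path.basename(url))
if not os.path.exists(dest):
    urllib.request.urlretrieve(url, dest)
    print(f"downloaded {dest}")
```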

## Getting Started

### Training

```
sh train.sh
```

or

```
python -m torch.distributed.launch --nproc_per_node=<num_gpus> train.py --checkpoint <path/to/sam_checkpoint> --model-type <model_type> --output <path/to/output>
```

Example PA-SAM-L training script:

```
python -m torch.distributed.launch --nproc_per_node=8 train.py --checkpoint ./pretrained_checkpoint/sam_vit_l_0b3195.pth --model-type vit_l --output work_dirs/pa_sam_l
```

### Evaluation

```
sh eval.sh
```

or

```
python -m torch.distributed.launch --nproc_per_node=<num_gpus> train.py --checkpoint <path/to/sam_checkpoint> --model-type <model_type> --output <path/to/output> --eval --restore-model <path/to/trained_model>
```

Example PA-SAM-L evaluation script (the trained checkpoint can be downloaded from [here](https://pan.baidu.com/s/1PGfooGqweEPeXWvA5c55EA?pwd=wr97)):

```
python -m torch.distributed.launch --nproc_per_node=1 train.py --checkpoint ./pretrained_checkpoint/sam_vit_l_0b3195.pth --model-type vit_l --output work_dirs/pa_sam_l --eval --restore-model work_dirs/pa_sam_l/epoch_20.pth
```

## TODO

- [ ] Combine pre-trained models to generate more detailed intermediate segmentation masks.
- [ ] Correction training for noisy prompts.

## Citation

```
@article{xie2024pasam,
title={PA-SAM: Prompt Adapter SAM for High-quality Image Segmentation},
author={Xie, Zhaozhi and Guan, Bochen and Jiang, Weihao and Yi, Muyang and Ding, Yue and Lu, Hongtao and Zhang, Lei},
journal={2024 IEEE International Conference on Multimedia and Expo (ICME)},
year={2024}
}
```

## Acknowledgement

This repo benefits from [Segment Anything in High Quality](https://github.com/SysCV/sam-hq). Thanks for their wonderful work.