# PDFNet

This is the official PyTorch implementation of [PDFNet](https://arxiv.org/abs/2503.06100).

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/dichotomous-image-segmentation-on-dis-vd)](https://paperswithcode.com/sota/dichotomous-image-segmentation-on-dis-vd?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/dichotomous-image-segmentation-on-dis-te1)](https://paperswithcode.com/sota/dichotomous-image-segmentation-on-dis-te1?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/dichotomous-image-segmentation-on-dis-te2)](https://paperswithcode.com/sota/dichotomous-image-segmentation-on-dis-te2?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/dichotomous-image-segmentation-on-dis-te3)](https://paperswithcode.com/sota/dichotomous-image-segmentation-on-dis-te3?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/dichotomous-image-segmentation-on-dis-te4)](https://paperswithcode.com/sota/dichotomous-image-segmentation-on-dis-te4?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/rgb-salient-object-detection-on-hrsod)](https://paperswithcode.com/sota/rgb-salient-object-detection-on-hrsod?p=patch-depth-fusion-dichotomous-image)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/patch-depth-fusion-dichotomous-image/rgb-salient-object-detection-on-uhrsd)](https://paperswithcode.com/sota/rgb-salient-object-detection-on-uhrsd?p=patch-depth-fusion-dichotomous-image)

> # Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior
>
> Xianjie Liu, Keren Fu, Qijun Zhao
>
> arXiv:2503.06100
>
> 🔥 2025/3/13: We released the code and checkpoints on GitHub.
>
> 📕 2025/3/10: We released the paper on arXiv.

🔥 If you are interested in **Dichotomous Image Segmentation** (DIS), we highly recommend our companion project [Awesome Dichotomous Image Segmentation](https://github.com/Tennine2077/Awesome-Dichotomous-Image-Segmentation/tree/main), which compiles the significant research and resources related to DIS and provides comprehensive references and inspiration for your research and practice. We hope this resource list helps you better understand and apply DIS techniques, driving more accurate image segmentation.

# Abstract

Dichotomous Image Segmentation (DIS) is a high-precision object segmentation task for high-resolution natural images. Current mainstream methods focus on optimizing local details but overlook the fundamental challenge of modeling object integrity. We find that the depth integrity-prior implicit in the pseudo-depth maps generated by Depth Anything Model v2, together with the local detail features of image patches, can jointly address this dilemma. Based on these findings, we design a novel Patch-Depth Fusion Network (PDFNet) for high-precision dichotomous image segmentation. The core of PDFNet consists of three aspects. First, object perception is enhanced through multi-modal input fusion; a fine-grained patch strategy, coupled with patch selection and enhancement, improves sensitivity to details. Second, leveraging the depth integrity-prior distributed in the depth maps, we propose an integrity-prior loss that enhances the uniformity of the segmentation results in depth. Third, we reuse the features of the shared encoder and, through a simple depth refinement decoder, improve its ability to capture subtle depth-related information in images. Experiments on the DIS-5K dataset show that PDFNet significantly outperforms state-of-the-art non-diffusion methods. Thanks to the incorporated depth integrity-prior, PDFNet matches or even surpasses the latest diffusion-based methods while using less than 11% of their parameters.
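
For intuition, the sketch below shows one way such an integrity-prior could be expressed as a loss: it penalizes the probability-weighted variance of pseudo-depth inside the predicted foreground, pushing the mask toward depth-consistent object regions. This is an illustrative reading of the idea, not the paper's exact formulation; the tensor shapes and normalization are our assumptions.

```
import torch

def integrity_prior_loss(pred: torch.Tensor, depth: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Illustrative integrity-prior-style loss (not the paper's exact formulation).

    pred:  (B, 1, H, W) sigmoid foreground probabilities
    depth: (B, 1, H, W) pseudo-depth maps (e.g. from DAM-V2), normalized to [0, 1]
    """
    # Per-image pixel weights, proportional to predicted foreground probability
    w = pred / (pred.sum(dim=(2, 3), keepdim=True) + eps)
    # Weighted mean depth of the predicted object
    mean_d = (w * depth).sum(dim=(2, 3), keepdim=True)
    # Weighted depth variance inside the predicted object:
    # low variance means the mask covers a depth-uniform region
    var_d = (w * (depth - mean_d) ** 2).sum(dim=(2, 3))
    return var_d.mean()
```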

![image](pics/Framwork.png)
## Installation
```
conda create -n PDFNet python=3.11.4
conda activate PDFNet

pip install -r requirements.txt
```
## Dataset Preparation

Please download the [DIS-5K dataset](https://github.com/xuebinqin/DIS) first and place it in the "**DATA**" directory. The structure of the "**DATA**" folder should be as follows:
```
PDFNet
└── DATA
    └── DIS-DATA
        ├── DIS-TE1
        ├── DIS-TE2
        ├── DIS-TE3
        ├── DIS-TE4
        ├── DIS-TR
        └── DIS-VD
            ├── images
            └── masks
```
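
Before training, a quick sanity check like the one below (our own snippet, assuming the layout above relative to the repo root) can confirm every split is in place:

```
from pathlib import Path

# Assumes the directory tree shown above, relative to the repo root.
ROOT = Path('DATA/DIS-DATA')

for split in ['DIS-TR', 'DIS-VD', 'DIS-TE1', 'DIS-TE2', 'DIS-TE3', 'DIS-TE4']:
    for sub in ('images', 'masks'):
        d = ROOT / split / sub
        if d.is_dir():
            print(f'{d}: {len(list(d.iterdir()))} files')
        else:
            print(f'{d}: MISSING')
```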
Download the [Swin-B weights](https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window12_384_22k.pth) into the '**checkpoints**' directory.

### Depth Preparation
Please download the [DAM-V2 project](https://github.com/DepthAnything/Depth-Anything-V2) and place it in the '**DAM-V2**' directory, then download the [DAM-V2 weights](https://github.com/DepthAnything/Depth-Anything-V2) into the '**checkpoints**' directory.
Now you can use '**DAM-V2/Depth-preprare.ipynb**' to generate the pseudo-depth maps for training and testing; a minimal script equivalent is sketched below.
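
The notebook is the authoritative way to do this; for reference, a minimal equivalent following the public DAM-V2 API looks like this (the checkpoint name, input path, and output folder are assumptions you should adapt):

```
import cv2
import numpy as np
import torch
from depth_anything_v2.dpt import DepthAnythingV2  # from the DAM-V2 project

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# ViT-L configuration, as given in the Depth-Anything-V2 README
model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth', map_location='cpu'))
model = model.to(device).eval()

raw_img = cv2.imread('DATA/DIS-DATA/DIS-TR/images/example.jpg')  # hypothetical input path
depth = model.infer_image(raw_img)  # (H, W) float array of relative depth

# Normalize to 8-bit and save as a grayscale pseudo-depth map
depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-6) * 255.0
cv2.imwrite('DATA/DIS-DATA/DIS-TR/depths/example.png', depth.astype(np.uint8))  # hypothetical output dir (must exist)
```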

# Training

Run
```
python Train_PDFNet.py
```
If you want to change the training datasets, edit the '**build_dataset**' function in '**dataloaders/Mydataset.py**' to add other datasets, as sketched below.
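
We have not reproduced the real '**build_dataset**' here; as a purely hypothetical sketch, adding a dataset typically amounts to registering another pair of image/mask roots alongside the DIS-5K ones:

```
# Hypothetical sketch only -- the real logic lives in dataloaders/Mydataset.py.
# Adding a training set usually means registering one more image/mask root pair.
DATASET_ROOTS = {
    'DIS-TR':   ('DATA/DIS-DATA/DIS-TR/images', 'DATA/DIS-DATA/DIS-TR/masks'),
    'HRSOD-TR': ('DATA/HRSOD-TR/images',        'DATA/HRSOD-TR/masks'),  # hypothetical new entry
}

def build_dataset(name: str):
    """Resolve a dataset name to its (images_dir, masks_dir) pair."""
    image_dir, mask_dir = DATASET_ROOTS[name]
    return image_dir, mask_dir
```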

# Testing and metrics

Open '**metric_tools/Test.py**' and set '**save_dir**', then open '**soc_metric**' and set '**gt_roots**' and '**cycle_roots**' to the paths you need.
Run
```
cd metric_tools
python Test.py
```
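
'**Test.py**' and '**soc_metric**' compute the full metric suite used in the paper; for reference, the simplest of the standard DIS metrics, MAE, reduces to the following (a self-contained sketch, not the repo's implementation):

```
import cv2
import numpy as np

def mae(pred_path: str, gt_path: str) -> float:
    """Mean Absolute Error between a predicted mask and its ground truth."""
    pred = cv2.imread(pred_path, cv2.IMREAD_GRAYSCALE).astype(np.float64) / 255.0
    gt = cv2.imread(gt_path, cv2.IMREAD_GRAYSCALE).astype(np.float64) / 255.0
    return float(np.abs(pred - gt).mean())
```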

# Results and checkpoints for different training datasets

| Training Dataset | Checkpoints and Validation Results |
| ---------------------- | ------------------------------------------------------------------------------------------- |
| DIS-5K TR | [DIS](https://drive.google.com/drive/folders/1dqkFVR4TElSRFNHhu6er45OQkoHhJsZz?usp=sharing) |
| HRSOD-TR + UHRSD-TR | Coming soon... |

# Comparisons
## DIS-5K
Performance comparisons of PDFNet with MAGNet, CPNet, DACOD, RISNet, IS-Net, FP-DIS, UDUN, InSPyReNet, BiRefNet, MVANet, GenPercept, and DiffDIS. The symbols ↑/↓ indicate that higher/lower scores are better. The best score is highlighted in **bold** and the second-best is underlined; diffusion-based models are excluded from the ranking because of their much larger parameter counts.
![image](pics/compare.png)
## HRSOD and UHRSD
![image](pics/HRSOD.png)
## Visual results
![image](pics/vcompare.png)
# BibTeX

Please consider citing PDFNet if it helps your research.
```
@misc{liu2025patchdepthfusiondichotomousimage,
  title={Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior},
  author={Xianjie Liu and Keren Fu and Qijun Zhao},
  year={2025},
  eprint={2503.06100},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2503.06100},
}
```