ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection (TPAMI 2024)
https://github.com/lartpang/zoomnext
- Host: GitHub
- URL: https://github.com/lartpang/zoomnext
- Owner: lartpang
- Created: 2023-10-30T07:15:08.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-13T06:54:17.000Z (about 1 year ago)
- Topics: camouflage-detection, camouflaged-object-detection, camouflaged-target-detection, image-camouflaged-object-detection, image-video-unified-model, unified-model, video-camouflaged-object-detection
- Language: Python
- Homepage: https://arxiv.org/abs/2310.20208
- Size: 77.1 KB
- Stars: 39
- Watchers: 6
- Forks: 6
- Open Issues: 2
Metadata Files:
- Readme: README.md
# ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection (TPAMI 2024)
PapersWithCode leaderboards: [CAMO](https://paperswithcode.com/sota/camouflaged-object-segmentation-on-camo?p=zoomnext-a-unified-collaborative-pyramid) | [CHAMELEON](https://paperswithcode.com/sota/camouflaged-object-segmentation-on-chameleon?p=zoomnext-a-unified-collaborative-pyramid) | [COD](https://paperswithcode.com/sota/camouflaged-object-segmentation-on-cod?p=zoomnext-a-unified-collaborative-pyramid) | [NC4K](https://paperswithcode.com/sota/camouflaged-object-segmentation-on-nc4k?p=zoomnext-a-unified-collaborative-pyramid) | [MoCA-Mask](https://paperswithcode.com/sota/camouflaged-object-segmentation-on-moca-mask?p=zoomnext-a-unified-collaborative-pyramid) | [camouflaged-object-segmentation-on](https://paperswithcode.com/sota/camouflaged-object-segmentation-on?p=zoomnext-a-unified-collaborative-pyramid)
```bibtex
@ARTICLE{ZoomNeXt,
  title   = {ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection},
  author  = {Youwei Pang and Xiaoqi Zhao and Tian-Zhu Xiang and Lihe Zhang and Huchuan Lu},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year    = {2024},
  doi     = {10.1109/TPAMI.2024.3417329},
}
```
## Weights and Results
See [Google Drive](https://drive.google.com/drive/folders/1Hp3GIqossOrJYs3bRzJICujMKbZy4WxO?usp=drive_link).
### Performance
| Backbone | CAMO-TE | | | CHAMELEON | | | COD10K-TE | | | NC4K | | |
| --------------- | ------- | -------------------- | ----- | --------- | -------------------- | ----- | --------- | -------------------- | ----- | ----- | -------------------- | ----- |
| | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE | $S_m$ | $F^{\omega}_{\beta}$ | MAE |
| ResNet-50 | 0.833 | 0.774 | 0.065 | 0.908 | 0.858 | 0.021 | 0.861 | 0.768 | 0.026 | 0.874 | 0.816 | 0.037 |
| EfficientNet-B1 | 0.848 | 0.803 | 0.056 | 0.916 | 0.870 | 0.020 | 0.863 | 0.773 | 0.024 | 0.876 | 0.823 | 0.036 |
| EfficientNet-B4 | 0.867 | 0.824 | 0.046 | 0.911 | 0.865 | 0.020 | 0.875 | 0.797 | 0.021 | 0.884 | 0.837 | 0.032 |
| PVTv2-B2 | 0.874 | 0.839 | 0.047 | 0.922 | 0.884 | 0.017 | 0.887 | 0.818 | 0.019 | 0.892 | 0.852 | 0.030 |
| PVTv2-B3 | 0.885 | 0.854 | 0.042 | 0.927 | 0.898 | 0.017 | 0.895 | 0.829 | 0.018 | 0.900 | 0.861 | 0.028 |
| PVTv2-B4 | 0.888 | 0.859 | 0.040 | 0.925 | 0.897 | 0.016 | 0.898 | 0.838 | 0.017 | 0.900 | 0.865 | 0.028 |
| PVTv2-B5 | 0.889 | 0.857 | 0.041 | 0.924 | 0.885 | 0.018 | 0.898 | 0.827 | 0.018 | 0.903 | 0.863 | 0.028 |

| Backbone | CAD | | | | | MoCA-Mask-TE | | | | |
| -------------- | ----- | -------------------- | ----- | ----- | ----- | ------------ | -------------------- | ----- | ----- | ----- |
| | $S_m$ | $F^{\omega}_{\beta}$ | MAE | mDice | mIoU | $S_m$ | $F^{\omega}_{\beta}$ | MAE | mDice | mIoU |
| PVTv2-B5 (T=5) | 0.757 | 0.593 | 0.020 | 0.599 | 0.510 | 0.734 | 0.476 | 0.010 | 0.497 | 0.422 |
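
In the tables above, $S_m$ is the structure measure, $F^{\omega}_{\beta}$ the weighted F-measure, and MAE the mean absolute error (lower is better); the video benchmarks additionally report mean Dice (mDice) and mean IoU (mIoU). As a rough reference only (not the official evaluation code, which also implements $S_m$ and $F^{\omega}_{\beta}$), a minimal sketch of MAE, Dice, and IoU for a single prediction/GT pair:

```python
import numpy as np

def mae(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean absolute error between a prediction map and a binary GT mask, both in [0, 1]."""
    return float(np.abs(pred.astype(np.float64) - gt.astype(np.float64)).mean())

def dice_iou(pred: np.ndarray, gt: np.ndarray, thresh: float = 0.5) -> tuple[float, float]:
    """Dice and IoU of the thresholded prediction against the binary GT mask."""
    p = pred >= thresh
    g = gt >= 0.5
    inter = np.logical_and(p, g).sum()
    dice = 2.0 * inter / (p.sum() + g.sum() + 1e-8)
    iou = inter / (np.logical_or(p, g).sum() + 1e-8)
    return float(dice), float(iou)
```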
## Prepare Data
> [!note]
>
> The CAD dataset can be downloaded from https://drive.google.com/file/d/1XhrC6NSekGOAAM7osLne3p46pj1tLFdI/view?usp=sharing
>
> With the data setup below, the VCOD results produced directly by the training script are consistent with the numbers reported in the paper.
List all dataset information in `dataset.yaml`. An example configuration:
```yaml
# VCOD Datasets
moca_mask_tr:
{
root: "YOUR-VCOD-DATASETS-ROOT/MoCA-Mask/MoCA_Video/TrainDataset_per_sq",
image: { path: "*/Imgs", suffix: ".jpg" },
mask: { path: "*/GT", suffix: ".png" },
start_idx: 0,
end_idx: 0
}
moca_mask_te:
{
root: "YOUR-VCOD-DATASETS-ROOT/MoCA-Mask/MoCA_Video/TestDataset_per_sq",
image: { path: "*/Imgs", suffix: ".jpg" },
mask: { path: "*/GT", suffix: ".png" },
start_idx: 0,
end_idx: -2
}
cad:
{
root: "YOUR-VCOD-DATASETS-ROOT/CamouflagedAnimalDataset",
image: { path: "original_data/*/frames", suffix: ".png" },
mask: { path: "converted_mask/*/groundtruth", suffix: ".png" },
start_idx: 0,
end_idx: 0
}
# ICOD Datasets
cod10k_tr:
{
root: "YOUR-ICOD-DATASETS-ROOT/Train/COD10K-TR",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
camo_tr:
{
root: "YOUR-ICOD-DATASETS-ROOT/Train/CAMO-TR",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
cod10k_te:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/COD10K-TE",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
camo_te:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/CAMO-TE",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
chameleon:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/CHAMELEON",
image: { path: "Image", suffix: ".jpg" },
mask: { path: "Mask", suffix: ".png" },
}
nc4k:
{
root: "YOUR-ICOD-DATASETS-ROOT/Test/NC4K",
image: { path: "Imgs", suffix: ".jpg" },
mask: { path: "GT", suffix: ".png" },
}
```
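
The `*` wildcards in the VCOD entries expand to the per-sequence sub-folders. As an illustration of how such an entry can be enumerated (a sketch only; the repository has its own dataset classes, and `list_pairs` is a hypothetical helper, not part of the project):

```python
import os
from glob import glob

import yaml  # pip install pyyaml

def list_pairs(entry: dict) -> list[tuple[str, str]]:
    """Enumerate (image, mask) paths for one dataset.yaml entry.

    Illustration only: the '*' in the VCOD paths expands to per-sequence folders,
    and pairing by sorted order assumes identical frame naming for images and masks.
    The start_idx/end_idx fields are ignored here.
    """
    root = entry["root"]
    images = sorted(glob(os.path.join(root, entry["image"]["path"], "*" + entry["image"]["suffix"])))
    masks = sorted(glob(os.path.join(root, entry["mask"]["path"], "*" + entry["mask"]["suffix"])))
    assert len(images) == len(masks), "image/mask counts should match"
    return list(zip(images, masks))

if __name__ == "__main__":
    with open("dataset.yaml") as f:
        cfg = yaml.safe_load(f)
    print(f"cod10k_tr: {len(list_pairs(cfg['cod10k_tr']))} image/mask pairs")
```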
## Install Requirements
* torch==2.1.2
* torchvision==0.16.2
* Others: `pip install -r requirements.txt`
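
Optionally, a quick environment check (not part of the repository) to confirm the pinned versions and GPU visibility:

```python
import torch
import torchvision

# Expected pins from this README: torch==2.1.2, torchvision==0.16.2.
print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```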
## Evaluation
```shell
# ICOD
python main_for_image.py --config configs/icod_train.py --model-name <model-name> --evaluate --load-from <checkpoint-path>
# VCOD
python main_for_video.py --config configs/vcod_finetune.py --model-name <model-name> --evaluate --load-from <checkpoint-path>
```
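
Before passing a downloaded checkpoint to `--load-from`, it can help to inspect it quickly. A minimal sketch (the filename is a placeholder, and whether the weights are wrapped under a key such as `state_dict` is an assumption to check against the released files):

```python
import torch

# Placeholder filename; use the file downloaded from the Google Drive link above.
ckpt = torch.load("pvtv2b5_zoomnext.pth", map_location="cpu")

# Released checkpoints may be a plain state_dict or a dict wrapping one.
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt
print(f"{len(state_dict)} entries")
for name, value in list(state_dict.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
    print(name, shape)
```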
## Training
### Image Camouflaged Object Detection
```shell
python main_for_image.py --config configs/icod_train.py --pretrained --model-name EffB1_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name EffB4_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B2_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B3_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B4_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name PvtV2B5_ZoomNeXt
python main_for_image.py --config configs/icod_train.py --pretrained --model-name RN50_ZoomNeXt
```
### Video Camouflaged Object Detection
1. Pretrain on COD10K-TR: `python main_for_image.py --config configs/icod_pretrain.py --info pretrain --model-name PvtV2B5_ZoomNeXt --pretrained`
2. Finetune on MoCA-Mask-TR: `python main_for_video.py --config configs/vcod_finetune.py --info finetune --model-name videoPvtV2B5_ZoomNeXt --load-from <checkpoint-from-step-1>`
> [!note]
> If you run into out-of-memory (OOM) errors, try reducing the batch size or switching on the `--use-checkpoint` flag, i.e. add `--use-checkpoint` to the `main_for_image.py` / `main_for_video.py` command.
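
`--use-checkpoint` presumably trades extra compute for lower memory via activation checkpointing (an assumption about this repository's implementation). The snippet below shows the general PyTorch mechanism, not the project's code:

```python
import torch
from torch.utils.checkpoint import checkpoint

class Block(torch.nn.Module):
    """Stand-in for a heavy backbone/decoder block."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Conv2d(dim, dim, kernel_size=3, padding=1), torch.nn.ReLU()
        )

    def forward(self, x: torch.Tensor, use_checkpoint: bool = False) -> torch.Tensor:
        if use_checkpoint and self.training:
            # Recompute activations in the backward pass instead of storing them.
            return checkpoint(self.net, x, use_reentrant=False)
        return self.net(x)

x = torch.randn(2, 64, 96, 96, requires_grad=True)
y = Block().train()(x, use_checkpoint=True)
y.mean().backward()  # activations inside self.net are recomputed here
```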