https://github.com/rese1f/uniap

[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning

2d-pose-estimation animal classification computer-vision semantic-segmentation

Host: GitHub
URL: https://github.com/rese1f/uniap
Owner: rese1f
License: mit
Created: 2023-08-03T17:45:36.000Z (almost 3 years ago)
Default Branch: main
Last Pushed: 2023-12-10T21:46:35.000Z (over 2 years ago)
Last Synced: 2025-06-21T09:07:34.098Z (about 1 year ago)
Topics: 2d-pose-estimation, animal, classification, computer-vision, semantic-segmentation
Language: Python
Homepage: https://rese1f.github.io/UniAP/
Size: 1.15 MB
Stars: 12
Watchers: 3
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# UniAP

[![](http://img.shields.io/badge/cs.CV-arXiv%3A2308.09953-B31B1B.svg)](https://arxiv.org/abs/2308.09953)

> **UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning**  

> Meiqi Sun*, Zhonghan Zhao*, Wenhao Chai*, Hanjun Luo, Shidong Cao, Yanting Zhang, Jenq-Neng Hwang, Gaoang Wang   

> _AAAI 2024_  

We introduce UniAP, a novel Universal Animal Perception model that leverages few-shot learning to enable cross-species perception among various visual tasks.

## :fire: News

* **[2023.12.10]**: 🎉 Our paper has been accepted to AAAI 2024.

* **[2023.08.20]**: We released our code.

* **[2023.08.19]**: :page_with_curl: We released the [paper](https://arxiv.org/abs/2308.09953).

If you like our project, please give us a star ⭐ on GitHub to follow the latest updates.


## Setup

1. Download Datasets

  * Animal Kingdom Dataset (pose estimation) from the official GitHub page https://github.com/sutdcv/Animal-Kingdom/blob/master/Animal_Kingdom/pose_estimation/README_pose_estimation.md.

  * Animal Pose Dataset from the official GitHub page https://github.com/noahcao/animal-pose-dataset.

  * APT-36K Dataset from the official GitHub page https://github.com/pandorgan/APT-36K.

  * Oxford-IIIT Pet Dataset from the official page https://www.robots.ox.ac.uk/~vgg/pets/.

  * (Optional) Resize the images and labels to (256, 256) resolution.

  * We store all animal images and labels in a single directory. The directory structure looks like:

  ```
  |--
  |   |--_
  |   | ...
  |   |--_
  |   |...
  |
  |--
  |   |--_
  |   | ...
  |   |--_
  |   |...
  |
  |--
  |   |--_
  |   | ...
  |   |--_
  |   |...
  |
  |--
  |   |--_
  |   | ...
  |   |--_
  |   |...
  |
  |...
  ```

2. Create a `data_paths.yaml` file and set the root directory path (the top-level directory in the above structure) with `UniASET: PATH_TO_YOUR_UniASET`.

3. Install the prerequisites with `pip install -r requirements.txt`.

4. Create the `model/pretrained_checkpoints` directory and download the [BEiT pre-trained checkpoints](https://github.com/microsoft/unilm/tree/master/beit) into it.

  * We used the `beit_base_patch16_224_pt22k` checkpoint for our experiments.

  * We also provide a [model pre-trained on the Animal Kingdom dataset](https://drive.google.com/file/d/1HmSMn1h4rY5JtEjS7Th8iPTJhFbAnW9x/view?usp=sharing) that can be used to run `configs/demo.yaml`.
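
A few of the setup steps above can be scripted. The following is a minimal sketch rather than part of the repository: the function names `write_data_paths`, `make_checkpoint_dir`, and `scale_keypoints` are hypothetical, and the keypoint helper assumes labels are plain `(x, y)` pixel coordinates, which may not match the annotation format of every dataset listed above.

```python
from pathlib import Path


def write_data_paths(dataset_root, config_path="data_paths.yaml"):
    """Write the single `UniASET: <path>` entry expected in data_paths.yaml."""
    root = Path(dataset_root).expanduser().resolve()
    Path(config_path).write_text(f"UniASET: {root}\n", encoding="utf-8")
    return root


def make_checkpoint_dir(repo_root="."):
    """Create model/pretrained_checkpoints under repo_root if it is missing."""
    ckpt_dir = Path(repo_root) / "model" / "pretrained_checkpoints"
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    return ckpt_dir


def scale_keypoints(keypoints, orig_size, target_size=(256, 256)):
    """Rescale (x, y) pixel coordinates from orig_size to target_size.

    Sizes are (width, height). This only covers keypoint labels (assumed
    format); images and masks have to be resized separately.
    """
    sx = target_size[0] / orig_size[0]
    sy = target_size[1] / orig_size[1]
    return [(x * sx, y * sy) for x, y in keypoints]
```

For example, `write_data_paths("~/datasets/UniASET")` run from the repository root would create `data_paths.yaml` pointing at that (illustrative) location. The downloaded BEiT checkpoint still has to be placed into the created directory by hand.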

## Usage

### Training

```
python main.py --stage 0 --task_id [0/1/2/3]
```

  * To train universally on all tasks, set `--task_id 3`.

  * To train on a specific task, set `--task_id 0` for pose estimation, `--task_id 1` for semantic segmentation, or `--task_id 2` for classification.

### Fine-tuning

```
python main.py --stage 1 --task [kp/mask/cls]
```

* To fine-tune on a specific task, set `--task kp` for pose estimation, `--task mask` for semantic segmentation, or `--task cls` for classification.

### Evaluation

```
python main.py --stage 2 --task [kp/mask/cls]
```

* To evaluate on a specific task, set `--task kp` for pose estimation, `--task mask` for semantic segmentation, or `--task cls` for classification.
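
Note that training selects the task with a numeric `--task_id`, while fine-tuning and evaluation use a named `--task`. The sketch below shows one way to assemble the three command lines consistently. It is not part of the repository: `build_command`, `TASK_IDS`, `STAGES`, and the label `"all"` for universal training are hypothetical names, and rejecting `"all"` for stages 1 and 2 is an assumption based on the README listing only `kp`, `mask`, and `cls` for `--task`.

```python
# Numeric ids accepted by --task_id in stage 0, as listed above.
# "all" is a hypothetical label for universal training (task_id 3).
TASK_IDS = {"kp": 0, "mask": 1, "cls": 2, "all": 3}

# Stage numbers as used by main.py: 0 = training, 1 = fine-tuning, 2 = evaluation.
STAGES = {"train": 0, "finetune": 1, "eval": 2}


def build_command(stage, task):
    """Return the argument list for one invocation of main.py."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    if task not in TASK_IDS:
        raise ValueError(f"unknown task: {task!r}")

    cmd = ["python", "main.py", "--stage", str(STAGES[stage])]
    if stage == "train":
        # Training takes the numeric id.
        cmd += ["--task_id", str(TASK_IDS[task])]
    else:
        # Fine-tuning and evaluation take the task name (assumed single-task only).
        if task == "all":
            raise ValueError("fine-tuning and evaluation expect kp, mask, or cls")
        cmd += ["--task", task]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example to train on all tasks and then evaluate each task in turn.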

## Acknowledgements

Our code refers to the following repositories:

* [BEiT: BERT Pre-Training of Image Transformers](https://github.com/microsoft/unilm/tree/master/beit)

* [Pose for Everything: Towards Category-Agnostic Pose Estimation](https://github.com/luminxu/Pose-for-Everything)

* [Images Speak in Images: A Generalist Painter for In-Context Visual Learning](https://github.com/baaivision/Painter)

* [Segment Anything](https://github.com/facebookresearch/segment-anything)

* [Contrastive Language-Image Pre-Training](https://github.com/openai/CLIP)

## Citation

If you find UniAP useful for your research and applications, please cite it using this BibTeX:

```bibtex
@article{sun2023uniap,
  title={UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning},
  author={Sun, Meiqi and Zhao, Zhonghan and Chai, Wenhao and Luo, Hanjun and Cao, Shidong and Zhang, Yanting and Hwang, Jenq-Neng and Wang, Gaoang},
  journal={arXiv preprint arXiv:2308.09953},
  year={2023}
}
```
