Implementation of "Distribution Alignment: A Unified Framework for Long-tail Visual Recognition" (CVPR 2021)

[[Paper]](https://arxiv.org/abs/2103.16370) [[Code]](https://github.com/Megvii-BaseDetection/DisAlign)

We implement the classification, object detection, and instance segmentation tasks based on our [cvpods](https://github.com/Megvii-BaseDetection/cvpods). Users should **install cvpods first** and then run the experiments in this repo.

# Changelog
- **7.28.2021** Update the **DisAlign** on ImageNet-LT (ResX50)
- **4.23.2021** Update the **DisAlign** on LVIS v0.5 (Mask R-CNN + Res50)
- **4.12.2021** Update the README

# 0. How to Use
- Step-1: Install the latest [cvpods](https://github.com/Megvii-BaseDetection/cvpods).
- Step-2: `cd cvpods`
- Step-3: Prepare the datasets for the different tasks.
- Step-4: `git clone https://github.com/Megvii-BaseDetection/DisAlign playground_disalign`
- Step-5: Enter one folder and run `pods_train --num-gpus 8`
- Step-6: Use `pods_test --num-gpus 8` to evaluate the last checkpoint (see the end-to-end sketch below).
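The same workflow, written out as a shell sketch. This is only a rough illustration: `<experiment_folder>` is a placeholder, and the exact cvpods install commands should be taken from the cvpods README.

```bash
# End-to-end sketch of Step-1 ~ Step-6 on an 8-GPU machine.
# <experiment_folder> is a placeholder; use one of the config folders in this repo.

# Step-1: install the latest cvpods (follow the cvpods README for the exact install commands)
git clone https://github.com/Megvii-BaseDetection/cvpods

# Step-2: enter the cvpods root
cd cvpods

# Step-3: prepare the datasets for your task (see the per-task READMEs below)

# Step-4: clone DisAlign into the cvpods playground
git clone https://github.com/Megvii-BaseDetection/DisAlign playground_disalign

# Step-5: enter one experiment folder and train
cd playground_disalign/<experiment_folder>
pods_train --num-gpus 8

# Step-6: evaluate the last checkpoint
pods_test --num-gpus 8
```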

# 1. Image Classification
We support the following three datasets:
- ImageNet-LT Dataset
- iNaturalist-2018 Dataset
- Place-LT Dataset

We refer the user to [CLS_README](classification/README.md) for more details.
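As a concrete (but hypothetical) example of Step-5/6 for classification, an ImageNet-LT experiment would be launched roughly like this; the actual experiment folder names and dataset setup are documented in [CLS_README](classification/README.md).

```bash
# The folder name is illustrative only; take the real one from classification/README.md.
cd playground_disalign/classification/<imagenet_lt_experiment>
pods_train --num-gpus 8   # train
pods_test --num-gpus 8    # evaluate the last checkpoint
```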

# 2. Object Detection/Instance Segmentation
We support the two versions of the LVIS dataset:
- LVIS v0.5
- LVIS v1.0
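For Step-3, the sketch below shows one *possible* dataset layout. It assumes cvpods follows the detectron2-style `datasets/` convention (cvpods is derived from Detectron2); treat it as a guess and follow [DET_README](segmentation/README.md) for the authoritative setup.

```bash
# Assumed, detectron2-style layout under the cvpods root; verify against segmentation/README.md.
cd cvpods
mkdir -p datasets/coco datasets/lvis
# datasets/coco/train2017/, datasets/coco/val2017/        -> COCO 2017 images (LVIS re-uses them)
# datasets/lvis/lvis_v0.5_train.json, lvis_v0.5_val.json  -> LVIS v0.5 annotations
# datasets/lvis/lvis_v1_train.json, lvis_v1_val.json      -> LVIS v1.0 annotations
```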

**Highlights**
1. To speed up the evaluation on the LVIS dataset, we provide a C++-optimized evaluation API by modifying [coco_eval (C++)](https://github.com/Megvii-BaseDetection/cvpods/blob/master/cvpods/layers/csrc/cocoeval/cocoeval.cpp) in `cvpods`.
   - The C++ `lvis_eval` API saves **~30% of the time** when calculating the mAP.
2. We support the `AP_fixed` and `AP_pool` metrics proposed in [large-vocab-devil](https://github.com/achalddave/large-vocab-devil).
3. We will support more recent works on long-tail detection (e.g. EQLv2, CenterNet2) in this project in the future.

We refer the user to [DET_README](segmentation/README.md) for more details.

# 3. Semantic Segmentation

We adopt mmsegmentation as the codebase for running all DisAlign semantic segmentation experiments.
Currently, the user should use [DisAlign_Seg](TODO) for the semantic segmentation experiments.
We will add the support for these experiments in [cvpods](https://github.com/Megvii-BaseDetection/cvpods) in the future.

# Acknowledgement

Thanks for the following projects:
- [cvpods](https://github.com/Megvii-BaseDetection/cvpods)
- [Detectron2](https://github.com/facebookresearch/detectron2)
- [mmsegmentation](https://github.com/open-mmlab/mmsegmentation)
- [classifier-balancing](https://github.com/facebookresearch/classifier-balancing)

# Citing DisAlign
If you use DisAlign in your research or wish to refer to the baseline results published in this repo, please use the following BibTeX entry.

```latex
@inproceedings{zhang2021disalign,
  title={Distribution Alignment: A Unified Framework for Long-tail Visual Recognition},
  author={Zhang, Songyang and Li, Zeming and Yan, Shipeng and He, Xuming and Sun, Jian},
  booktitle={CVPR},
  year={2021}
}
```

# License

This repo is released under the Apache 2.0 license. Please see the LICENSE file for more information.