https://github.com/OpenGVLab/All-Seeing
[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"
- Host: GitHub
- URL: https://github.com/OpenGVLab/All-Seeing
- Owner: OpenGVLab
- Created: 2023-08-03T06:49:14.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-09T06:39:51.000Z (5 months ago)
- Last Synced: 2024-08-09T07:50:04.909Z (5 months ago)
- Topics: all-seeing, dataset, region-text
- Language: Python
- Homepage: https://huggingface.co/spaces/OpenGVLab/all-seeing
- Size: 57.5 MB
- Stars: 433
- Watchers: 23
- Forks: 14
- Open Issues: 7
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- Awesome-Segment-Anything
README
# The All-Seeing Project
This is the official implementation of the following papers:
- [The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World](https://arxiv.org/abs/2308.01907)
- [The All-Seeing Project V2: Towards General Relation Comprehension of the Open World](https://arxiv.org/abs/2402.19474)
> The name "All-Seeing" is derived from "The All-Seeing Eye", which means having complete knowledge, awareness, or insight into all aspects of existence. The logo is the Millennium Puzzle, an artifact from the manga "Yu-Gi-Oh!".
## News and Updates 🚀🚀🚀
- `July 01, 2024`: All-Seeing Project v2 has been accepted by ECCV 2024! Note that the [model](https://huggingface.co/OpenGVLab/ASMv2) and [data](https://huggingface.co/datasets/OpenGVLab/AS-V2) have already been released on Hugging Face.
- `Feb 28, 2024`: All-Seeing Project v2 is out! Our [**ASMv2**](https://huggingface.co/OpenGVLab/ASMv2) achieves state-of-the-art performance across a variety of image-level and region-level tasks! See [**here**](all-seeing-v2/README.md) for more details.
- `Feb 21, 2024`: [**ASM**](https://huggingface.co/OpenGVLab/ASM-FT), [**AS-Core**](https://huggingface.co/datasets/OpenGVLab/AS-Core), [**AS-10M**](https://huggingface.co/datasets/OpenGVLab/AS-V2/blob/main/as_pretrain_10m.json), and [**AS-100M**](https://huggingface.co/datasets/OpenGVLab/AS-100M) are released!
- `Jan 16, 2024`: The All-Seeing Project has been accepted by ICLR 2024!
- `Aug 29, 2023`: [**All-Seeing Model Demo**](https://openxlab.org.cn/apps/detail/wangweiyun/AllSeeingModel) is now available on OpenXLab!

## Schedule
- [x] Release the ASMv2 model.
- [x] Release the AS-V2 dataset.
- [x] Release the ASM model.
- [ ] Release the full version of AS-1B.
- [x] Release AS-Core, which is the human-verified subset of AS-1B.
- [x] Release AS-100M, which is the 100M subset of AS-1B.
- [x] Release AS-10M, which is the 10M subset of AS-1B.
- [x] Online demo, including dataset browser and ASM online demo.

## Introduction
### The All-Seeing Project [[Paper](https://arxiv.org/abs/2308.01907)][[Model](https://huggingface.co/OpenGVLab/ASM-FT)][[Dataset](https://huggingface.co/datasets/OpenGVLab/AS-100M)][[Code](all-seeing/README.md)][[Zhihu](https://zhuanlan.zhihu.com/p/686963813)][[Medium](https://ai.gopubby.com/the-all-seeing-project-towards-panoptic-visual-recognization-and-general-relation-comprehension-f76c2bde3e2c)]
[***All-Seeing 1B (AS-1B) dataset***](https://huggingface.co/datasets/OpenGVLab/AS-100M): we propose a new large-scale dataset (AS-1B) for open-world panoptic visual recognition and understanding, using an economical semi-automatic data engine that combines the power of off-the-shelf vision/language models and human feedback.
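For readers who want to poke at the released subsets locally, here is a minimal, hedged sketch of how the annotations linked in this README could be fetched from Hugging Face with the `huggingface_hub` package. Only the repository names and the `as_pretrain_10m.json` file name come from the links above; the internal layout of the downloaded data may differ from what the comments assume.

```python
# Sketch: fetch the released AS-1B subsets from Hugging Face.
# Assumes `pip install huggingface_hub`; only the repo names and
# `as_pretrain_10m.json` are taken from this README -- treat the rest
# (e.g. the assumed JSON layout) as illustrative.
import json

from huggingface_hub import hf_hub_download, snapshot_download

# AS-10M annotations: a single JSON file hosted in the AS-V2 dataset repo.
as10m_path = hf_hub_download(
    repo_id="OpenGVLab/AS-V2",
    repo_type="dataset",
    filename="as_pretrain_10m.json",
)
with open(as10m_path) as f:
    as10m = json.load(f)  # assumed to be a list of region-text records
print(f"AS-10M records: {len(as10m)}")

# AS-100M: mirror the whole dataset repo (this is a large download).
as100m_dir = snapshot_download(repo_id="OpenGVLab/AS-100M", repo_type="dataset")
print("AS-100M files stored under:", as100m_dir)
```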
[***All-Seeing Model (ASM)***](https://huggingface.co/OpenGVLab/ASM-FT): we develop a unified vision-language foundation model (ASM) for open-world panoptic visual recognition and understanding. Aligning with LLMs, our ASM supports versatile image-text retrieval and generation tasks, demonstrating impressive zero-shot capability.
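ASM's retrieval side follows the familiar pattern of scoring candidate captions against an image embedding. The snippet below is only a schematic of that scoring step with hypothetical stand-in features, not ASM's actual API; the real loading and inference code is documented in [all-seeing/README.md](all-seeing/README.md).

```python
# Schematic of image-text retrieval scoring with a dual-encoder VLM. The
# tensors here are random stand-ins and are NOT produced by ASM; see
# all-seeing/README.md for the actual model loading and inference entry points.
import torch
import torch.nn.functional as F

def best_caption(image_feat: torch.Tensor, text_feats: torch.Tensor) -> int:
    """Return the index of the caption whose embedding best matches the image."""
    image_feat = F.normalize(image_feat, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    scores = text_feats @ image_feat  # cosine similarity per caption
    return int(scores.argmax())

# Toy usage with random embeddings standing in for real ASM features.
image_feat = torch.randn(768)
text_feats = torch.randn(5, 768)
print("best caption index:", best_caption(image_feat, text_feats))
```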
### The All-Seeing Project V2 [[Paper](https://arxiv.org/abs/2402.19474)][[Model](https://huggingface.co/OpenGVLab/ASMv2)][[Dataset](https://huggingface.co/datasets/OpenGVLab/AS-V2)][[Code](all-seeing-v2/README.md)][[Zhihu](https://zhuanlan.zhihu.com/p/686963813)][[Medium](https://ai.gopubby.com/the-all-seeing-project-towards-panoptic-visual-recognization-and-general-relation-comprehension-f76c2bde3e2c)]
***[All-Seeing Dataset V2 (AS-V2) dataset](https://huggingface.co/datasets/OpenGVLab/AS-V2)***: we propose a novel task, termed Relation Conversation (ReC), which unifies the formulation of text generation, object localization, and relation comprehension. Based on the unified formulation, we construct the AS-V2 dataset, which consists of 127K high-quality relation conversation samples, to unlock the ReC capability for Multi-modal Large Language Models (MLLMs).
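In spirit, a Relation Conversation sample is grounded text whose mentioned objects carry box annotations and whose predicates link a subject box to an object box, so each sample can be read back as a small scene graph. The exact AS-V2 markup is specified in [all-seeing-v2/README.md](all-seeing-v2/README.md); the following sketch only illustrates the idea with an invented sample schema.

```python
# Hypothetical illustration of the Relation Conversation (ReC) idea: grounded
# text that can be re-read as (subject, predicate, object) triplets. The real
# AS-V2 markup differs; see all-seeing-v2/README.md for the actual format.
from dataclasses import dataclass

@dataclass
class Region:
    phrase: str
    box: tuple  # (x1, y1, x2, y2), normalized coordinates

@dataclass
class RelationTriplet:
    subject: Region
    predicate: str
    object: Region

# Invented sample: one caption, two grounded regions, one relation.
sample = {
    "caption": "A man is riding a horse on the beach.",
    "regions": {
        "man": (0.12, 0.20, 0.45, 0.90),
        "horse": (0.10, 0.35, 0.70, 0.95),
    },
    "relations": [("man", "riding", "horse")],
}

def to_scene_graph(record: dict) -> list:
    """Turn one ReC-style record into explicit relation triplets."""
    regions = {p: Region(p, b) for p, b in record["regions"].items()}
    return [RelationTriplet(regions[s], pred, regions[o])
            for s, pred, o in record["relations"]]

for t in to_scene_graph(sample):
    print(f"({t.subject.phrase}, {t.predicate}, {t.object.phrase})")
```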
***[All-Seeing Model v2 (ASMv2)](https://huggingface.co/OpenGVLab/ASMv2)***: we develop ASMv2, which integrates the Relation Conversation ability while maintaining powerful general capabilities.
It is endowed with grounding and referring capabilities, exhibiting state-of-the-art performance on region-level tasks.
Furthermore, this model can be naturally adapted to the Scene Graph Generation task in an open-ended manner.

***[Circular-based Relation Probing Evaluation (CRPE) benchmark](https://huggingface.co/datasets/OpenGVLab/CRPE)***: we construct the Circular-based Relation Probing Evaluation (CRPE) benchmark, the first benchmark that covers all elements of the relation triplet `(subject, predicate, object)`, providing a systematic platform for evaluating relation comprehension ability.
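The CRPE data and official evaluation protocol live in the dataset repo linked above. As a purely illustrative sketch of what probing every element of a relation triplet can look like, one can mask the subject, predicate, and object in turn and ask the model to recover each from a small option set; the questions, options, and distractors below are all made up.

```python
# Hypothetical triplet-probing sketch in the spirit of CRPE: mask each element
# of a (subject, predicate, object) triplet and ask for it among distractors.
# Nothing here is taken from the official CRPE data or evaluation code.
import random

def build_probes(triplet, distractors):
    """Yield (question, options, answer) tuples, one per triplet element."""
    subj, pred, obj = triplet
    templates = {
        "subject": (f"What is {pred} the {obj}?", subj),
        "predicate": (f"What is the {subj} doing to the {obj}?", pred),
        "object": (f"What is the {subj} {pred}?", obj),
    }
    for element, (question, answer) in templates.items():
        options = [answer] + random.sample(distractors[element], 3)
        random.shuffle(options)
        yield question, options, answer

triplet = ("man", "riding", "horse")
distractors = {
    "subject": ["woman", "dog", "child", "cyclist"],
    "predicate": ["feeding", "washing", "leading", "chasing"],
    "object": ["bicycle", "camel", "boat", "surfboard"],
}
for question, options, answer in build_probes(triplet, distractors):
    print(question, options, "->", answer)
```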
## License
This project is released under the [Apache 2.0 license](LICENSE).
## 🖊️ Citation
If you find this project useful in your research, please consider citing:
```BibTeX
@article{wang2023allseeing,
  title={The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World},
  author={Wang, Weiyun and Shi, Min and Li, Qingyun and Wang, Wenhai and Huang, Zhenhang and Xing, Linjie and Chen, Zhe and Li, Hao and Zhu, Xizhou and Cao, Zhiguo and others},
  journal={arXiv preprint arXiv:2308.01907},
  year={2023}
}

@article{wang2024allseeing_v2,
  title={The All-Seeing Project V2: Towards General Relation Comprehension of the Open World},
  author={Wang, Weiyun and Ren, Yiming and Luo, Haowen and Li, Tiantong and Yan, Chenxiang and Chen, Zhe and Wang, Wenhai and Li, Qingyun and Lu, Lewei and Zhu, Xizhou and others},
  journal={arXiv preprint arXiv:2402.19474},
  year={2024}
}
```