{"id":22067545,"url":"https://github.com/OpenGVLab/All-Seeing","last_synced_at":"2025-07-24T04:31:41.946Z","repository":{"id":185935905,"uuid":"674106991","full_name":"OpenGVLab/all-seeing","owner":"OpenGVLab","description":"[ICLR 2024] This is the official implementation of the paper \"The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World\"","archived":false,"fork":false,"pushed_at":"2024-08-09T06:39:51.000Z","size":60334,"stargazers_count":433,"open_issues_count":7,"forks_count":14,"subscribers_count":23,"default_branch":"main","last_synced_at":"2024-08-09T07:50:04.909Z","etag":null,"topics":["all-seeing","dataset","region-text"],"latest_commit_sha":null,"homepage":"https://huggingface.co/spaces/OpenGVLab/all-seeing","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/OpenGVLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-03T06:49:14.000Z","updated_at":"2024-08-09T06:39:54.000Z","dependencies_parsed_at":"2024-06-18T12:00:50.270Z","dependency_job_id":"39c909de-58e7-461f-869c-ba0ba477e8e5","html_url":"https://github.com/OpenGVLab/all-seeing","commit_stats":null,"previous_names":["opengvlab/all-seeing"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2Fall-seeing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2Fall-seeing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2Fall-seeing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/OpenGVLab%2Fall-seeing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/OpenGVLab","download_url":"https://codeload.github.com/OpenGVLab/all-seeing/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":227421056,"owners_count":17774999,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["all-seeing","dataset","region-text"],"created_at":"2024-11-30T20:03:18.156Z","updated_at":"2024-11-30T20:03:25.086Z","avatar_url":"https://github.com/OpenGVLab.png","language":"Python","funding_links":[],"categories":["Paper List"],"sub_categories":["Seminal Papers"],"readme":"# The All-Seeing Project \u003cimg width=\"60\" alt=\"image\" src=\"https://github.com/OpenGVLab/all-seeing/assets/8529570/54c8d328-aa67-4d28-99de-90d019e8e7d0\"\u003e\n\nThis is the official implementation of the following papers:\n\n- [The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World](https://arxiv.org/abs/2308.01907)\n\n- [The All-Seeing Project V2: Towards General Relation 
## Introduction

### The All-Seeing Project [[Paper](https://arxiv.org/abs/2308.01907)][[Model](https://huggingface.co/OpenGVLab/ASM-FT)][[Dataset](https://huggingface.co/datasets/OpenGVLab/AS-100M)][[Code](all-seeing/README.md)][[Zhihu](https://zhuanlan.zhihu.com/p/686963813)][[Medium](https://ai.gopubby.com/the-all-seeing-project-towards-panoptic-visual-recognization-and-general-relation-comprehension-f76c2bde3e2c)]

[***All-Seeing 1B (AS-1B) dataset***](https://huggingface.co/datasets/OpenGVLab/AS-100M): we propose a new large-scale dataset (AS-1B) for open-world panoptic visual recognition and understanding, built with an economical semi-automatic data engine that combines the power of off-the-shelf vision/language models with human feedback.

[***All-Seeing Model (ASM)***](https://huggingface.co/OpenGVLab/ASM-FT): we develop a unified vision-language foundation model (ASM) for open-world panoptic visual recognition and understanding. Aligned with LLMs, ASM supports versatile image-text retrieval and generation tasks, demonstrating impressive zero-shot capability.

### The All-Seeing Project V2 [[Paper](https://arxiv.org/abs/2402.19474)][[Model](https://huggingface.co/OpenGVLab/ASMv2)][[Dataset](https://huggingface.co/datasets/OpenGVLab/AS-V2)][[Code](all-seeing-v2/README.md)][[Zhihu](https://zhuanlan.zhihu.com/p/686963813)][[Medium](https://ai.gopubby.com/the-all-seeing-project-towards-panoptic-visual-recognization-and-general-relation-comprehension-f76c2bde3e2c)]

***[All-Seeing Dataset V2 (AS-V2)](https://huggingface.co/datasets/OpenGVLab/AS-V2)***: we propose a novel task, termed Relation Conversation (ReC), which unifies the formulation of text generation, object localization, and relation comprehension. Based on this unified formulation, we construct the AS-V2 dataset, consisting of 127K high-quality relation conversation samples, to unlock the ReC capability of Multi-modal Large Language Models (MLLMs).

***[All-Seeing Model v2 (ASMv2)](https://huggingface.co/OpenGVLab/ASMv2)***: we develop ASMv2, which integrates the Relation Conversation ability while maintaining powerful general capabilities. It is endowed with grounding and referring capabilities and exhibits state-of-the-art performance on region-level tasks. Furthermore, the model can be naturally adapted to the Scene Graph Generation task in an open-ended manner (see the parsing sketch at the end of this section).

***[Circular-based Relation Probing Evaluation (CRPE) benchmark](https://huggingface.co/datasets/OpenGVLab/CRPE)***: we construct CRPE, the first benchmark that covers all elements of the relation triplet `(subject, predicate, object)`, providing a systematic platform for evaluating relation comprehension ability.
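Because ReC responses interleave text with grounded boxes, an open-ended scene graph can be recovered from model output by lightweight parsing. The sketch below is illustrative only: the `<ref>`/`<box>`/`<pred>` markup is an assumed rendering of the Relation Conversation format (the authoritative syntax is defined in [all-seeing-v2/README.md](all-seeing-v2/README.md) and the AS-V2 paper), and `parse_rec` is a hypothetical helper, not part of this repository.

```python
# Sketch: turn a ReC-style response into scene-graph triplets.
# Assumption: objects are rendered as <ref>phrase</ref><box>[[x1, y1, x2, y2]]</box>
# and predicates as <pred>phrase</pred> followed by the subject and object boxes.
import re

TAG = re.compile(r"<(ref|pred)>(.*?)</\1>((?:<box>\[\[.*?\]\]</box>)+)")
BOX = re.compile(r"<box>\[\[(.*?)\]\]</box>")

def parse_rec(text):
    """Collect grounded objects and relation triplets from ReC markup."""
    objects, triplets = {}, []
    for kind, phrase, boxes_blob in TAG.findall(text):
        boxes = [tuple(map(int, b.split(","))) for b in BOX.findall(boxes_blob)]
        if kind == "ref":
            # An object phrase grounded to one or more boxes.
            for box in boxes:
                objects[box] = phrase
        elif len(boxes) == 2:
            # A predicate linking a subject box to an object box.
            triplets.append((boxes[0], phrase, boxes[1]))
    # Resolve box coordinates back to object phrases where possible;
    # unmatched boxes are kept as raw coordinate tuples.
    return [(objects.get(s, s), pred, objects.get(o, o)) for s, pred, o in triplets]

demo = (
    "<ref>a man</ref><box>[[10, 20, 200, 400]]</box> is "
    "<pred>riding</pred><box>[[10, 20, 200, 400]]</box><box>[[0, 300, 250, 500]]</box> "
    "<ref>a horse</ref><box>[[0, 300, 250, 500]]</box>"
)
print(parse_rec(demo))  # [('a man', 'riding', 'a horse')]
```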
## License

This project is released under the [Apache 2.0 license](LICENSE).

## 🖊️ Citation

If you find this project useful in your research, please consider citing:

```BibTeX
@article{wang2023allseeing,
  title={The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World},
  author={Wang, Weiyun and Shi, Min and Li, Qingyun and Wang, Wenhai and Huang, Zhenhang and Xing, Linjie and Chen, Zhe and Li, Hao and Zhu, Xizhou and Cao, Zhiguo and others},
  journal={arXiv preprint arXiv:2308.01907},
  year={2023}
}
@article{wang2024allseeing_v2,
  title={The All-Seeing Project V2: Towards General Relation Comprehension of the Open World},
  author={Wang, Weiyun and Ren, Yiming and Luo, Haowen and Li, Tiantong and Yan, Chenxiang and Chen, Zhe and Wang, Wenhai and Li, Qingyun and Lu, Lewei and Zhu, Xizhou and others},
  journal={arXiv preprint arXiv:2402.19474},
  year={2024}
}
```