Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Lavender105/RSGPT
Last synced: 3 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/Lavender105/RSGPT
- Owner: Lavender105
- Created: 2023-07-24T07:17:27.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-19T13:43:26.000Z (12 months ago)
- Last Synced: 2024-04-12T07:18:28.484Z (7 months ago)
- Size: 6.84 KB
- Stars: 51
- Watchers: 7
- Forks: 0
- Open Issues: 6
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-vision-language-models-for-earth-observation - link - annotated captions and 936 visual question-answer pairs with rich information and open-ended questions and answers. Can be used for image captioning and visual question-answering tasks. (Vision-Language Remote Sensing Datasets)
README
**RSGPT: A Remote Sensing Vision Language Model and Benchmark**
[Yuan Hu](https://scholar.google.com.sg/citations?user=NFRuz4kAAAAJ&hl=zh-CN), Jianlong Yuan, Congcong Wen, Xiaonan Lu, [Xiang Li☨](https://xiangli.ac.cn)
☨corresponding author
This is an ongoing project. We are working on increasing the dataset size.
## :fire: Updates
* **[2024.06.19]** We release VRSBench, a versatile vision-language benchmark dataset for remote sensing image understanding. VRSBench contains 29,614 images with 29,614 human-verified detailed captions, 52,472 object references, and 123,221 question-answer pairs. Check the [VRSBench Project Page](https://vrsbench.github.io/).
* **[2024.05.23]** We release the RSICap dataset. Please fill out this [form](https://docs.google.com/forms/d/1h5ydiswunM_EMfZZtyJjNiTMpeOzRwooXh73AOqokzU/edit) to get both the RSICap and RSIEval datasets.
* **[2023.11.10]** We release our survey on vision-language models in remote sensing: [RSVLM](https://arxiv.org/pdf/2305.05726.pdf).
* **[2023.10.22]** The RSICap dataset and code will be released upon paper acceptance.
* **[2023.10.22]** We release the evaluation dataset RSIEval. Please fill out this [form](https://docs.google.com/forms/d/1h5ydiswunM_EMfZZtyJjNiTMpeOzRwooXh73AOqokzU/edit) to get the RSIEval dataset.

## Dataset
* RSICap: 2,585 image-text pairs with high-quality human-annotated captions.
* RSIEval: 100 high-quality human-annotated captions with 936 open-ended visual question-answer pairs (see the loading sketch below).
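The exact layout of the released archives is not documented here, so purely as an illustration, here is a minimal sketch of iterating RSICap-style image-caption pairs. The directory names, the `captions.json` filename, and the record fields are all assumptions, not the actual dataset format.

```python
import json
from pathlib import Path

from PIL import Image

# Hypothetical layout -- check the released archive for the real structure.
DATA_ROOT = Path("rsicap")               # assumed root containing an images/ folder
ANN_FILE = DATA_ROOT / "captions.json"   # assumed: [{"image": ..., "caption": ...}, ...]

def iter_pairs(ann_file=ANN_FILE, image_dir=DATA_ROOT / "images"):
    """Yield (PIL.Image, caption) pairs from an RSICap-style annotation file."""
    for record in json.loads(ann_file.read_text()):
        image = Image.open(image_dir / record["image"]).convert("RGB")
        yield image, record["caption"]

# Quick smoke test: print the first pair's image size and caption prefix.
for image, caption in iter_pairs():
    print(image.size, caption[:60])
    break
```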
## Code
The idea of finetuning our vision-language model is borrowed from [MiniGPT-4](https://github.com/Vision-CAIR/MiniGPT-4).
Our model is based on finetuning [InstructBLIP](https://github.com/salesforce/LAVIS/blob/main/projects/instructblip/README.md) using our RSICap dataset; a rough fine-tuning sketch follows.
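Since the training code is not yet released, here is a minimal sketch of what one finetuning step of InstructBLIP on RSICap-style pairs could look like through LAVIS. The `blip2_vicuna_instruct`/`vicuna7b` loader names follow LAVIS's model zoo, but the prompt string, learning rate, and the `iter_pairs` helper (from the Dataset section above) are assumptions, not the authors' recipe.

```python
import torch
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load InstructBLIP through LAVIS's standard loader.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_vicuna_instruct", model_type="vicuna7b",
    is_eval=False, device=device,
)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)

# One illustrative step: LAVIS models take a samples dict and return a dict
# containing the language-modeling loss. iter_pairs() is the hypothetical
# helper sketched in the Dataset section.
for image, caption in iter_pairs():
    samples = {
        "image": vis_processors["train"](image).unsqueeze(0).to(device),
        "text_input": "Describe this remote sensing image in detail.",
        "text_output": caption,
    }
    loss = model(samples)["loss"]
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    break  # single step for illustration
```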
## Acknowledgement
+ [MiniGPT-4](https://github.com/Vision-CAIR/MiniGPT-4). A popular open-source vision-language model.
+ [InstructBLIP](https://github.com/salesforce/LAVIS/blob/main/projects/instructblip/README.md). The model architecture of RSGPT follows InstructBLIP. Don't forget to check out this great open-source work if you haven't seen it before!
+ [Lavis](https://github.com/salesforce/LAVIS). This repository is built upon Lavis!
+ [Vicuna](https://github.com/lm-sys/FastChat). The fantastic language ability of Vicuna with only 13B parameters is just amazing. And it is open-source!

If you're using RSGPT in your research or applications, please cite using this BibTeX:
```bibtex
@article{hu2023rsgpt,
title={RSGPT: A Remote Sensing Vision Language Model and Benchmark},
author={Hu, Yuan and Yuan, Jianlong and Wen, Congcong and Lu, Xiaonan and Li, Xiang},
journal={arXiv preprint arXiv:2307.15266},
year={2023}
}
```