https://github.com/lzw-lzw/awesome-remote-sensing-vision-language-models

Awesome-Remote-Sensing-Vision-Language-Models
https://github.com/lzw-lzw/awesome-remote-sensing-vision-language-models
List: awesome-remote-sensing-vision-language-models
Last synced: 3 months ago
JSON representation
Awesome-Remote-Sensing-Vision-Language-Models
Host: GitHub
URL: https://github.com/lzw-lzw/awesome-remote-sensing-vision-language-models
Owner: lzw-lzw
License: mit
Created: 2023-05-15T16:29:35.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-04-27T13:13:12.000Z (about 1 year ago)
Last Synced: 2024-05-22T00:08:38.631Z (about 1 year ago)
Homepage:
Size: 153 KB
Stars: 90
Watchers: 5
Forks: 5
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

awesome-vision-language-models-for-earth-observation - awesome-remote-sensing-vision-language-models
ultimate-awesome - awesome-remote-sensing-vision-language-models - Awesome-Remote-Sensing-Vision-Language-Models. (Other Lists / Julia Lists)
README

        # Awesome remote sensing vision language models

This is a repository for visual language models in remote sensing, including advanced methods and commonly used datasets in different applications, such as image-text retrieval, visual question answering, pretraining, etc.

*If you find any relevant papers that are not included here, please feel free to pull requests at any time.*

[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](https://makeapullrequest.com)

## Table of Contents

* [Surveys](#surveys)

* [Remote Sensing Vision Language Model](#remote-sensing-vision-language-model)

* [Applications](#applications)

  * [Pretraining](#pretraining)

  * [Image Captioning](#image-captioning)

  * [Text-based Image Generation](#text-based-image-generation)

  * [Image text Retrieval](#image-text-retrieval)

  * [Visual Question Answering](#visual-question-answering)

  * [Visual Grounding](#visual-grounding)

  * [Scene Classification](#scene-classification)

  * [Object Detection](#object-detection)

  * [Semantic Segmentation](#semantic-segmentation)

  * [Others](#others)

* [Dataset](#dataset)

  * [Image Captioning dataset](#image-captioning-dataset)

  * [Text-based Image Generation dataset](#text-based-image-generation-dataset)

  * [Text-based Image Retrieval dataset](#text-based-image-retrieval-dataset)

  * [Visual Question Answering dataset](#visual-question-answering-dataset)

  * [Visual Grounding dataset](#visual-grounding-dataset)

  * [Scene Classification dataset](#scene-classification-dataset)

  * [Object Detection dataset](#object-detection-dataset)

  * [Semantic Segmentation dataset](#semantic-segmentation-dataset)

# Surveys

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Vision-Language Models in Remote Sensing: Current Progress and Future Trends](https://arxiv.org/abs/2305.05726)|arxiv 2023|-|

[The Potential of Visual ChatGPT For Remote Sensing](https://arxiv.org/abs/2304.13009)|arxiv 2023|-|

[Brain-inspired Remote Sensing Foundation Models and Open Problems: A Comprehensive Survey](https://ieeexplore.ieee.org/document/10254282/keywords#keywords)|JSTARG 2023|-

# Remote Sensing Vision Language Model

| Paper                                             |  Published in | Code/Project|  

|---------------------------------------------------|:-------------:|:------------:|

[RSGPT: A Remote Sensing Vision Language Model and Benchmark](https://arxiv.org/abs/2307.15266)|arxiv 2023|[code](https://github.com/Lavender105/RSGPT)

|RemoteGLM|2023|[code](https://github.com/lzw-lzw/RemoteGLM)|

[Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis](https://arxiv.org/abs/2310.04698)|arxiv 2023|-

[Towards Automatic Satellite Images Captions Generation Using Large Language Models](https://arxiv.org/abs/2310.11392)|arxiv 2023|-

[GeoChat: Grounded Large Vision-Language Model for Remote Sensing](https://arxiv.org/abs/2311.15826)|arxiv 2023|[code](https://github.com/mbzuai-oryx/geochat)

[SkyScript: A Large and Semantically Diverse Vision-Language Dataset for Remote Sensing](https://arxiv.org/abs/2312.12856)|AAAI 2024|[code](https://github.com/wangzhecheng/SkyScript)

# Applications

## Pretraining

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[S-CLIP: Semi-supervised Vision-Language Pre-training using Few Specialist Captions](https://arxiv.org/abs/2305.14095)|arxiv 2023|[code](https://github.com/alinlab/s-clip)

[RemoteCLIP: A Vision Language Foundation Model for Remote Sensing](https://arxiv.org/abs/2306.11029)|arxiv 2023|[code](https://github.com/ChenDelong1999/RemoteCLIP)

[RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model](https://arxiv.org/abs/2306.11300)|arxiv 2023|[Project](https://github.com/om-ai-lab/RS5M)

  

## Image Captioning

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Deep Semantic Understanding of High Resolution Remote Sensing Image](https://ieeexplore.ieee.org/abstract/document/7546397)|CITS 2016|-

[Can a Machine Generate Humanlike Language Descriptions for a Remote Sensing Image?](https://ieeexplore.ieee.org/abstract/document/7891049)|TGRS 2017|-

[Exploring models and data for remote sensing image caption generation](https://ieeexplore.ieee.org/abstract/document/8240966)|TGRS 2017|[code](https://github.com/201528014227051/RSICD_optimal)|

[Natural language escription of remote sensing images based on deep learning](https://ieeexplore.ieee.org/abstract/document/8128075)|IGARSS 2017|-

[Description Generation for Remote Sensing Images Using Attribute Attention Mechanism](https://www.mdpi.com/2072-4292/11/6/612)|Remote Sensing 2019|-

[Vaa:Visual aligning attention model for remote sensing image captioning](https://ieeexplore.ieee.org/abstract/document/8843891)|IEEE Access 2019|-

[Exploring Multi-Level Attention and Semantic Relationship for Remote Sensing Image Captioning](https://ieeexplore.ieee.org/abstract/document/8943170)|IEEE Access 2019|-

[A multi-level attention model for remote sensing image captions](https://www.mdpi.com/2072-4292/12/6/939)|Remote Sensing 2020|-

[Remote sensing image captioning via variational autoencoder and reinforcement learning](https://www.sciencedirect.com/science/article/abs/pii/S0950705120302586)|Knowledge-Based Systems 2020|-

[Truncation cross entropy loss for remote sensing image captionin](https://ieeexplore.ieee.org/abstract/document/9153154)|TGRS 2020|-

[Word–Sentence Framework for Remote Sensing Image Captioning](https://ieeexplore.ieee.org/abstract/document/9308980)|TGRS 2020|[code](https://github.com/hw2hwei/WordSent)|

[A novel SVM-based decoder for remote sensing image captioning](https://ieeexplore.ieee.org/abstract/document/9521989)|TGRS 2021|-

[High-resolution remote sensing image captioning based on structured attention](https://ieeexplore.ieee.org/abstract/document/9400386)|TGRS 2021|[code](https://github.com/Saketspradhan/High-Resolution-Remote-Sensing-Image-Captioning-Based-on-Structured-Attention)

[Exploring transformer and multilabel classification for remote sensing image captioning](https://ieeexplore.ieee.org/abstract/document/9855519)|GRSL 2022|-

[NWPU-captions dataset and mlca-net for remote sensing image captioning](https://ieeexplore.ieee.org/abstract/document/9866055)|TGRS 2022|-

[Remote Sensing Image Change Captioning With Dual-Branch Transformers: A New Method and a Large Scale Dataset](https://ieeexplore.ieee.org/abstract/document/9934924)|TGRS 2022|[code](https://github.com/Chen-Yang-Liu/RSICC)

[Transforming remote sensing images to textual descriptions](https://www.sciencedirect.com/science/article/pii/S0303243422000678)|INT J APPL EARTH OBS 2022|-

[Remote-sensing image captioning based on multilayer aggregated transformer](https://ieeexplore.ieee.org/abstract/document/9709791)|GRSL 2022|-

[Vlca: vision-language aligning model with cross-modal attention for bilingual remote sensing image captioning](https://ieeexplore.ieee.org/abstract/document/10066217)|J SYST ENG ELECTRON 2023|-

[Multi-source interactive stair attention for remote sensing image captioning](https://www.mdpi.com/2072-4292/15/3/579)|Remote Sensing 2023|-

[Changes to Captions: An Attentive Network for Remote Sensing Change Captioning](https://arxiv.org/abs/2304.01091)|arxiv 2023|[code](https://github.com/shizhenchang/chg2cap)

[Bootstrapping Interactive Image-Text Alignment for Remote Sensing Image Captioning](https://arxiv.org/abs/2312.01191)|arxiv 2023|[code](https://github.com/yangcong356/BITA)

## Text-based Image Generation

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Retro-Remote Sensing: Generating Images From Ancient Texts](https://ieeexplore.ieee.org/abstract/document/8660422)|J-STARS 2019|-

[Remote sensing image augmentation based on text description for waterside change detection](https://www.mdpi.com/2072-4292/13/10/1894)|Remote Sensing 2021|-

[Text-to-remote-sensing-image generation with structured generative adversarial networks](https://ieeexplore.ieee.org/abstract/document/9390223)|GRSL 2021|-

[Txt2img-MHN:Remote sensing image generation from text using modern hopfield network](https://arxiv.org/abs/2208.04441)|arxiv 2022|[code](https://github.com/YonghaoXu/Txt2Img-MHN)

## Image-text Retrieval

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Textrs: Deep bidirectional triplet network for matching text to remote sensing images.](https://www.mdpi.com/2072-4292/12/3/405)|Remote Sensing 2020|-

[Deep unsupervised embedding for remote sensing image retrieval using textual cues](https://www.mdpi.com/2076-3417/10/24/8931)|Applied Sciences 2020|-

[A deep semantic alignment network for the cross-modal image-text retrieval in remote sensing](https://ieeexplore.ieee.org/abstract/document/9395191)|J-STARS 2021|-

[A lightweight multi-scale crossmodal text-image retrieval method in remote sensing](https://ieeexplore.ieee.org/abstract/document/9594840)|TGRS 2021|[code](https://github.com/xiaoyuan1996/retrievalSystem)

[Remote sensing cross-modal text-image retrieval based on global and local information](https://ieeexplore.ieee.org/abstract/document/9745546)|TGRS 2022|[code](https://github.com/xiaoyuan1996/GaLR)

[Multilanguage transformer for improved text to remote sensing image retrieval](https://ieeexplore.ieee.org/abstract/document/9925582)|J-STARS 2022|-

[Exploring a fine-grained multiscale method for cross-modal remote sensing image retrieva](https://arxiv.org/abs/2204.09868)|TGRS 2022|[code](https://github.com/xiaoyuan1996/AMFMN)

[Contrasting dual transformer architectures for multi-modal remote sensing image retrieval](https://www.mdpi.com/2076-3417/13/1/282)|Applied Sciences 2023|-

[Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval](https://arxiv.org/abs/2308.12509)|arxiv 2023|-

[Direction-Oriented Visual-semantic Embedding Model for Remote Sensing Image-text Retrieval](https://arxiv.org/abs/2310.08276)|arxiv 2023|-

## Visual Question Answering

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[RSVQA: Visual question answering for remote sensing data](https://ieeexplore.ieee.org/abstract/document/9088993)|TGRS 2020|[code](https://github.com/syvlo/RSVQA)

[Mutual Attention Inception Network for Remote Sensing Visual Question Answering](https://ieeexplore.ieee.org/document/9444570)|TGRS 2021|[code](https://github.com/spectralpublic/RSIVQA)

[How to find a good image-text embedding for remote sensing visual question answering?](https://arxiv.org/abs/2109.11848)|ECML-PKDD 2021|-

[Cross-Modal Visual Question Answering for Remote Sensing Data: The International Conference on Digital Image Computing: Techniques and Applications](https://ieeexplore.ieee.org/abstract/document/9647287)|DICTA 2021|-

[RSVQA meets bigearthnet: a new,large-scale, visual question answering dataset for remote sensing](https://ieeexplore.ieee.org/abstract/document/9553307)|IGARSS 2021|[code](https://github.com/syvlo/RSVQAxBEN)

[Self-Paced Curriculum Learning for Visual Question Answering on Remote Sensing Data](https://ieeexplore.ieee.org/abstract/document/9553624)|IGARSS 2021|-

[From easy to hard: Learning language-guided curriculum for visual question answering on remote sensing data](https://ieeexplore.ieee.org/abstract/document/9771224)|TGRS 2022|[code](https://github.com/YZHJessica/VQA-easy2hard)

[Language transformers for remote sensing visual question answering](https://ieeexplore.ieee.org/abstract/document/9884036)|IGARSS 2022|-

[Open-ended remote sensing visual question answering with transformers](https://www.tandfonline.com/doi/abs/10.1080/01431161.2022.2145583)|IJRS 2022|-

[Bi-modal transformer-based approach for visual question answering in remote sensing imagery](https://ieeexplore.ieee.org/abstract/document/9832935)|TGRS 2022|-

[Prompt-RSVQA: Prompting visual context to a language model for remote sensing visual question answering](https://openaccess.thecvf.com/content/CVPR2022W/EarthVision/html/Chappuis_Prompt-RSVQA_Prompting_Visual_Context_to_a_Language_Model_for_Remote_CVPRW_2022_paper.html)|CVPRW 2022|-

[Change detection meets visual question answering](https://ieeexplore.ieee.org/abstract/document/9901476)|TGRS 2022|[code](https://github.com/YZHJessica/CDVQA)

[A spatial hierarchical reasoning network for remote sensing visual question answering](https://ieeexplore.ieee.org/abstract/document/10018408)|TGRS 2023|-

[Multilingual Augmentation for Robust Visual Question Answering in Remote Sensing Images](https://ieeexplore.ieee.org/abstract/document/10144189)|JURSE 2023|-

[LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing](https://arxiv.org/abs/2306.00758)|IGARSS 2023|[code](https://git.tu-berlin.de/rsim/lit4rsvqa)

[Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs](https://arxiv.org/abs/2311.14656)|arXiv 2023|[code](https://github.com/jonathan-roberts1/charting-new-territories)

## Visual Grounding

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

|[Visual Grounding in Remote Sensing Images](https://dl.acm.org/doi/abs/10.1145/3503161.3548316)|ACMMM 2022|[data](https://sunyuxi.github.io/publication/GeoVG)|

|[RSVG: Exploring data and models for visual grounding on remote sensing data](https://ieeexplore.ieee.org/abstract/document/10056343)|TGRS 2023 |[code](https://github.com/ZhanYang-nwpu/RSVG-pytorch)

 

## Scene Classification

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Zero-shot scene classification for high spatial resolution remote sensing images](https://ieeexplore.ieee.org/abstract/document/7902107)|TGRS 2017|-

[Fine-grained object recognition and zero-shot learning in remote sensing imagery](https://ieeexplore.ieee.org/abstract/document/8071030)|TGRS 2017|-

[Structural alignment based zero-shot classification for remote sensing scenes](https://ieeexplore.ieee.org/abstract/document/8645056)|ICECE 2018|-

[A distance-constrained semantic autoencoder for zero-shot remote sensing scene classification](https://ieeexplore.ieee.org/abstract/document/9633210)|J-STARS 2021|-

[Learning deep crossmodal embedding networks for zero-shot remote sensing image scene classification](https://ieeexplore.ieee.org/abstract/document/9321719)|TGRS 2021|-

[Generative adversarial networks for zero-shot remote sensing scene classification](https://www.mdpi.com/2076-3417/12/8/3760)|Applied Sciences 2022|-

[APPLeNet: Visual Attention Parameterized Prompt Learning for Few-Shot Remote Sensing Image Generalization using CLIP](https://arxiv.org/abs/2304.05995)|CVPR 2023|[code](https://github.com/mainaksingha01/APPLeNet)

## Object Detection

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Text semantic fusion relation graph reasoning for few-shot object detection on remote sensing images](https://www.mdpi.com/2072-4292/15/5/1187)|Remote Sensing 2023|-

[Few-shot object detection in aerial imagery guided by textmodal knowledge](https://ieeexplore.ieee.org/abstract/document/10056362)|TGRS 2023|-

## Semantic Segmentation

| Paper                                             |  Published in | Code/Project|                                  

|---------------------------------------------------|:-------------:|:------------:|

[Semi-supervised contrastive learning for few-shot segmentation of remote sensing images](https://www.mdpi.com/2072-4292/14/17/4254)|Remote Sensing 2022|-

[Few-shot segmentation of remote sensing images using deep metric learning](https://ieeexplore.ieee.org/abstract/document/9721235)|GRSL 2022.

[Language-aware domain generalization network for cross-scene hyperspectral image classification](https://ieeexplore.ieee.org/abstract/document/10005113)|TGRS 2023|[code](https://github.com/YuxiangZhang-BIT/IEEE_TGRS_LDGnet)

[RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model](https://arxiv.org/abs/2306.16269)|arxiv 2023|[code](https://github.com/KyanChen/RSPrompter)

[RRSIS: Referring Remote Sensing Image Segmentation](https://arxiv.org/abs/2306.08625)|arxiv 2023|-

[CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting](https://arxiv.org/abs/2310.16069)|arxiv 2023|-

## Others

# Dataset

## Image Captioning Dataset

| Dataset                                            |  Home/Github | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[RSICD](https://ieeexplore.ieee.org/abstract/document/8240966)|[Github](https://github.com/201528014227051/RSICD_optimal)|[[BaiduYun]](https://pan.baidu.com/s/1bp71tE3#list/path=%2F)  [[Google Drive]](https://drive.google.com/open?id=0B1jt7lJDEXy3aE90cG9YSl9ScUk)

[Sydney-Captions](https://ieeexplore.ieee.org/abstract/document/7546397)|[Github](https://github.com/201528014227051/RSICD_optimal)|[[BaiduYun]](https://pan.baidu.com/s/1hujEmcG#list/path=%2F)

[UCM-Captions](https://ieeexplore.ieee.org/abstract/document/7546397)|[Github](https://github.com/201528014227051/RSICD_optimal)|[[BaiduYun]](https://pan.baidu.com/s/1mjPToHq#list/path=%2F)

[NWPU-RESISC45](https://ieeexplore.ieee.org/abstract/document/7891544)|[Github](https://gcheng-nwpu.github.io/#Datasets)|[[BaiduYun]](https://pan.baidu.com/s/1mifR6tU#list/path=%2F)  [[OneDrive]](https://1drv.ms/u/s!AmgKYzARBl5ca3HNaHIlzp_IXjs)

[DIOR-Captions](https://ieeexplore.ieee.org/abstract/document/10066217)|-|-

[RS-5M](https://github.com/om-ai-lab/RS5M)|[Github](https://github.com/om-ai-lab/RS5M)|[[HuggingFace]](https://huggingface.co/datasets/Zilun/RS5M/viewer/Zilun--RS5M/train?row=0)

[LEVIR-CC](https://ieeexplore.ieee.org/abstract/document/9934924)|[Github](https://github.com/Chen-Yang-Liu/RSICC)|[Google Drive](https://drive.google.com/drive/folders/1cEv-BXISfWjw1RTzL39uBojH7atjLdCG) |

[SkyScript](https://arxiv.org/abs/2312.12856)|[github](https://github.com/wangzhecheng/SkyScript)|

## Text-based Image Generation Dataset

## Text-based Image Retrieval Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[RSITMD](https://arxiv.org/abs/2204.09868)|[Github](https://github.com/xiaoyuan1996/AMFMN)|[[BaiduYun]](https://pan.baidu.com/s/1gDj38mzUL-LmQX32PYxr0Q?pwd=NIST)  [[Google Drive]](https://drive.google.com/file/d/1NJY86TAAUd8BVs7hyteImv8I2_Lh95W6/view?usp=sharing)

## Visual Question Answering Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[RSVQA](https://ieeexplore.ieee.org/abstract/document/9088993)|[Home](https://github.com/syvlo/RSVQA)|[[data]](https://rsvqa.sylvainlobry.com/)

[RSVQA×BEN](https://ieeexplore.ieee.org/abstract/document/9553307)|[[Github]](https://github.com/syvlo/RSVQAxBEN) [[Home]](https://rsvqa.sylvainlobry.com/)|-

[RSIVQA](https://ieeexplore.ieee.org/document/9444570)|[Github](https://github.com/spectralpublic/RSIVQA)|-

[CDVQA](https://ieeexplore.ieee.org/abstract/document/9901476)|[Github](https://github.com/YZHJessica/CDVQA)|-

## Visual Grounding Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[DIOR-RSVG](https://ieeexplore.ieee.org/abstract/document/10056343) |[Github](https://github.com/ZhanYang-nwpu/RSVG-pytorch)|[[Google Drive]](https://drive.google.com/drive/folders/1hTqtYsC6B-m4ED2ewx5oKuYZV13EoJp_?usp=sharing)

## Scene Classification Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[NWPU-RESISC45](https://ieeexplore.ieee.org/abstract/document/7891544)|[Home](https://gcheng-nwpu.github.io/#Datasets)|[[OneDrive]](https://1drv.ms/u/s!AmgKYzARBl5ca3HNaHIlzp_IXjs)   [[BaiduYun]](https://pan.baidu.com/s/1mifR6tU)

[AID](https://ieeexplore.ieee.org/abstract/document/7907303)|[Home](https://captain-whu.github.io/AID/)|[[OneDrive]](https://1drv.ms/u/s!AthY3vMZmuxChNR0Co7QHpJ56M-SvQ)  [[BaiduYun]](https://pan.baidu.com/s/1mifOBv6#list/path=%2F)

[UC Merced Land-Use(UCM)](https://dl.acm.org/doi/abs/10.1145/1869790.1869829)|[Home](http://weegee.vision.ucmerced.edu/datasets/landuse.html)|-

[SATIN](https://arxiv.org/abs/2304.11619)|[Home](https://satinbenchmark.github.io/)|[[HuggingFace]](https://huggingface.co/datasets/jonathan-roberts1/SATIN)

## Object Detection Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

[NWPU VHR-10](https://www.sciencedirect.com/science/article/abs/pii/S0924271614002524#preview-section-introduction)|[Home](https://gcheng-nwpu.github.io/#Datasets)|[[OneDrive]](https://1drv.ms/u/s!AmgKYzARBl5cczaUNysmiFRH4eE)   [[BaiduYun]](https://pan.baidu.com/s/1hqwzXeG#list/path=%2F) 

[DIOR](https://www.sciencedirect.com/science/article/abs/pii/S0924271619302825)|[Home](https://gcheng-nwpu.github.io/#Datasets)|[[Google Drive]](https://drive.google.com/drive/folders/1UdlgHk49iu6WpcJ5467iT-UqNPpx__CC)    [[BaiduYun]](https://pan.baidu.com/s/1iLKT0JQoKXEJTGNxt5lSMg#list/path=%2F)

[FAIR1M](https://www.sciencedirect.com/science/article/abs/pii/S0924271621003269)|-|[[BaiduYun]](https://pan.baidu.com/share/init?surl=alWnbCbucLOQJJhi4WsZAw?pwd=u2xg)

## Semantic Segmentation Dataset

| Dataset                                            |  Home/Project | Download link|                                  

|---------------------------------------------------|:-------------:|:------------:|

Vaihingen|[Home](https://www.isprs.org/education/benchmarks/UrbanSemLab/Default.aspx)|[[BaiduYun]](https://pan.baidu.com/s/1EShNi22VfuIu3e6VygMb8g?pwd=3gsr)

Potsdam|[Home](https://www.isprs.org/education/benchmarks/UrbanSemLab/Default.aspx)|[[BaiduYun]](https://pan.baidu.com/s/13rdBXUN_ZdelWNlQZ3Y1TQ?pwd=6c3y)

Toronto|[Home](https://www.isprs.org/education/benchmarks/UrbanSemLab/Default.aspx)|-

[GID](https://www.sciencedirect.com/science/article/abs/pii/S0034425719303414)|[Home](https://x-ytong.github.io/project/GID.html)|[[BaiduYun code:GID5]](https://pan.baidu.com/s/1_DQluiDgJ4Z7dXSnciVx1A#list/path=%2F)   [[OneDrive]](https://whueducn-my.sharepoint.com/personal/xinyi_tong_whu_edu_cn/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fxinyi%5Ftong%5Fwhu%5Fedu%5Fcn%2FDocuments%2FGID&ga=1)
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/lzw-lzw/awesome-remote-sensing-vision-language-models

Awesome Lists containing this project

README