Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


awesome-vision-language-models-for-earth-observation

A curated list of awesome vision and language resources for earth observation.
https://github.com/geoaigroup/awesome-vision-language-models-for-earth-observation


  • Vision-Language Remote Sensing Datasets

    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link - text paired dataset
    • Link - text pairs in total, covering more than 29K distinct semantic tags
    • Link - 4292/15/8/2139) | Size: 2,864 videos and 14,320 captions, where each video is paired with five unique captions
    • Link
    • Link
    • Training_Set - v1.0, but the extremely small instances (less than 10 pixels) are also annotated. Moreover, a new category, "container crane", is added. Use: object detection in aerial images
    • Link
    • Link
    • Link
    • Link
    • Link
    • link - annotated captions and 936 visual question-answer pairs with rich information and open-ended questions and answers. Can be used for image captioning and visual question answering tasks
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link - 2 and Open Street Map; Use: Remote Sensing Visual Question Answering
    • Link
    • Link - resolution (15 cm); Platforms: Sentinel-2, BigEarthNet and Open Street Map; Use: Remote Sensing Visual Question Answering
    • Link
    • Link - question-answer triplets; a small part of RSIVQA is annotated by humans, and the rest is automatically generated from existing scene classification and object detection datasets; Use: Remote Sensing Visual Question Answering
    • Link - image pairs; Resolution: 224 x 224; Platforms: UAV-DJI Mavic Pro quadcopters, after Hurricane Harvey; Use: Remote Sensing Visual Question Answering
    • link - 5B; mean height of 633.0 pixels (up to 9,999) and mean width of 843.7 pixels (up to 19,687); Platforms: based on LAION-5B
    • Link
    • Link
    • Link
    • Link
    • images_Link - Captions/blob/main/dataset_nwpu.json) | [Paper Link](https://ieeexplore.ieee.org/document/9866055/) | Size: 31,500 images with 157,500 sentences; Number of Classes: 45; Resolution: 256 x 256 pixels; Platforms: based on NWPU-RESISC45 dataset; Use: Remote Sensing Image Captioning
    • Link - Text Retrieval
    • Link
    • Link - 4292/10/6/964) | Size: 2,100 images; Number of Classes: 21; Resolution: 256 x 256; Platforms: Extension of the UC Merced; Use: Remote Sensing Image Retrieval (RSIR), Classification and Semantic Segmentation
    • Link - query pairs and 17,402 RS images; Number of Classes: 20; Resolution: 800 x 800; Platforms: DIOR dataset; Use: Remote Sensing Visual Grounding
    • link - Scale_LanguageAware_Visual_Grounding_on_Remote_Sensing_Data) | Size: 25,452 images and 48,952 expressions in English and Chinese; Number of Classes: 14; Resolution: 800 x 800
    • link
    • Link
    • Link - m to 1.2-m; Platforms: Google Earth and Baidu Map; Use: Remote Sensing Object Detection
    • Link
    • aircraft
    • link - http://gpcv.whu.edu.cn/data/building_dataset.html | [Paper Link](https://arxiv.org/pdf/2208.00657v1.pdf) | Size: more than 220,000 independent buildings; Number of Classes: 1; Resolution: 0.075 m spatial resolution, covering 450 km² in Christchurch, New Zealand; Platforms: QuickBird, WorldView series, IKONOS, ZY-3, and 6 neighboring satellite images covering 550 km² in East Asia with 2.7 m ground resolution; Use: Remote Sensing building detection and change detection
    • DOTA-v2.0 - v1.5, it further adds the new categories of "airport" and "helipad". Use: object detection in aerial images
    • Training_Set - scale_Dataset_for_Instance_Segmentation_in_Aerial_Images_CVPRW_2019_paper.pdf) | Size: 2,806 images with 655,451 object instances; Number of Classes: 15; Resolution: high resolution; Platforms: DOTA dataset; Use: semantic segmentation or object detection
    • link
    • link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
    • Link
  • Foundation Models

  • Image Captioning

  • Visual Question Answering

    • paper - wang.github.io/homepage/EarthVQA) | AAAI 2024
    • paper
    • paper - wang.github.io/homepage/EarthVQA) | AAAI 2024
    • paper - berlin.de/rsim/lit4rsvqa) | IEEE IGARSS
    • paper - data/MQVQA) | IEEE TGRS
    • paper
    • paper - D-Wang/RSAdapter)
    • paper
    • paper
    • paper - easy2hard) | IEEE TGRS
    • paper
    • paper - berlin.de/rsim/multi-modal-fusion-transformer-for-vqa-in-rs) | SPIE Image and Signal Processing for Remote Sensing
    • paper
    • paper
    • paper
    • paper
    • paper
  • Visual Grounding

  • Text-Image Retrieval