https://github.com/om-ai-lab/awesome-RSVLM

Collection of Remote Sensing Vision-Language Models

# Awesome Remote Sensing Vision-Language Models & Papers
Collection of Remote Sensing Vision-Language models and papers

To add your work to this repo, submit a request or contact me at [email protected]

## Paper List

- **EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering** (2023.12) [[pdf]](https://www.researchgate.net/publication/376519677_EarthVQA_Towards_Queryable_Earth_via_Relational_Reasoning-Based_Remote_Sensing_Visual_Question_Answering)
- Junjue Wang, Zhuo Zheng, Zihang Chen, Ailong Ma, and Yanfei Zhong

- **A Prior Instruction Representation Framework for Remote Sensing Image-text Retrieval** (2023.10) [[pdf]](https://dl.acm.org/doi/10.1145/3581783.3612374)
- Jiancheng Pan, Qing Ma, Cong Bai

- **A Fine-Grained Semantic Alignment Method Specific to Aggregate Multi-Scale Information for Cross-Modal Remote Sensing Image Retrieval** (2023.10) [[pdf]](https://www.mdpi.com/1424-8220/23/20/8437)
- Fuzhong Zheng, Xu Wang, Luyao Wang, Xiong Zhang, Hongze Zhu, Long Wang and Haisu Zhang

- **Multilanguage Transformer for Improved Text to Remote Sensing Image Retrieval** (2023.10) [[pdf]](https://ieeexplore.ieee.org/document/9925582)
- Mohamad M. Al Rahhal, Yakoub Bazi, Norah A. Alsharif, Laila Bashmal, Naif Alajlan, Farid Melgani


- **A Fusion Encoder with Multi-Task Guidance for Cross-Modal Text–Image Retrieval in Remote Sensing** (2023.09) [[pdf]](https://www.mdpi.com/2072-4292/15/18/4637)
- Xiong Zhang, Weipeng Li, Xu Wang, Luyao Wang, Fuzhong Zheng, Long Wang and Haisu Zhang


- **Parameter-Efficient Transfer Learning for Remote Sensing Image-Text Retrieval** (2023.09) [[pdf]](https://arxiv.org/abs/2308.12509)
- Yuan Yuan, Yang Zhan, Zhitong Xiong


- **Hypersphere-based remote sensing cross-modal text–image retrieval via curriculum learning** (2023.09) [[pdf]](https://ieeexplore.ieee.org/document/10261223)
- Weihang Zhang, Jihao Li, Shuoke Li, Jialiang Chen, Wenkai Zhang, Xin Gao, Xian Sun

- **RS5M: A Large Scale Vision-Language Dataset for Remote Sensing Vision-Language Foundation Model** (2023.06) [[pdf]](https://arxiv.org/abs/2306.11300)
- Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin


- **RemoteCLIP: A Vision Language Foundation Model for Remote Sensing** (2023.06) [[pdf]](https://arxiv.org/abs/2306.11029)
- Fan Liu, Delong Chen, Zhangqingyun Guan, Xiaocong Zhou, Jiale Zhu, Jun Zhou

- **Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval** (2023.06) [[pdf]](https://dl.acm.org/doi/abs/10.1145/3591106.3592236)
- Jiancheng Pan, Qing Ma, Cong Bai


- **Vision-Language Models in Remote Sensing: Current Progress and Future Trends** (2023.05) [[pdf]](https://arxiv.org/abs/2305.05726)
- Congcong Wen, Yuan Hu, Xiang Li, Zhenghang Yuan, Xiao Xiang Zhu

- **MCRN: A Multi-source Cross-modal Retrieval Network for remote sensing** (2022.12) [[pdf]](https://www.sciencedirect.com/science/article/pii/S156984322200259X)
- Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Yongqiang Mao, Ruixue Zhou, Hongqi Wang, Kun Fu, Xian Sun

- **RSVG: Exploring Data and Models for Visual Grounding on Remote Sensing Data** (2022.10) [[pdf]](https://arxiv.org/abs/2210.12634)
- Yang Zhan, Zhitong Xiong, Yuan Yuan


- **Learning to Evaluate Performance of Multi-modal Semantic Localization** (2022.09) [[pdf]](https://arxiv.org/abs/2209.06515)
- Zhiqiang Yuan, Wenkai Zhang, Chongyang Li, Zhaoying Pan, Yongqiang Mao, Jialiang Chen, Shouke Li, Hongqi Wang, Xian Sun


- **Knowledge-Aware Cross-Modal Text-Image Retrieval for Remote Sensing Images** (2022.09) [[pdf]](https://ceur-ws.org/Vol-3207/paper4.pdf)
- Li Mi, Siran Li, Christel Chappuis, Devis Tuia


- **CLIP-RS: A Cross-modal Remote Sensing Image Retrieval Based on CLIP, a Northern Virginia Case Study** (2022.05) [[pdf]](https://vtechworks.lib.vt.edu/handle/10919/110853)
- Larissa Djoufack Basso


- **Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval** (2022.04) [[pdf]](https://arxiv.org/abs/2204.09868)
- Zhiqiang Yuan, Wenkai Zhang, Kun Fu, Xuan Li, Chubo Deng, Hongqi Wang, Xian Sun


- **Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information** (2022.04) [[pdf]](https://arxiv.org/abs/2204.09860)
- Zhiqiang Yuan, Wenkai Zhang, Changyuan Tian, Xuee Rong, Zhengyuan Zhang, Hongqi Wang, Kun Fu, Xian Sun

- **Fine tuning CLIP with Remote Sensing (Satellite) images and captions** (2021.10) [[pdf]](https://huggingface.co/blog/fine-tune-clip-rsicd)
- Arto, Dev Vidhani, Goutham, Mayank Bhaskar, Sujit Pal