https://github.com/ailln/vqa-roadmap

🍌Visual Question Answering Roadmap.
roadmap visual-question-answering vqa

Host: GitHub
URL: https://github.com/ailln/vqa-roadmap
Owner: Ailln
License: MIT
Created: 2020-01-05T15:14:54.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2020-01-06T17:51:31.000Z (over 6 years ago)
Last Synced: 2025-06-14T09:06:44.759Z (12 months ago)
Topics: roadmap, visual-question-answering, vqa
Size: 3.91 KB
Stars: 3
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# vqa-roadmap

🍌Visual Question Answering Roadmap.

## Origin

- Official website: [visualqa.org](https://visualqa.org/index.html)

- VQA dataset: [Full release (v2.0)](https://visualqa.org/download.html)

- Paper: [VQA: Visual Question Answering (ICCV 2015)](https://arxiv.org/pdf/1505.00468.pdf)

## Reading List

### Survey

- paper:

  - [Visual Question Answering: Datasets, Algorithms, and Future Challenges](https://arxiv.org/pdf/1610.01465.pdf)

  - [Visual question answering: A survey of methods and datasets](https://arxiv.org/pdf/1607.05910.pdf)

  - [Survey of Visual Question Answering: Datasets and Techniques](https://arxiv.org/pdf/1705.03865.pdf)

- code:

  - [pythia](https://github.com/facebookresearch/pythia)

- blog:

  - [awesome-vqa](https://github.com/chingyaoc/awesome-vqa)

  - [awesome-visual-question-answering](https://github.com/jokieleung/awesome-visual-question-answering)

  - [Tutorial on Answering Questions about Images with Deep Learning](https://arxiv.org/pdf/1610.01076.pdf)

### Original VQA

- paper: [VQA: Visual Question Answering](https://arxiv.org/pdf/1505.00468.pdf)

- code: [Visual-Question-Answering](https://github.com/Axe--/Visual-Question-Answering)

- blog: None

### DFAF

- paper: [Dynamic Fusion with Intra- and Inter- Modality Attention Flow for Visual Question Answering](https://arxiv.org/pdf/1812.05252.pdf)

- code: None

- blog: [A VQA model that exploits intra- and inter-modality relations](https://zhuanlan.zhihu.com/p/65497795)

### MUREL

- paper: [MUREL: Multimodal Relational Reasoning for Visual Question Answering](https://arxiv.org/pdf/1902.09487.pdf)

- code: None

- blog: [A VQA model based on multimodal relational reasoning](https://zhuanlan.zhihu.com/p/60972299)

## Expert

- [Bolei Zhou (周博磊)](http://people.csail.mit.edu/bzhou/)

- [Qi Wu (吴琦)](http://www.qi-wu.me/)

## Reference

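- [Technical report of the 2017 VQA Challenge winner](https://zhuanlan.zhihu.com/p/29688475)

- [From the basics to the details: a close reading of Pythia, the winning model of the VQA 2018 Challenge](https://zhuanlan.zhihu.com/p/56505674)

- [An introduction to visual question answering (VQA) in one article](https://www.jianshu.com/p/76d2e081e303)

- [[CV+NLP] Smarter eyes: a survey of image captioning and visual question answering (VQA), part 1](https://zhuanlan.zhihu.com/p/52499758)

- [DeepMind proposes a new visual question answering model with 98.8% accuracy on CLEVR](https://zhuanlan.zhihu.com/p/41546921)

- [Academia | A panoramic overview of visual question answering: from datasets to techniques](https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650727250&idx=4&sn=161751cc5999099b1b217ed659a17aa2)

- [Tao Mei: Deep learning builds a bridge between vision and language](https://www.msra.cn/zh-cn/news/features/vision-and-language-20170713)

- [Introduction to Visual Question Answering, with recent papers](https://zhuanlan.zhihu.com/p/57207832)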
