https://github.com/declare-lab/lg-vqa
- Host: GitHub
- URL: https://github.com/declare-lab/lg-vqa
- Owner: declare-lab
- Created: 2023-10-07T23:43:05.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-11-03T04:36:44.000Z (over 2 years ago)
- Last Synced: 2025-03-27T18:21:40.335Z (about 1 year ago)
- Language: Python
- Size: 29.4 MB
- Stars: 7
- Watchers: 2
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
# LG-VQA
This repository contains the code for the paper [Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts](https://arxiv.org/abs/2310.20159), published in Findings of EMNLP 2023.
# Usage
Download the images for the respective datasets and update the image paths in the `data/*/*.json` files.
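Updating the paths by hand can be tedious, so the following is a minimal sketch of one way to do it in bulk. It is not part of this repository. It assumes that each `data/*/*.json` file is a single JSON document and that the image paths stored in it share a common prefix; the function names `replace_prefix` and `update_image_paths` and the example prefixes are hypothetical. Because the exact schema of the files is not described here, the sketch rewrites any string value that starts with the old prefix instead of relying on a particular key name.

```python
import glob
import json


def replace_prefix(node, old_prefix, new_prefix):
    """Recursively rewrite every string value that starts with old_prefix."""
    if isinstance(node, dict):
        return {key: replace_prefix(value, old_prefix, new_prefix)
                for key, value in node.items()}
    if isinstance(node, list):
        return [replace_prefix(item, old_prefix, new_prefix) for item in node]
    if isinstance(node, str) and node.startswith(old_prefix):
        return new_prefix + node[len(old_prefix):]
    # Numbers, booleans, None, and non-matching strings are left untouched.
    return node


def update_image_paths(pattern, old_prefix, new_prefix):
    """Rewrite matching JSON files in place; return the files that changed.

    Assumption: every file matched by `pattern` holds one JSON document.
    """
    changed = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        updated = replace_prefix(data, old_prefix, new_prefix)
        if updated != data:
            with open(path, "w", encoding="utf-8") as handle:
                json.dump(updated, handle, indent=2, ensure_ascii=False)
            changed.append(path)
    return changed
```

A call such as `update_image_paths("data/*/*.json", "/old/image/root", "/my/image/root")` would then rewrite the files in place, where both prefixes are placeholders to replace with your own. Back up the `data` folder first, since the files are overwritten and re-serialized with a two-space indent.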
The code for guidance generation can be found in [guidance](https://github.com/declare-lab/LG-VQA/tree/main/guidance). We have pre-computed the guidances and uploaded them to the `data` folder.
The VQA models can be trained using the `train.py` script. Some example commands are shown in the `run.sh` file.