Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/xiaoman-zhang/PMC-VQA
PMC-VQA is a large-scale medical visual question-answering dataset, which contains 227k VQA pairs of 149k images that cover various modalities and diseases.
- Host: GitHub
- URL: https://github.com/xiaoman-zhang/PMC-VQA
- Owner: xiaoman-zhang
- License: mit
- Created: 2023-05-16T03:31:15.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-03-21T11:34:27.000Z (9 months ago)
- Last Synced: 2024-08-01T13:17:44.821Z (4 months ago)
- Language: Python
- Size: 290 KB
- Stars: 158
- Watchers: 3
- Forks: 12
- Open Issues: 13
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
Awesome Lists containing this project
- awesome-multimodal-in-medical-imaging - PMC-VQA
README
# PMC-VQA
The official code for [**PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering**](https://arxiv.org/pdf/2305.10415.pdf)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/pmc-vqa-visual-instruction-tuning-for-medical/medical-visual-question-answering-on-pmc-vqa)](https://paperswithcode.com/sota/medical-visual-question-answering-on-pmc-vqa?p=pmc-vqa-visual-instruction-tuning-for-medical)[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/pmc-vqa-visual-instruction-tuning-for-medical/medical-visual-question-answering-on-vqa-rad)](https://paperswithcode.com/sota/medical-visual-question-answering-on-vqa-rad?p=pmc-vqa-visual-instruction-tuning-for-medical)
We propose a generative model for medical visual understanding that aligns visual information from a pre-trained vision encoder with a large language model, and we establish a scalable pipeline to construct PMC-VQA, a large-scale medical visual question-answering dataset containing 227k VQA pairs over 149k images that cover various modalities and diseases.
The dataset is available at [Huggingface](https://huggingface.co/datasets/xmcmic/PMC-VQA/).
The model checkpoints are available at [MedVInT-TE](https://huggingface.co/xmcmic/MedVInT-TE/) and [MedVInT-TD](https://huggingface.co/xmcmic/MedVInT-TD/).
**The previous checkpoint of MedVInT-TD was mistakenly uploaded.
We have rectified the issue and updated the model's checkpoint on July 31.
Now, you can access the correct and improved version of the model.**

- [PMC-VQA](#pmc-vqa)
- [Usage](#usage)
- [1. Create Environment](#1-create-environment)
- [2. Prepare Dataset](#2-prepare-dataset)
- [3. Model Checkpoints](#3-model-checkpoints)
- [Acknowledgement](#acknowledgement)
- [Contribution](#contribution)
- [Citation](#citation)

## Usage
### 1. Create Environment
Please refer to https://github.com/chaoyi-wu/PMC-LLaMA for environment setup.
### 2. Prepare Dataset
Download the dataset from [Huggingface](https://huggingface.co/datasets/xmcmic/PMC-VQA/) and save it into `./PMC-VQA`.
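If you prefer to fetch it from a script, here is a minimal sketch using `huggingface_hub.snapshot_download` (an illustration, not part of the official instructions; it downloads the full dataset repository, including the zipped images, so make sure you have enough disk space):

```
# Minimal sketch: download the PMC-VQA dataset repository into ./PMC-VQA.
# Assumes `pip install huggingface_hub`; check the dataset card for the exact file layout.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xmcmic/PMC-VQA",
    repo_type="dataset",
    local_dir="./PMC-VQA",
)
```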
### 3. Model Checkpoints
Download the pre-trained [MedVInT-TE](https://huggingface.co/xmcmic/MedVInT-TE/) and save it into `./src/MedVInT_TE/Results` directly.
Download the pre-trained [MedVInT-TD](https://huggingface.co/xmcmic/MedVInT-TD/) and save it into `./src/MedVInT_TD/Results` directly.
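The same approach works for the checkpoints. A minimal sketch (the target directories simply mirror the paths above; adjust them if your layout differs):

```
# Minimal sketch: fetch both MedVInT checkpoints into the Results folders used above.
# Assumes `pip install huggingface_hub`.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="xmcmic/MedVInT-TE", local_dir="./src/MedVInT_TE/Results")
snapshot_download(repo_id="xmcmic/MedVInT-TD", local_dir="./src/MedVInT_TD/Results")
```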
See [MedVInT_TE](./src/MedVInT_TE/README.md) and [MedVInT_TD](./src/MedVInT_TD/README.md) for the details of training **MedVInT_TE** and **MedVInT_TD**.
## Acknowledgement
CLIP -- https://github.com/openai/CLIP
PMC-CLIP -- https://github.com/WeixiongLin/PMC-CLIP
PMC-LLaMA -- https://github.com/chaoyi-wu/PMC-LLaMA
LLaMA: Open and Efficient Foundation Language Models -- https://arxiv.org/abs/2302.13971
We thank the authors for their open-sourced code and encourage users to cite their works when applicable.
## Contribution
Please raise an issue if you need help; any contributions are welcome.
## Citation
If you use this code or our pre-trained weights for your research, please cite our [paper](https://arxiv.org/abs/2305.10415):
```
@article{zhang2023pmcvqa,
title={PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering},
author={Xiaoman Zhang and Chaoyi Wu and Ziheng Zhao and Weixiong Lin and Ya Zhang and Yanfeng Wang and Weidi Xie},
year={2023},
journal={arXiv preprint arXiv:2305.10415},
}
```