
# PMC-VQA
The official code for [**PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering**](https://arxiv.org/pdf/2305.10415.pdf)


[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/pmc-vqa-visual-instruction-tuning-for-medical/medical-visual-question-answering-on-pmc-vqa)](https://paperswithcode.com/sota/medical-visual-question-answering-on-pmc-vqa?p=pmc-vqa-visual-instruction-tuning-for-medical)

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/pmc-vqa-visual-instruction-tuning-for-medical/medical-visual-question-answering-on-vqa-rad)](https://paperswithcode.com/sota/medical-visual-question-answering-on-vqa-rad?p=pmc-vqa-visual-instruction-tuning-for-medical)

We propose a generative model for medical visual understanding that aligns visual information from a pre-trained vision encoder with a large language model. We also establish a scalable pipeline to construct PMC-VQA, a large-scale medical visual question-answering dataset containing 227k VQA pairs over 149k images, covering various modalities and diseases.
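As a minimal conceptual sketch of this kind of alignment (not the released MedVInT code; the module name, dimensions, and prefix length below are illustrative assumptions), visual features from a frozen encoder can be projected into the LLM embedding space and prepended as prefix tokens before generative decoding:

```python
import torch
import torch.nn as nn

class VisualPrefixModel(nn.Module):
    """Toy illustration: project frozen vision-encoder features into the
    LLM embedding space and prepend them as prefix tokens. All names and
    dimensions are placeholders, not the authors' implementation."""

    def __init__(self, vision_dim=768, llm_dim=4096, num_prefix_tokens=32):
        super().__init__()
        # Maps visual patch features to the LLM's token-embedding dimension.
        self.projector = nn.Linear(vision_dim, llm_dim)
        self.num_prefix_tokens = num_prefix_tokens

    def forward(self, visual_feats, text_embeds):
        # visual_feats: (batch, num_patches, vision_dim) from a frozen encoder
        # text_embeds:  (batch, seq_len, llm_dim) from the LLM embedding layer
        prefix = self.projector(visual_feats[:, : self.num_prefix_tokens])
        # Concatenate the visual prefix with the text embeddings.
        return torch.cat([prefix, text_embeds], dim=1)

# Shape check with random tensors.
model = VisualPrefixModel()
fused = model(torch.randn(2, 196, 768), torch.randn(2, 20, 4096))
print(fused.shape)  # torch.Size([2, 52, 4096])
```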

The dataset is available at [Huggingface](https://huggingface.co/datasets/xmcmic/PMC-VQA/).

The model checkpoints are available at [MedVInT-TE](https://huggingface.co/xmcmic/MedVInT-TE/) and [MedVInT-TD](https://huggingface.co/xmcmic/MedVInT-TD/).
**Note: an incorrect checkpoint of MedVInT-TD was previously uploaded.
We fixed the issue and updated the checkpoint on July 31,
so the correct and improved version of the model is now available.**

- [PMC-VQA](#pmc-vqa)
- [Usage](#usage)
- [1. Create Environment](#1-create-environment)
- [2. Prepare Dataset](#2-prepare-dataset)
- [3. Model Checkpoints](#3-model-checkpoints)
- [Acknowledgement](#acknowledgement)
- [Contribution](#contribution)
- [Cite](#cite)

## Usage

### 1. Create Environment

Please refer to https://github.com/chaoyi-wu/PMC-LLaMA for environment setup.
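After following that setup, a quick sanity check might look like the sketch below (the packages `torch` and `transformers` are assumptions based on the linked repository; exact versions are not pinned here):

```python
# Quick sanity check after installing the environment from PMC-LLaMA.
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```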

### 2. Prepare Dataset

Download the dataset from [Huggingface](https://huggingface.co/datasets/xmcmic/PMC-VQA/) and save it into `./PMC-VQA`.
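One way to fetch the files is with `huggingface_hub` (a hedged sketch; any download method that places the data under `./PMC-VQA` works):

```python
# Download the PMC-VQA dataset repository into ./PMC-VQA.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="xmcmic/PMC-VQA",
    repo_type="dataset",
    local_dir="./PMC-VQA",
)
```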

### 3. Model Checkpoints

Download the pre-trained [MedVInT-TE](https://huggingface.co/xmcmic/MedVInT-TE/) and save it into the `./src/MedVInT_TE/Results` directory.

Download the pre-trained [MedVInT-TD](https://huggingface.co/xmcmic/MedVInT-TD/) and save it into the `./src/MedVInT_TD/Results` directory.
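For example, both checkpoints can be fetched with `huggingface_hub` (a sketch under the assumption that the files simply need to land in the directories above):

```python
# Download both checkpoint repositories into the expected Results directories.
from huggingface_hub import snapshot_download

snapshot_download(repo_id="xmcmic/MedVInT-TE", local_dir="./src/MedVInT_TE/Results")
snapshot_download(repo_id="xmcmic/MedVInT-TD", local_dir="./src/MedVInT_TD/Results")
```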

See [MedVInT_TE](./src/MedVInT_TE/README.md) and [MedVInT_TD](./src/MedVInT_TD/README.md) for the details of training **MedVInT_TE** and **MedVInT_TD**.

## Acknowledgement

CLIP -- https://github.com/openai/CLIP

PMC-CLIP -- https://github.com/WeixiongLin/PMC-CLIP

PMC-LLaMA -- https://github.com/chaoyi-wu/PMC-LLaMA

LLaMA: Open and Efficient Foundation Language Models -- https://arxiv.org/abs/2302.13971

We thank the authors for their open-sourced code and encourage users to cite their works when applicable.

## Contribution

Please raise an issue if you need help; any contributions are welcome.

## Citation

If you use this code or our pre-trained weights in your research, please cite our [paper](https://arxiv.org/abs/2305.10415):

```
@article{zhang2023pmcvqa,
  title={PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering},
  author={Xiaoman Zhang and Chaoyi Wu and Ziheng Zhao and Weixiong Lin and Ya Zhang and Yanfeng Wang and Weidi Xie},
  journal={arXiv preprint arXiv:2305.10415},
  year={2023},
}
```