# path-vqa-blip

https://github.com/slinusc/path-vqa-blip

Fine-tuning BLIP for pathological visual question answering.

Topics: blip, multimodal-deep-learning, pathology


## README

### Abstract

This project fine-tunes the BLIP (Bootstrapping Language-Image Pre-training) model for pathological Visual Question Answering (VQA), with the goal of improving accuracy on yes/no questions about pathology images. Using the PathVQA dataset of 32,799 question-answer pairs drawn from 4,998 pathology images, the model was trained with the AdamW optimizer, learning-rate scheduling, and mixed-precision training. Hyperparameter optimization with Optuna yielded substantial gains: accuracy rose from 0.5164 to 0.8554, precision from 0.5344 to 0.8560, recall from 0.8122 to 0.8805, and the F1 score from 0.6447 to 0.8681.
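The training recipe described above (AdamW, learning-rate scheduling, mixed precision, Optuna search) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the `Salesforce/blip-vqa-base` checkpoint, the search ranges, `NUM_EPOCHS`, and the `train_loader` / `evaluate_accuracy` names are hypothetical stand-ins for the project's own data pipeline and evaluation code.

```python
import optuna
import torch
from torch.cuda.amp import GradScaler, autocast
from transformers import (BlipForQuestionAnswering, BlipProcessor,
                          get_linear_schedule_with_warmup)

CHECKPOINT = "Salesforce/blip-vqa-base"   # assumed starting checkpoint
NUM_EPOCHS = 3                            # illustrative value
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = BlipProcessor.from_pretrained(CHECKPOINT)

# Placeholders: a DataLoader yielding (PIL images, question strings, "yes"/"no"
# answers) from PathVQA, and a validation-accuracy helper. Both are hypothetical
# names standing in for the project's own data and evaluation code.
train_loader = ...
def evaluate_accuracy(model): ...


def objective(trial):
    # Hyperparameters searched by Optuna (ranges are illustrative).
    lr = trial.suggest_float("lr", 1e-5, 5e-5, log=True)
    weight_decay = trial.suggest_float("weight_decay", 0.0, 0.1)

    model = BlipForQuestionAnswering.from_pretrained(CHECKPOINT).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr, weight_decay=weight_decay)
    total_steps = len(train_loader) * NUM_EPOCHS
    scheduler = get_linear_schedule_with_warmup(
        optimizer, num_warmup_steps=100, num_training_steps=total_steps)
    scaler = GradScaler()  # loss scaling for mixed-precision training

    model.train()
    for _ in range(NUM_EPOCHS):
        for images, questions, answers in train_loader:
            inputs = processor(images=images, text=questions,
                               return_tensors="pt", padding=True).to(device)
            labels = processor(text=answers, return_tensors="pt",
                               padding=True).input_ids.to(device)

            optimizer.zero_grad()
            with autocast():                      # mixed-precision forward pass
                loss = model(**inputs, labels=labels).loss
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()
            scheduler.step()

    return evaluate_accuracy(model)  # validation accuracy guides the search


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("Best hyperparameters:", study.best_params)
```

In this sketch, `GradScaler` and `autocast` provide the mixed-precision speed-up on GPU, the linear warm-up scheduler smooths the early fine-tuning steps, and Optuna searches the learning rate and weight decay that maximize validation accuracy.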