https://github.com/thisisiron/QFormer_Pretraining
Implementation of Qformer pre-training
- Host: GitHub
- URL: https://github.com/thisisiron/QFormer_Pretraining
- Owner: thisisiron
- License: mit
- Created: 2024-09-24T14:30:09.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-11-18T13:32:56.000Z (6 months ago)
- Last Synced: 2025-03-16T06:43:29.187Z (2 months ago)
- Topics: blip-2, blip2, qformer, vision-language-model, vlm
- Language: Python
- Homepage:
- Size: 141 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Pre-training Q-Former
This repository contains code for pre-training Q-Former using the transformers library. It also supports converting pre-trained LAVIS BLIP-2 models to the PyTorch transformers format.

### Features
- Pre-train Q-Former from scratch using the transformers library
- Convert LAVIS BLIP-2 Q-Former weights to the transformers format
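For orientation, below is a minimal sketch of instantiating a Q-Former with the transformers library's `Blip2QFormerConfig` and `Blip2QFormerModel`. The hyperparameter values and tensor shapes are illustrative placeholders (following BLIP-2's defaults), not necessarily this repository's settings, and the repository's own training script may differ:

```python
import torch
from transformers import Blip2QFormerConfig, Blip2QFormerModel

# Illustrative config; values follow the BLIP-2 defaults (ViT-g features are 1408-d),
# used here only as placeholders.
config = Blip2QFormerConfig(
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    encoder_hidden_size=1408,      # width of the frozen image encoder
    cross_attention_frequency=2,   # cross-attend to image features every 2 layers
)
qformer = Blip2QFormerModel(config)

# 32 learned query tokens, as in BLIP-2 (randomly initialized here).
query_tokens = torch.nn.Parameter(torch.zeros(1, 32, config.hidden_size))

# Stand-in for frozen vision-encoder output: (batch, num_patches, encoder_hidden_size).
batch_size, num_patches = 2, 257
image_embeds = torch.randn(batch_size, num_patches, config.encoder_hidden_size)
image_mask = torch.ones(batch_size, num_patches, dtype=torch.long)

outputs = qformer(
    query_embeds=query_tokens.expand(batch_size, -1, -1),
    encoder_hidden_states=image_embeds,
    encoder_attention_mask=image_mask,
)
print(outputs.last_hidden_state.shape)  # torch.Size([2, 32, 768])
```

The query tokens stand in for BLIP-2's learned queries that cross-attend to frozen image features; in actual pre-training they are optimized jointly with the Q-Former under the image-text contrastive, image-text matching, and image-grounded text generation objectives.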
## Usage

### From LAVIS BLIP-2
To run the script for pre-training Q-Former from LAVIS, use the following command:
```
```
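Once a BLIP-2 checkpoint is in transformers format, its Q-Former and query tokens can be accessed directly. A minimal sketch, using the public Salesforce/blip2-opt-2.7b checkpoint from the Hugging Face Hub as a stand-in; a locally converted directory produced by this repository's script could presumably be passed instead (assumption, not verified against the script's output layout):

```python
from transformers import Blip2Model

# Placeholder checkpoint ID: the public Salesforce BLIP-2 release on the Hub.
model = Blip2Model.from_pretrained("Salesforce/blip2-opt-2.7b")

qformer = model.qformer            # Blip2QFormerModel holding the Q-Former weights
query_tokens = model.query_tokens  # learned query embeddings, shape (1, num_query_tokens, hidden_size)
print(type(qformer).__name__, tuple(query_tokens.shape))
```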
## Citation

```bibtex
@article{blip2,
title={BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models},
author={Junnan Li and Dongxu Li and Silvio Savarese and Steven Hoi},
journal={arXiv:2301.12597},
year={2023}
}
```