https://github.com/kyegomez/qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks.
ai artificial-intelligence attention-mechanism blip2 machine machine-learning multi-modal multi-modality
- Host: GitHub
- URL: https://github.com/kyegomez/qformer
- Owner: kyegomez
- License: mit
- Created: 2023-12-29T03:55:46.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-11-11T12:26:18.000Z (11 months ago)
- Last Synced: 2025-08-18T07:32:08.908Z (about 2 months ago)
- Topics: ai, artificial-intelligence, attention-mechanism, blip2, machine, machine-learning, multi-modal, multi-modality
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.19 MB
- Stars: 42
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# Qformer
Implementation of Qformer from BLIP2 in Zeta Lego blocks. The implementation follows Figure 2 of the BLIP-2 paper, in particular the image block and the text block.

## Install
`pip3 install qformer`

## Usage
```python
import torch
from qformer import QFormer

# Create a random tensor of shape (1, 32, 512)
x = torch.randn(1, 32, 512)

# Create a random image tensor of shape (1, 3, 224, 224)
img = torch.randn(1, 3, 224, 224)

# Create an instance of the QFormer model with the following parameters:
# - input_size: 512
# - num_heads: 8
# - num_layers: 8
# - dropout: 0.1
# - num_classes: 2
# - num_patches: 2
qformer = QFormer(512, 8, 8, 0.1, 2, 2)

# Apply the QFormer model to the input tensors x and img
y = qformer(x, img)

# Print the shape of the output tensor y
print(y.shape)
```
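
For intuition about the image block, here is a minimal sketch in plain PyTorch of the general idea described in the BLIP-2 paper: a fixed set of learned queries goes through self-attention, then cross-attends to features from a frozen image encoder, then passes through a feed-forward layer. This is not the code in this package. The class name `QueryCrossAttentionSketch`, its parameters, and the `(1, 196, 512)` feature shape are assumptions chosen for illustration only.

```python
import torch
from torch import nn


class QueryCrossAttentionSketch(nn.Module):
    """Hypothetical sketch: learned queries attend to image features."""

    def __init__(self, dim: int = 512, num_heads: int = 8, num_queries: int = 32):
        super().__init__()
        # Learned query embeddings, shared across all items in the batch.
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim) * 0.02)
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.norm3 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # image_feats: (batch, num_patches, dim), e.g. from a frozen image encoder.
        q = self.queries.expand(image_feats.size(0), -1, -1)

        # Queries interact with each other.
        attn_out, _ = self.self_attn(q, q, q)
        q = self.norm1(q + attn_out)

        # Queries read from the image features (keys and values).
        cross_out, _ = self.cross_attn(q, image_feats, image_feats)
        q = self.norm2(q + cross_out)

        # Position-wise feed-forward with a residual connection.
        return self.norm3(q + self.ffn(q))


# Stand-in for encoder output: 196 patch features of width 512 (assumed shape).
image_feats = torch.randn(1, 196, 512)
block = QueryCrossAttentionSketch()
out = block(image_feats)
print(out.shape)  # torch.Size([1, 32, 512])
```

The output length is set by the number of queries rather than by the number of image patches, which is why the result has 32 tokens regardless of how many patch features are passed in.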
# License
MIT

# Citation
```bibtex
@misc{li2023blip2,
title={BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models},
author={Junnan Li and Dongxu Li and Silvio Savarese and Steven Hoi},
year={2023},
eprint={2301.12597},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```