https://github.com/kyegomez/qformer

Implementation of Qformer from BLIP2 in Zeta Lego blocks.

Host: GitHub
URL: https://github.com/kyegomez/qformer
Owner: kyegomez
License: mit
Created: 2023-12-29T03:55:46.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-11-11T12:26:18.000Z (11 months ago)
Last Synced: 2025-08-18T07:32:08.908Z (about 2 months ago)
Topics: ai, artificial-intelligence, attention-mechanism, blip2, machine, machine-learning, multi-modal, multi-modality
Language: Python
Homepage: https://discord.gg/GYbXvDGevY
Size: 2.19 MB
Stars: 42
Watchers: 2
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

README

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Qformer

Implementation of the Qformer (Q-Former) from BLIP-2, built from Zeta Lego blocks. It follows Figure 2 of the BLIP-2 paper, in particular the image block and the text block.
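
For orientation, here is a minimal sketch of the image side of a Q-Former-style block in plain PyTorch: a set of learned queries self-attends, cross-attends to image features, and passes through a feed-forward layer. This is an illustration only and not the code shipped in this package. The class name `QueryBlockSketch`, its parameters, and the 196-patch stand-in input are all assumptions made for the example, and the text block (where text tokens share self-attention with the queries) is left out.

```python
import torch
from torch import nn


class QueryBlockSketch(nn.Module):
    """Illustrative Q-Former-style block (hypothetical, not the `qformer` API)."""

    def __init__(self, dim: int = 512, heads: int = 8, num_queries: int = 32):
        super().__init__()
        # Learned query embeddings, shared across the batch.
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim) * 0.02)
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.norm3 = nn.LayerNorm(dim)

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # image_feats: (batch, num_patches, dim), e.g. from a frozen image encoder.
        q = self.queries.expand(image_feats.size(0), -1, -1)
        # Queries attend to each other.
        q = self.norm1(q + self.self_attn(q, q, q, need_weights=False)[0])
        # Queries read from the image features.
        q = self.norm2(
            q + self.cross_attn(q, image_feats, image_feats, need_weights=False)[0]
        )
        return self.norm3(q + self.ffn(q))


block = QueryBlockSketch()
image_feats = torch.randn(2, 196, 512)  # stand-in for encoder output
out = block(image_feats)
print(out.shape)  # torch.Size([2, 32, 512])
```

The output has one vector per query regardless of how many image patches come in, which is what lets a fixed-size set of queries summarize the image.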

## Install

`pip3 install qformer`
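
To confirm the install worked before running the usage example, a small check like the following may help. It only uses the standard library; the helper name `is_installed` is made up for this sketch, and it assumes the import name is `qformer`, as in the usage section.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found on the current import path."""
    return importlib.util.find_spec(module_name) is not None


print("qformer importable:", is_installed("qformer"))
```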

## Usage

```python
import torch
from qformer import QFormer

# Create a random tensor of shape (1, 32, 512)
x = torch.randn(1, 32, 512)

# Create a random image tensor of shape (1, 3, 224, 224)
img = torch.randn(1, 3, 224, 224)

# Create an instance of the QFormer model with the following parameters:
# - input_size: 512
# - num_heads: 8
# - num_layers: 8
# - dropout: 0.1
# - num_classes: 2
# - num_patches: 2
qformer = QFormer(512, 8, 8, 0.1, 2, 2)

# Apply the QFormer model to the input tensors x and img
y = qformer(x, img)

# Print the shape of the output tensor y
print(y.shape)
```
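
Because the model above is built with a dropout of 0.1, outputs vary from call to call unless the module is put in evaluation mode. The sketch below shows one way to run a forward pass for inference and count trainable parameters. The helpers `count_trainable_params` and `run_inference` are hypothetical names, and a small `nn.Sequential` stands in for the model so the snippet runs without this package; passing a `QFormer` instance with `x` and `img` instead should work the same way, assuming it is a regular `nn.Module`.

```python
import torch
from torch import nn


def count_trainable_params(model: nn.Module) -> int:
    """Sum the element counts of all parameters that require gradients."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


@torch.no_grad()
def run_inference(model: nn.Module, *inputs: torch.Tensor) -> torch.Tensor:
    """Run one forward pass without gradients; leaves the model in eval mode."""
    model.eval()  # disables dropout
    return model(*inputs)


# Stand-in module for illustration only.
stand_in = nn.Sequential(nn.Linear(512, 2), nn.Dropout(0.1))
y = run_inference(stand_in, torch.randn(1, 32, 512))
print(y.shape, count_trainable_params(stand_in))
```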

# License

MIT

# Citation

```bibtex
@misc{li2023blip2,
    title={BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models},
    author={Junnan Li and Dongxu Li and Silvio Savarese and Steven Hoi},
    year={2023},
    eprint={2301.12597},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```
