https://github.com/kyegomez/multimodalcrossattn
The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"
- Host: GitHub
- URL: https://github.com/kyegomez/multimodalcrossattn
- Owner: kyegomez
- License: mit
- Created: 2023-10-02T05:00:03.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-11T10:50:11.000Z (over 1 year ago)
- Last Synced: 2025-05-02T02:03:51.446Z (6 months ago)
- Topics: artificial-intelligence, attention, attention-is-all-you-need, attention-mechanism, attn, gpt4, multimodal, multimodal-deep-learning
- Language: Python
- Homepage: https://discord.gg/qUtxnK2NMf
- Size: 223 KB
- Stars: 27
- Watchers: 2
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Join the Discord](https://discord.gg/qUtxnK2NMf)
# MultiModalCrossAttn
The open-source implementation of the cross-attention mechanism from the paper "Jointly Training Large Autoregressive Multimodal Models". [Paper Link](https://arxiv.org/pdf/2309.15564.pdf)
# Appreciation
* Lucidrains
* Agorians

# Install
`pip install cross-attn`

# Usage
```python
import torch
from cross_attn.main import MultiModalCrossAttention

# Test the MultiModalCrossAttention module
dim = 512      # embedding dimension
num_heads = 8  # number of attention heads
cross_attn = MultiModalCrossAttention(dim, num_heads)

# Batch size = 32, sequence length = 10, embedding dimension = 512
Hllm_sample = torch.randn(32, 10, dim)  # language-model hidden states
Himg_sample = torch.randn(32, 10, dim)  # image hidden states

output = cross_attn(Hllm_sample, Himg_sample)
print(output)
print(output.shape)  # Expected: torch.Size([32, 10, 512])
```
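For intuition about what such a module computes, below is a minimal, hypothetical sketch of a cross-attention fusion block built on `torch.nn.MultiheadAttention`, where the language hidden states provide the queries and the image hidden states provide the keys and values. The class name `SimpleCrossAttention` and its internals are illustrative assumptions, not the repo's actual `MultiModalCrossAttention` implementation, which may differ (e.g., the paper also explores gated and bidirectional variants).

```python
import torch
import torch.nn as nn

class SimpleCrossAttention(nn.Module):
    """Hypothetical sketch of a cross-attention fusion block.

    Language (LLM) hidden states attend to image hidden states. This is an
    assumption for illustration, not the repo's actual implementation.
    """

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, h_llm: torch.Tensor, h_img: torch.Tensor) -> torch.Tensor:
        # Queries come from the language stream; keys/values from the image stream.
        q = self.norm_q(h_llm)
        kv = self.norm_kv(h_img)
        attn_out, _ = self.attn(q, kv, kv)
        # Residual connection keeps the language stream's shape and information.
        return h_llm + attn_out

# Same shapes as the usage example above.
block = SimpleCrossAttention(dim=512, num_heads=8)
out = block(torch.randn(32, 10, 512), torch.randn(32, 10, 512))
print(out.shape)  # torch.Size([32, 10, 512])
```

The paper pairs this kind of cross-attention fusion with joint autoregressive training of the text and image streams; see the linked PDF for the exact architecture.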
# License
MIT

# Citations
```
@misc{2309.15564,
  Author = {Emanuele Aiello and Lili Yu and Yixin Nie and Armen Aghajanyan and Barlas Oguz},
  Title = {Jointly Training Large Autoregressive Multimodal Models},
  Year = {2023},
  Eprint = {arXiv:2309.15564},
}
```