https://github.com/kyegomez/multimodalcrossattn
The open source implementation of the cross attention mechanism from the paper: "JOINTLY TRAINING LARGE AUTOREGRESSIVE MULTIMODAL MODELS"
- Host: GitHub
- URL: https://github.com/kyegomez/multimodalcrossattn
- Owner: kyegomez
- License: mit
- Created: 2023-10-02T05:00:03.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-03-11T10:50:11.000Z (over 1 year ago)
- Last Synced: 2025-05-02T02:03:51.446Z (6 months ago)
- Topics: artificial-intelligence, attention, attention-is-all-you-need, attention-mechanism, attn, gpt4, multimodal, multimodal-deep-learning
- Language: Python
- Homepage: https://discord.gg/qUtxnK2NMf
- Size: 223 KB
- Stars: 27
- Watchers: 2
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
[Join the Discord](https://discord.gg/qUtxnK2NMf)
# MultiModalCrossAttn
The open-source implementation of the cross-attention mechanism from the paper "Jointly Training Large Autoregressive Multimodal Models". [Paper Link](https://arxiv.org/pdf/2309.15564.pdf)
# Appreciation
* Lucidrains
* Agorians

# Install
`pip install cross-attn`

# Usage
```python
import torch
from cross_attn.main import MultiModalCrossAttention

# Test the MultiModalCrossAttention module
dim = 512      # embedding dimension
num_heads = 8  # number of attention heads
cross_attn = MultiModalCrossAttention(dim, num_heads)

# Batch size = 32, sequence length = 10, embedding dimension = 512
Hllm_sample = torch.randn(32, 10, dim)  # language-model hidden states
Himg_sample = torch.randn(32, 10, dim)  # image hidden states

output = cross_attn(Hllm_sample, Himg_sample)
print(output)
print(output.shape)  # Expected: torch.Size([32, 10, 512])
```
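For intuition about what such a module computes, below is a minimal, hypothetical sketch of a cross-attention fusion block built on `torch.nn.MultiheadAttention`, where the language hidden states provide the queries and the image hidden states provide the keys and values. The class name `SimpleCrossAttention` and its internals are illustrative assumptions, not the repo's actual `MultiModalCrossAttention` implementation, which may differ (e.g., the paper also explores gated and bidirectional variants).

```python
import torch
import torch.nn as nn

class SimpleCrossAttention(nn.Module):
    """Hypothetical sketch of a cross-attention fusion block.

    Language (LLM) hidden states attend to image hidden states. This is an
    assumption for illustration, not the repo's actual implementation.
    """

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, h_llm: torch.Tensor, h_img: torch.Tensor) -> torch.Tensor:
        # Queries come from the language stream; keys/values from the image stream.
        q = self.norm_q(h_llm)
        kv = self.norm_kv(h_img)
        attn_out, _ = self.attn(q, kv, kv)
        # Residual connection keeps the language stream's shape and information.
        return h_llm + attn_out

# Same shapes as the usage example above.
block = SimpleCrossAttention(dim=512, num_heads=8)
out = block(torch.randn(32, 10, 512), torch.randn(32, 10, 512))
print(out.shape)  # torch.Size([32, 10, 512])
```

The paper pairs this kind of cross-attention fusion with joint autoregressive training of the text and image streams; see the linked PDF for the exact architecture.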
# License
MIT

# Citations
```
@misc{2309.15564,
  Author = {Emanuele Aiello and Lili Yu and Yixin Nie and Armen Aghajanyan and Barlas Oguz},
  Title = {Jointly Training Large Autoregressive Multimodal Models},
  Year = {2023},
  Eprint = {arXiv:2309.15564},
}
```