https://github.com/kyegomez/lumiere
Implementation of the text to video model LUMIERE from the paper: "A Space-Time Diffusion Model for Video Generation" by Google Research
- Host: GitHub
- URL: https://github.com/kyegomez/lumiere
- Owner: kyegomez
- License: mit
- Created: 2024-02-05T16:50:40.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-01-27T16:47:02.000Z (over 1 year ago)
- Last Synced: 2025-03-31T11:05:42.104Z (about 1 year ago)
- Topics: agents, ai, artificial-intelligence, machine-learning, multi-modal, swarms, text-to-video, ttv
- Language: Python
- Homepage: https://discord.gg/GYbXvDGevY
- Size: 2.18 MB
- Stars: 51
- Watchers: 4
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
README
# Lumiere
Implementation of the text-to-video model LUMIERE from the paper "A Space-Time Diffusion Model for Video Generation" by Google Research. I will mostly be implementing the modules shown in diagrams (a) and (b) of Figure 4.
## Install
`pip install lumiere`
## Usage
```python
import torch
from lumiere.model import AttentionBasedInflationBlock
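# Input features with shape (batch, frames, height, width, dim)
x = torch.randn(1, 4, 224, 224, 512)
# Build the attention-based inflation block
model = AttentionBasedInflationBlock(dim=512, heads=4, dropout=0.1)
# Forward pass
out = model(x)
# The output is expected to keep the input shape
print(out.shape)  # Expected shape: [1, 4, 224, 224, 512]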
```
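For intuition about the `(B, T, H, W, D)` shape contract above, attention along the time axis can be sketched in plain PyTorch. This is a minimal illustration, not the code in `lumiere.model`: the class name `TemporalAttentionSketch` is hypothetical, and the sketch assumes that attention is applied over frames independently at each spatial location, with a pre-norm and a residual connection.

```python
import torch
from torch import nn


class TemporalAttentionSketch(nn.Module):
    """Hypothetical illustration: self-attention along the time axis only."""

    def __init__(self, dim: int, heads: int, dropout: float = 0.0):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(
            dim, heads, dropout=dropout, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, h, w, d = x.shape
        # Fold spatial positions into the batch: (B, T, H, W, D) -> (B*H*W, T, D)
        seq = x.permute(0, 2, 3, 1, 4).reshape(b * h * w, t, d)
        normed = self.norm(seq)
        # Each spatial location attends across its own T frames
        attended, _ = self.attn(normed, normed, normed, need_weights=False)
        seq = seq + attended  # residual connection (an assumption of this sketch)
        # Restore the layout: (B*H*W, T, D) -> (B, T, H, W, D)
        return seq.reshape(b, h, w, t, d).permute(0, 3, 1, 2, 4)


# Small sizes keep the example cheap to run
x = torch.randn(1, 4, 8, 8, 64)
block = TemporalAttentionSketch(dim=64, heads=4)
out = block(x)
print(out.shape)  # torch.Size([1, 4, 8, 8, 64])
```

Folding `H * W` into the batch dimension keeps the attention cost quadratic in the number of frames only, which is why the sketch uses small spatial sizes rather than `224 x 224`.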
## Test
```bash
poetry run pytest
# or
pytest tests/
```
## License
MIT