
https://github.com/kyegomez/lumiere

Implementation of the text to video model LUMIERE from the paper: "A Space-Time Diffusion Model for Video Generation" by Google Research

agents ai artificial-intelligence machine-learning multi-modal swarms text-to-video ttv


[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Lumiere
Implementation of the text-to-video model LUMIERE from the paper "A Space-Time Diffusion Model for Video Generation" by Google Research. I will mostly be implementing the modules from diagrams (a) and (b) in Figure 4 of the paper.

## Install
`pip install lumiere`

## Usage
```python
import torch
from lumiere.model import AttentionBasedInflationBlock

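# Input video features with shape (batch, time, height, width, dim)
x = torch.randn(1, 4, 224, 224, 512)

# Build the attention-based inflation block
model = AttentionBasedInflationBlock(dim=512, heads=4, dropout=0.1)

# Forward pass
out = model(x)

# The output keeps the input shape
print(out.shape)  # Expected shape: [1, 4, 224, 224, 512]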
```
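
For intuition about the `(B, T, H, W, D)` layout used above, here is a minimal sketch of attention applied along the time axis only. It is an illustration under assumptions, not the code inside `lumiere.model`: the class name `TemporalAttentionSketch`, the pre-norm residual layout, and the small tensor sizes are all hypothetical choices made for this example.

```python
import torch
from torch import nn


class TemporalAttentionSketch(nn.Module):
    """Hypothetical illustration; not the lumiere package's implementation."""

    def __init__(self, dim: int, heads: int, dropout: float = 0.0):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(
            dim, heads, dropout=dropout, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, h, w, d = x.shape
        # Treat every spatial location as its own sequence over time:
        # (B, T, H, W, D) -> (B, H, W, T, D) -> (B*H*W, T, D)
        seq = x.permute(0, 2, 3, 1, 4).reshape(b * h * w, t, d)
        normed = self.norm(seq)
        # Self-attention across the T frames at each spatial location
        attended, _ = self.attn(normed, normed, normed, need_weights=False)
        seq = seq + attended  # residual connection
        # Restore the original (B, T, H, W, D) layout
        return seq.reshape(b, h, w, t, d).permute(0, 3, 1, 2, 4)


# Small sizes keep the example cheap to run
x = torch.randn(1, 4, 8, 8, 64)
block = TemporalAttentionSketch(dim=64, heads=4)
out = block(x)
print(out.shape)
```

Folding the spatial axes into the batch dimension means each attention call only sees `T` tokens, so the cost grows with the number of frames rather than with `H * W`. The output has the same shape as the input.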

## Test
```
poetry run pytest

# or
pytest tests/
```

## License
MIT