https://github.com/kyegomez/lumiere

Implementation of the text to video model LUMIERE from the paper: "A Space-Time Diffusion Model for Video Generation" by Google Research

Host: GitHub
URL: https://github.com/kyegomez/lumiere
Owner: kyegomez
License: mit
Created: 2024-02-05T16:50:40.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2025-01-27T16:47:02.000Z (over 1 year ago)
Last Synced: 2025-03-31T11:05:42.104Z (about 1 year ago)
Topics: agents, ai, artificial-intelligence, machine-learning, multi-modal, swarms, text-to-video, ttv
Language: Python
Homepage: https://discord.gg/GYbXvDGevY
Size: 2.18 MB
Stars: 51
Watchers: 4
Forks: 5
Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE

[![Multi-Modality](agorabanner.png)](https://discord.gg/qUtxnK2NMf)

# Lumiere 

An implementation of the text-to-video model LUMIERE from the Google Research paper "A Space-Time Diffusion Model for Video Generation". It mainly covers the modules shown in diagrams (a) and (b) of Figure 4 of the paper.

## Install

`pip install lumiere`

## Usage

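```python
import torch

from lumiere.model import AttentionBasedInflationBlock

# Input video features with shape (B, T, H, W, D):
# batch, frames, height, width, feature dimension
x = torch.randn(1, 4, 224, 224, 512)

# Build the block; dim=512 matches D above
model = AttentionBasedInflationBlock(dim=512, heads=4, dropout=0.1)

# Forward pass
out = model(x)

# The block is expected to preserve the input shape
print(out.shape)  # Expected shape: [1, 4, 224, 224, 512]
```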
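
To make the `(B, T, H, W, D)` layout above more concrete, here is a minimal sketch of attention applied along the time axis only, which is one plausible reading of what an attention-based inflation block does. It is not the code from `lumiere.model`: the class name `TemporalAttentionSketch`, the pre-norm residual layout, and the small tensor sizes are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class TemporalAttentionSketch(nn.Module):
    """Hypothetical stand-in (not the lumiere class): self-attention over frames."""

    def __init__(self, dim: int, heads: int, dropout: float = 0.0):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(
            dim, heads, dropout=dropout, batch_first=True
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, h, w, d = x.shape
        # Treat every spatial position as its own sequence of T frames:
        # (B, T, H, W, D) -> (B*H*W, T, D)
        seq = x.permute(0, 2, 3, 1, 4).reshape(b * h * w, t, d)
        normed = self.norm(seq)
        attended, _ = self.attn(normed, normed, normed, need_weights=False)
        seq = seq + attended  # residual connection (an assumption of this sketch)
        # Restore the original layout: (B*H*W, T, D) -> (B, T, H, W, D)
        return seq.reshape(b, h, w, t, d).permute(0, 3, 1, 2, 4)


# Small sizes so the sketch runs quickly on CPU
x = torch.randn(1, 4, 8, 8, 64)
block = TemporalAttentionSketch(dim=64, heads=4).eval()
out = block(x)
print(out.shape)  # torch.Size([1, 4, 8, 8, 64])
```

Folding the spatial positions into the batch dimension keeps the attention cost quadratic only in the number of frames `T`, at the price of running `B*H*W` short sequences.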

## Test

```
poetry run pytest

# or

pytest tests/
```

## License

MIT
