https://github.com/nateraw/stable-diffusion-videos

Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts

ai-art huggingface huggingface-diffusers machine-learning stable-diffusion

Last synced: 9 months ago

Host: GitHub
URL: https://github.com/nateraw/stable-diffusion-videos
Owner: nateraw
License: apache-2.0
Created: 2022-09-06T18:21:50.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-09-21T05:28:49.000Z (over 1 year ago)
Last Synced: 2025-05-10T09:39:31.435Z (9 months ago)
Topics: ai-art, huggingface, huggingface-diffusers, machine-learning, stable-diffusion
Language: Python
Homepage:
Size: 9.75 MB
Stars: 4,579
Watchers: 55
Forks: 441
Open Issues: 54
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-generative-ai - nateraw/stable-diffusion-videos
StarryDivineSky - nateraw/stable-diffusion-videos
awesome-stable-diffusion - stable-diffusion-videos

README

# stable-diffusion-videos

Try it yourself in Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nateraw/stable-diffusion-videos/blob/main/stable_diffusion_videos.ipynb)

**Example** - morphing between "blueberry spaghetti" and "strawberry spaghetti"

https://user-images.githubusercontent.com/32437151/188721341-6f28abf9-699b-46b0-a72e-fa2a624ba0bb.mp4

## Installation

```bash
pip install stable_diffusion_videos
```

## Usage

Check out the [examples](./examples) folder for example scripts 👀

### Making Videos

Note: On Apple M1 architecture, pass `torch.float32` as `torch_dtype` instead of `torch.float16`, as `torch.float16` is not available on MPS.

```python
from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    name='animals_test',        # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
```
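The `height` and `width` comments above suggest multiples of 64 above 512 and multiples of 8 below it. As a minimal sketch of that rule, the helper below snaps an arbitrary size down to an allowed value. `snap_dimension` is a hypothetical name, not part of `stable_diffusion_videos`, and rounding down (rather than up or to nearest) is an assumption made here for illustration.

```python
def snap_dimension(value: int) -> int:
    """Round `value` down to a size that follows the rule in the comments above.

    Hypothetical helper (not part of stable_diffusion_videos):
    sizes above 512 snap to multiples of 64, sizes of 512 or less to multiples of 8.
    """
    if value <= 0:
        raise ValueError("dimension must be positive")
    step = 64 if value > 512 else 8
    # Never return less than one full step, so tiny inputs still yield a usable size.
    return max(step, (value // step) * step)


height = snap_dimension(600)  # snaps to a multiple of 64
width = snap_dimension(500)   # snaps to a multiple of 8
print(height, width)
```

The resulting values could then be passed as `height=height, width=width` to `pipeline.walk`.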

### Making Music Videos

*New!* Music can be added to the video by providing a path to an audio file. The audio will inform the rate of interpolation so the videos move to the beat 🎶

```python
from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

# Seconds in the song.
audio_offsets = [146, 148]  # [Start, end]
fps = 30  # Use lower values for testing (5 or 10), higher values for better quality (30 or 60)

# Convert seconds to frames
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    guidance_scale=7.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is good default
)
```
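The seconds-to-frames conversion in the example above generalizes to more than two offsets: each pair of adjacent offsets becomes one interpolation segment. The sketch below isolates that arithmetic so it can be checked without loading a model. `offsets_to_interpolation_steps` is a hypothetical helper name, and rounding to whole frames is an assumption added here to handle fractional offsets.

```python
def offsets_to_interpolation_steps(audio_offsets, fps):
    """Convert song timestamps (seconds) into per-segment frame counts.

    Hypothetical helper mirroring the list comprehension in the example above.
    Each adjacent pair (start, end) yields (end - start) * fps frames,
    rounded to a whole number of frames (an assumption for fractional offsets).
    """
    return [
        round((end - start) * fps)
        for start, end in zip(audio_offsets, audio_offsets[1:])
    ]


# Three offsets define two segments: 146s -> 148s and 148s -> 150s.
steps = offsets_to_interpolation_steps([146, 148, 150], fps=30)
print(steps)  # [60, 60]
```

Following the pattern of the example (two prompts, one segment), three offsets would presumably pair with three prompts and three seeds; treat that as an assumption and check it against the library before relying on it.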

### Using the UI

```python
from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

interface = Interface(pipeline)
interface.launch()
```

## Credits

This work builds on [a script](https://gist.github.com/karpathy/00103b0037c5aaea32fe1da1af553355) shared by [@karpathy](https://github.com/karpathy). The script was modified into [this gist](https://gist.github.com/nateraw/c989468b74c616ebbc6474aa8cdd9e53), which was then updated and modified into this repo.

## Contributing

You can file any issues/feature requests [here](https://github.com/nateraw/stable-diffusion-videos/issues)

Enjoy 🤗
