# Prompt-to-Prompt

> *Latent Diffusion* and *Stable Diffusion* Implementation

![teaser](docs/teaser.png)
### [Project Page](https://prompt-to-prompt.github.io)   [Paper](https://prompt-to-prompt.github.io/ptp_files/Prompt-to-Prompt_preprint.pdf)

## Setup

This code was tested with Python 3.8 and [PyTorch](https://pytorch.org/) 1.11, using pre-trained models through [huggingface / diffusers](https://github.com/huggingface/diffusers#readme).
Specifically, we implemented our method over [Latent Diffusion](https://huggingface.co/CompVis/ldm-text2im-large-256) and [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion-v1-4).
Additional required packages are listed in the requirements file.
The code was tested on a Tesla V100 16GB but should work on other cards with at least **12GB** VRAM.

## Quickstart

To get started, we recommend taking a look at our notebooks: [**prompt-to-prompt_ldm**][p2p-ldm] and [**prompt-to-prompt_stable**][p2p-stable]. The notebooks contain end-to-end examples of using Prompt-to-Prompt on top of *Latent Diffusion* and *Stable Diffusion*, respectively, and show how to use the different types of prompt edits and the API.

## Prompt Edits

In our notebooks, the main logic is implemented by subclassing the abstract class `AttentionControl`, which has the following form:

``` python
import abc

class AttentionControl(abc.ABC):

    @abc.abstractmethod
    def forward(self, attn, is_cross: bool, place_in_unet: str):
        raise NotImplementedError
```

The `forward` method is called in each attention layer of the diffusion model during image generation, and we use it to modify the attention weights. Our method (see Section 3 of our [paper](https://arxiv.org/abs/2208.01626)) edits images through this hook, and each prompt edit type modifies the attention weights in a different manner.
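As an illustration, a minimal concrete controller could simply record the attention maps it receives and return them unchanged; the class and attribute names below are ours and not part of the released API:

``` python
class RecordingControl(AttentionControl):
    """Illustrative controller: stores attention maps without modifying them."""

    def __init__(self):
        super().__init__()
        self.maps = {"cross": [], "self": []}

    def forward(self, attn, is_cross: bool, place_in_unet: str):
        # Keep a copy of every map for later inspection, keyed by attention type.
        self.maps["cross" if is_cross else "self"].append(attn.detach().cpu())
        # Returning the map unchanged leaves generation untouched; the editing
        # controllers below return a modified map instead.
        return attn
```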

The general flow of our code is as follows, with variations based on the attention control type:

``` python
prompts = ["A painting of a squirrel eating a burger", ...]
# in practice, controller is one of the concrete AttentionControl subclasses below
controller = AttentionControl(prompts, ...)
run_and_display(prompts, controller, ...)
```

### Replacement
In this case, the user swaps tokens of the original prompt with others, e.g., editing the prompt `"A painting of a squirrel eating a burger"` to `"A painting of a squirrel eating a lasagna"` or `"A painting of a lion eating a burger"`. For this we define the class `AttentionReplace`.
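A replacement edit can then be sketched as follows (`num_steps` stands for the number of diffusion steps, and the constructor values are illustrative; the options are described under "Attention Control Options" below):

``` python
prompts = ["A painting of a squirrel eating a burger",
           "A painting of a squirrel eating a lasagna"]
# Inject the source cross-attention maps for 80% of the steps and the source
# self-attention maps for 40% of them, so only "burger" -> "lasagna" changes.
controller = AttentionReplace(prompts, num_steps,
                              cross_replace_steps=0.8,
                              self_replace_steps=0.4)
run_and_display(prompts, controller)
```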

### Refinement
In this case, the user adds new tokens to the prompt, e.g., editing the prompt `"A painting of a squirrel eating a burger"` to `"A watercolor painting of a squirrel eating a burger"`. For this we define the class `AttentionEditRefine`.
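A refinement edit follows the same pattern; a sketch, assuming the refinement controller accepts the same arguments as `AttentionReplace` (values illustrative):

``` python
prompts = ["A painting of a squirrel eating a burger",
           "A watercolor painting of a squirrel eating a burger"]
# The shared tokens keep the original content; the new "watercolor" tokens
# refine the style of the generated image.
controller = AttentionEditRefine(prompts, num_steps,
                                 cross_replace_steps=0.5,
                                 self_replace_steps=0.2)
run_and_display(prompts, controller)
```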

### Re-weight
In this case, the user changes the weight of certain tokens in the prompt, e.g., for the prompt `"A photo of a poppy field at night"`, strengthen or weaken the extent to which the word `night` affects the resulting image. For this we define the class `AttentionReweight`.
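A re-weighting edit additionally takes the `equalizer` option described below. The notebooks provide a helper for building the coefficient vector from selected words and scales; the call below (helper name, arguments, and values) should be read as illustrative:

``` python
prompts = ["A photo of a poppy field at night"] * 2
# Amplify the cross-attention of the word "night" by a factor of 5.
equalizer = get_equalizer(prompts[1], word_select=("night",), values=(5.0,))
controller = AttentionReweight(prompts, num_steps,
                               cross_replace_steps=0.8,
                               self_replace_steps=0.4,
                               equalizer=equalizer)
run_and_display(prompts, controller)
```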

## Attention Control Options
* `cross_replace_steps`: specifies the fraction of steps in which the cross-attention maps are edited. Can also be set to a dictionary `{str: float}` that maps words in the prompt to their own fractions.
* `self_replace_steps`: specifies the fraction of steps in which the self-attention maps are replaced.
* `local_blend` (optional): a `LocalBlend` object used to make local edits. `LocalBlend` is initialized with the words from each prompt that correspond to the image region we want to edit.
* `equalizer`: used for attention re-weighting only. A vector of coefficients by which each cross-attention weight is multiplied.
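Putting several of these options together, a replacement edit restricted to the edited object might look like the following sketch (the `"default_"` key for the per-word schedule and all concrete values follow the notebooks and are illustrative):

``` python
prompts = ["A painting of a squirrel eating a burger",
           "A painting of a squirrel eating a lasagna"]
# Blend only the image region associated with the food word in each prompt.
lb = LocalBlend(prompts, ("burger", "lasagna"))
controller = AttentionReplace(prompts, num_steps,
                              # inject cross-attention for 80% of the steps,
                              # but only 40% for the swapped word
                              cross_replace_steps={"default_": 0.8, "lasagna": 0.4},
                              self_replace_steps=0.4,
                              local_blend=lb)
run_and_display(prompts, controller)
```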

## Citation

``` bibtex
@article{hertz2022prompt,
  title   = {Prompt-to-Prompt Image Editing with Cross Attention Control},
  author  = {Hertz, Amir and Mokady, Ron and Tenenbaum, Jay and Aberman, Kfir and Pritch, Yael and Cohen-Or, Daniel},
  journal = {arXiv preprint arXiv:2208.01626},
  year    = {2022},
}
```

# Null-Text Inversion for Editing Real Images

### [Project Page](https://null-text-inversion.github.io/)   [Paper](https://arxiv.org/abs/2211.09794)

Null-text inversion enables intuitive text-based editing of **real images** with the Stable Diffusion model. We use an initial DDIM inversion as an anchor for our optimization, which tunes only the null-text embedding used in classifier-free guidance.

![teaser](docs/null_text_teaser.png)
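At a high level, the optimization can be sketched as follows. This is purely illustrative: `cfg_ddim_step` stands in for a single classifier-free-guidance DDIM step of the diffusers-based pipeline used in the released notebook, and all names and defaults below are ours:

``` python
import torch

def null_text_optimization(ddim_latents, timesteps, cond_emb, uncond_emb,
                           cfg_ddim_step, num_inner_steps=10, lr=1e-2):
    """Tune a per-timestep null-text embedding so that classifier-free-guidance
    sampling follows the DDIM-inversion trajectory of the real image.

    ddim_latents : [z*_T, ..., z*_0] from the initial DDIM inversion (the anchor).
    timesteps    : diffusion timesteps [T, ..., 1] matching that trajectory.
    cond_emb     : text embedding of the source prompt (kept fixed).
    uncond_emb   : initial null-text ("") embedding to be optimized.
    cfg_ddim_step: hypothetical helper (z_t, t, cond, uncond) -> z_{t-1},
                   one guided DDIM step of the pipeline.
    """
    null_embeddings = []
    z_t = ddim_latents[0]
    uncond = uncond_emb
    for i, t in enumerate(timesteps):
        uncond = uncond.detach().clone().requires_grad_(True)
        optimizer = torch.optim.Adam([uncond], lr=lr)
        target = ddim_latents[i + 1]  # z*_{t-1}: the next anchor latent
        for _ in range(num_inner_steps):
            z_prev = cfg_ddim_step(z_t, t, cond_emb, uncond)
            loss = torch.nn.functional.mse_loss(z_prev, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        null_embeddings.append(uncond.detach())  # one tuned embedding per step
        with torch.no_grad():
            z_t = cfg_ddim_step(z_t, t, cond_emb, uncond)  # advance along the path
    return null_embeddings
```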

## Editing Real Images

Prompt-to-Prompt editing of real images, using Null-text inversion first, is provided in this [**Notebook**][null_text].

``` bibtex
@article{mokady2022null,
  title   = {Null-text Inversion for Editing Real Images using Guided Diffusion Models},
  author  = {Mokady, Ron and Hertz, Amir and Aberman, Kfir and Pritch, Yael and Cohen-Or, Daniel},
  journal = {arXiv preprint arXiv:2211.09794},
  year    = {2022},
}
```

## Disclaimer

This is not an officially supported Google product.

[p2p-ldm]: prompt-to-prompt_ldm.ipynb
[p2p-stable]: prompt-to-prompt_stable.ipynb
[null_text]: null_text_w_ptp.ipynb