# StyleGAN2: projecting images
The goal of this [Google Colab](https://colab.research.google.com/) notebook is to project images to latent space with StyleGAN2.
## Usage
To discover how to project a real image using the original StyleGAN2 implementation, run:
- [`stylegan2_projecting_images.ipynb`][stylegan2_projecting_images]
[![Open In Colab][colab-badge]][stylegan2_projecting_images]

To process the projection of **a batch** of images, using either `W(1,*)` (original) or `W(18,*)` (extended), run:
- [`stylegan2_projecting_images_with_my_fork.ipynb`][stylegan2_projecting_images_with_fork]
[![Open In Colab][colab-badge]][stylegan2_projecting_images_with_fork]

To edit latent vectors of projected images, run:
- [`stylegan2_editing_latent_vectors.ipynb`][stylegan2_editing_latent_vectors]
[![Open In Colab][colab-badge]][stylegan2_editing_latent_vectors]

For more information about `W(1,*)` and `W(18,*)`, please refer to [the original paper][stylegan2-paper] (section 5 on page 7):
> Inverting the synthesis network $g$ is an interesting problem that has many applications.
> Manipulating a given image in the latent feature space requires finding a matching latent code $w$ for it first.

The following is about `W(18,*)`:
> Previous research suggests that instead of finding a common latent code $w$, the results improve if a separate $w$ is chosen for each layer of the generator.
> The same approach was used in an early encoder implementation.

The following is about `W(1,*)`, which is the approach used in the original implementation:
> While extending the latent space in this fashion finds a closer match to a given image, it also enables projecting arbitrary images that should have no latent representation.
> Instead, we concentrate on finding latent codes in the original, unextended latent space, as these correspond to images that the generator could have produced.

## Data
Data consists of:
- 1 picture of the French president Emmanuel Macron, found on [Nice Matin][french-president] ([archive][french-president-archive]),
- 37 individual pictures of the French government, found on [Wikipedia][french-government] ([list][french-government-archive]),
- 5 pictures of famous paintings, found on Wikipedia ([list][famous-paintings-archive]):
- [La Joconde](https://fr.wikipedia.org/wiki/La_Joconde),
- [Le Condottière](https://fr.wikipedia.org/wiki/Le_Condotti%C3%A8re_(Antonello_de_Messine)),
- [La Naissance de Vénus](https://fr.wikipedia.org/wiki/La_Naissance_de_V%C3%A9nus_(Botticelli)),
- [Ginevra de' Benci](https://fr.wikipedia.org/wiki/Portrait_de_Ginevra_de%27_Benci),
- [La Jeune Fille à la perle](https://fr.wikipedia.org/wiki/La_Jeune_Fille_%C3%A0_la_perle).

## Pre-processing
There are two possible pre-processing methods:
- either center-cropping (to 1024x1024 resolution) as sole pre-processing (sketched below),
- or the same pre-processing as for the [FFHQ dataset]:
1. first, an alignment based on 68 face landmarks returned by [dlib],
2. then reproduce `recreate_aligned_images()`, as detailed in [FFHQ pre-processing code].

Finally, the pre-processed image can be projected to the latent space of the StyleGAN2 model trained with configuration f on the Flickr-Faces-HQ (FFHQ) dataset.
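As an illustration of the first pre-processing method, here is a minimal center-cropping sketch with Pillow; the file names are hypothetical, and the FFHQ-style alignment is left to the dedicated code mentioned above.

```python
from PIL import Image

def center_crop(image_path, output_path, size=1024):
    # Crop the largest centered square, then resize to the 1024x1024
    # resolution expected by the FFHQ-trained StyleGAN2 model.
    img = Image.open(image_path)
    width, height = img.size
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.LANCZOS)
    img.save(output_path)

# Hypothetical file names, for illustration only:
center_crop("raw_images/emmanuel-macron_01.jpg", "cropped_images/emmanuel-macron_01.png")
```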
## Results: influence of pre-processing
NB: projection is not deterministic, so results differ between two runs, even with identical pre-processing.
### With center-cropping as sole pre-processing
The result below is obtained with center-cropping as sole pre-processing, hence some issues with the projection.
![Projection (with issues) as GIF](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/gif/movie0000-opt.gif)
From left to right: the target image, the result obtained at the start of the projection, and the final result of the projection.
![Projection results (with issues) as PNG](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/results/result0000.jpg)

From left to right: the target image, the result obtained at the start of the projection, intermediate results, and the final result.
The background, the hair, the ears, and the suit are relatively well reproduced, but the face is wrong: the neck in the original image is confused with the chin in the projected images.
It is possible that, compared to the FFHQ training dataset, the face is too small relative to the rest of the image, hence the poor projection results.

### With the same pre-processing as for the FFHQ dataset
The result below is obtained with the same pre-processing as for the FFHQ dataset, which avoids the projection issues mentioned above.
![Projection (without issues) as GIF](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/gif/movie0001-opt.gif)
From left to right: the target image, the result obtained at the start of the projection, and the final result of the projection.
![Projection results as PNG](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/results/result0001.jpg)

From left to right: the target image, the result obtained at the start of the projection, intermediate results, and the final result.
## Results: comparison with the extended projection
For the rest of the repository, the same pre-processing as for the FFHQ dataset is used.
### Shared data on Google Drive
Additional projection results are shown [on the Wiki][wiki-all-the-projections].
To make it easier to download them, they are also shared on [Google Drive][additional-projection-results].
The directory structure is as follows:
```
stylegan2_projections/
├ aligned_images/
├ └ emmanuel-macron_01.png # FFHQ-aligned image
├ generated_images_no_tiled/ # projections with `W(18,*)`
├ ├ emmanuel-macron_01.npy # - latent code
├ └ emmanuel-macron_01.png # - projected image
├ generated_images_tiled/ # projections with `W(1,*)`
├ ├ emmanuel-macron_01.npy # - latent code
├ └ emmanuel-macron_01.png # - projected image
├ aligned_images.tar.gz # folder archive
├ generated_images_no_tiled.tar.gz # folder archive
└ generated_images_tiled.tar.gz # folder archive
```

### Projection results
Images below allow us to compare results obtained with the original projection `W(1,*)` and the extended projection `W(18,*)`.
A projected image obtained with `W(18,*)` is expected to be closer to the target image, [at the expense of semantics][extended-projection-limitations].
If image fidelity is very important, the `W(18,*)` projection can be run for more iterations (the default is 1000 steps), but truncation might be needed for later applications.
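To make the two latent code shapes concrete, here is a minimal NumPy sketch; the file names follow the Google Drive layout above, and the exact array shapes stored in the shared `.npy` files are an assumption to verify.

```python
import numpy as np

# Latent code from the original projection W(1,*): one 512-d vector
# shared by all 18 generator layers ("tiled").
w_tiled = np.load("generated_images_tiled/emmanuel-macron_01.npy")        # assumed shape (1, 512)

# Latent code from the extended projection W(18,*): one 512-d vector
# per generator layer ("no_tiled").
w_extended = np.load("generated_images_no_tiled/emmanuel-macron_01.npy")  # assumed shape (18, 512)

# A W(1,*) code can be broadcast to the extended shape by repeating it
# across the 18 layers before feeding it to the synthesis network.
w_broadcast = np.tile(w_tiled, (18, 1))
assert w_broadcast.shape == (18, 512)
```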
#### French politicians
From top to bottom: aligned target image, projection with `W(1,*)`, projection with `W(18,*)`.
#### Art
From top to bottom: aligned target image, projection with `W(1,*)`, projection with `W(18,*)`.
## Applications
In the following, we assume that real images have been projected, so that we have access to their latent codes, of shape `(1, 512)` or `(18, 512)` depending on the projection method.
There are three main applications:
1. [morphing][wiki-application-morphing] (linear interpolation),
2. [style transfer][wiki-application-style-transfer] (crossover),
3. [expression transfer][wiki-application-expression-transfer] (adding a vector and a scaled difference vector).

### Shared data on Google Drive
Results corresponding to each application are:
- shown [on the Wiki][wiki-all-the-applications],
- shared on [Google Drive][google-drive-application-results].

The directory structure is as follows:
```
stylegan2_editing/
├ expression/ # expression transfer
| ├ no_tiled/ # - `W(18,*)`
| | └ expression_01_age.jpg # face n°1 ; age
| └ tiled/ # - `W(1,*)`
| └ expression_01_age.jpg
├ morphing/ # morphing
| ├ no_tiled/ # - `W(18,*)`
| | └ morphing_07_01.jpg # face n°7 to face n°1
| └ tiled/ # - `W(1,*)`
| └ morphing_07_01.jpg
├ style_mixing/ # style transfer
| ├ no_tiled/ # - `W(18,*)`
| | └ style_mixing_42-07-10-29-41_42-07-22-39.jpg
| └ tiled/ # - `W(1,*)`
| └ style_mixing_42-07-10-29-41_42-07-22-39.jpg
├ video_style_mixing/ # style transfer
| ├ no_tiled/ # - `W(18,*)`
| | └ video_style_mixing_000.000.jpg
| ├ tiled/ # - `W(1,*)`
| | └ video_style_mixing_000.000.jpg
| ├ no_tiled_small.mp4 # with 2 reference faces
| ├ no_tiled.mp4 # with 4 reference faces
| ├ tiled_small.mp4
| └ tiled.mp4
├ expression_transfer.tar.gz # folder archive
├ morphing.tar.gz # folder archive
├ style_mixing.tar.gz # folder archive
└ video_style_mixing.tar.gz # folder archive
```

### 1. Morphing
Morphing consists of a linear interpolation between two latent vectors (two faces).
Results are shown [on the Wiki][wiki-application-morphing].
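A minimal sketch of the interpolation with NumPy, assuming two latent codes of identical shape loaded as above:

```python
import numpy as np

def morph(w_source, w_target, num_frames=10):
    """Linearly interpolate between two latent codes of the same shape."""
    frames = []
    for t in np.linspace(0.0, 1.0, num_frames):
        frames.append((1.0 - t) * w_source + t * w_target)
    # Each frame is fed to the synthesis network to render one face of the morph.
    return frames
```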
#### With the original projection `W(1,*)`
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/tiled/morphing_42_22.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/tiled/morphing_42_18.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/tiled/morphing_07_22.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/tiled/morphing_07_18.jpg)

#### With the extended projection `W(18,*)`
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/no_tiled/morphing_42_22.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/no_tiled/morphing_42_18.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/no_tiled/morphing_07_22.jpg)
![Morphing](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/morphing/no_tiled/morphing_07_18.jpg)

### 2. Style transfer
Style transfer consists of a crossover of latent vectors at the layer level (cf. [this piece of code][style-transfer-code]).
There are 18 layers for the generator.
The latent vector of the reference face is used for the first 7 layers.
The latent vector of the face whose style has to be copied is used for the remaining 11 layers.

Results are shown [on the Wiki][wiki-application-style-transfer].
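A minimal sketch of the crossover with NumPy, assuming extended `W(18,*)` codes of shape `(18, 512)` and the 7/11 layer split described above:

```python
import numpy as np

def style_mix(w_reference, w_style, crossover_layer=7):
    # Layers 0..6 (coarse) keep the structure of the reference face;
    # layers 7..17 (finer) take the style of the second face.
    w_mixed = np.copy(w_reference)
    w_mixed[crossover_layer:] = w_style[crossover_layer:]
    return w_mixed  # shape (18, 512), fed to the synthesis network
```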
#### With the original projection `W(1,*)`
Thanks to morphing of the faces whose style is copied, style transfer can be [watched as a video][style-transfer-video-tiled].
![Style Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/video_style_mixing/tiled/video_style_mixing_006.017.jpg)
#### With the extended projection `W(18,*)`
Thanks to morphing of the faces whose style is copied, style transfer can be [watched as a video][style-transfer-video-no-tiled].
![Style Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/video_style_mixing/no_tiled/video_style_mixing_006.017.jpg)
### 3. Expression transfer
Expression transfer consists of adding:
- a latent vector (a face),
- a scaled difference vector (an expression); a minimal sketch follows the list of expressions below.

Expressions were defined, learnt, and [shared on GitHub][learnt-latent-directions] by a Chinese speaker:
1. age
1. angle_horizontal
1. angle_pitch
1. beauty
1. emotion_angry
1. emotion_disgust
1. emotion_easy
1. emotion_fear
1. emotion_happy
1. emotion_sad
1. emotion_surprise
1. eyes_open
1. face_shape
1. gender
1. glasses
1. height
1. race_black
1. race_white
1. race_yellow
1. smile
1. width

Results are shown [on the Wiki][wiki-application-expression-transfer].
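A minimal sketch of expression transfer with NumPy; the file names are hypothetical, mirroring the shared Google Drive layout and the learnt-latent-directions repository, and the direction is assumed to have a shape compatible with the latent code.

```python
import numpy as np

# Latent code of a projected face and a learnt direction (hypothetical paths).
w = np.load("generated_images_no_tiled/emmanuel-macron_01.npy")
age_direction = np.load("latent_directions/age.npy")

def transfer_expression(w, direction, coefficient):
    # Move the latent code along a learnt direction; the sign and magnitude
    # of the coefficient control the direction and strength of the edit.
    return w + coefficient * direction

w_younger = transfer_expression(w, age_direction, coefficient=-3.0)
w_older = transfer_expression(w, age_direction, coefficient=+3.0)
```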
#### With the original projection `W(1,*)`
- Age:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/tiled/expression_42_age.jpg)
- Smile:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/tiled/expression_42_smile.jpg)
- Age:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/tiled/expression_07_age.jpg)
- Smile:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/tiled/expression_07_smile.jpg)

#### With the extended projection `W(18,*)`
- Age:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/no_tiled/expression_42_age.jpg)
- Smile:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/no_tiled/expression_42_smile.jpg)
- Age:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/no_tiled/expression_07_age.jpg)
- Smile:
![Expression Transfer](https://raw.githubusercontent.com/wiki/woctezuma/stylegan2-projecting-images/applications/expression/no_tiled/expression_07_smile.jpg)

## References
- StyleGAN2:
- [StyleGAN2][stylegan2-official-repository] / [StyleGAN2-ADA][stylegan2-ada-official-repository] / [StyleGAN2-ADA-PyTorch][stylegan2-ada-pytorch-repository]
- [Steam-StyleGAN2][stylegan2-applied-to-steam-banners]
- My [fork][stylegan2-fork] of StyleGAN2 to project a batch of images, using any projection (original or extended)
- Programming resources:
- [rolux/stylegan2encoder][rolux-repository]: align faces based on detected landmarks (FFHQ pre-processing)
- Learnt [latent directions][learnt-latent-directions] tailored for StyleGAN2 (required for expression transfer)
- Minimal [example code][minimal-example-latent-edition] for morphing and expression transfer
- Experimenting materials:
- The website [ArtBreeder][artbreeder-website] by Joel Simon
- Colab [user interface][colab-user-interface] for extended projection and expression transfer
- A fast projector called [`encoder4editing`][repo-encoder4editing] and released in 2021 [![Open In Colab][colab-badge]][colab-encoder4editing]
- Reading materials:
- A blog post about editing projected images to add a [cartoon][toonify-blog-post] effect
- On the Wiki: [GIF editing][wiki-gif-editing] with [MoviePy][moviepy] and [Gifsicle][gifsicle]
- Papers:
- [Karras, Tero, et al. *A Style-Based Generator Architecture for Generative Adversarial Networks*. CVPR 2019.][stylegan1-paper]
- [Abdal, Rameen, et al. *Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?*. ICCV 2019.][image2stylegan-paper]
- [Karras, Tero, et al. *Analyzing and Improving the Image Quality of StyleGAN*. CVPR 2020.][stylegan2-paper]

[french-president]:
[french-president-archive]:
[french-government]:
[french-government-archive]:
[famous-paintings-archive]:
[FFHQ dataset]:
[dlib]:
[FFHQ pre-processing code]:
[stylegan2_projecting_images]:
[stylegan2_projecting_images_with_fork]:
[stylegan2_editing_latent_vectors]:
[lpips-paper]:
[image2stylegan-paper]:
[stylegan1-paper]:
[stylegan2-paper]:
[stylegan2-fork]:
[stylegan2-official-repository]:
[stylegan2-ada-official-repository]:
[stylegan2-ada-pytorch-repository]:
[stylegan2-applied-to-steam-banners]:
[rolux-repository]:
[learnt-latent-directions]:
[colab-user-interface]:
[artbreeder-website]:
[style-transfer-code]:
[additional-projection-results]:
[google-drive-application-results]:
[style-transfer-video-tiled]:
[style-transfer-video-no-tiled]:
[extended-projection-limitations]:
[minimal-example-latent-edition]:
[wiki-all-the-projections]:
[wiki-all-the-applications]:
[wiki-application-morphing]:
[wiki-application-style-transfer]:
[wiki-application-expression-transfer]:
[repo-encoder4editing]:
[colab-encoder4editing]:
[wiki-gif-editing]:
[moviepy]:
[gifsicle]:
[toonify-blog-post]:
[interfacegan]:
[ganspace]:
[ALAE]:
[closed-form]:
[rosasalberto-fork]:
[rosasalberto-sample-from-latents]:
[rosasalberto-edit-latents]:
[colab-badge]: