Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/giannisdaras/ylg
[CVPR 2020] Official Implementation: "Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models".
- Host: GitHub
- URL: https://github.com/giannisdaras/ylg
- Owner: giannisdaras
- License: gpl-3.0
- Created: 2019-11-27T16:41:10.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2022-11-22T08:50:05.000Z (almost 2 years ago)
- Last Synced: 2024-10-15T04:12:06.222Z (about 1 month ago)
- Topics: attention-mechanism, computer-vision, cvpr, cvpr20, cvpr2020, gans, generative-adversarial-networks, inverse-problems, machine-learning
- Language: Python
- Homepage:
- Size: 17.6 MB
- Stars: 136
- Watchers: 5
- Forks: 17
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
## Official Code: "Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models"
![ylg](https://img.shields.io/badge/ylg-Your%20Local%20GAN-brightgreen)
![Python 3.6](https://img.shields.io/badge/python-3.6-green.svg?style=plastic)
![tensorflow](https://img.shields.io/badge/tensorflow-2.0-brightgreen)

This repository hosts the official TensorFlow implementation of the paper: "**Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models**".
The paper is accepted at **CVPR 2020** (poster).
Abstract:
> We introduce a new local sparse attention layer that preserves two-dimensional geometry and locality. We show that by just replacing the dense attention layer of SAGAN with our construction, we obtain very significant FID, Inception score and pure visual improvements. FID score is improved from 18.65 to **15.94** on ImageNet, keeping all other parameters the same. The sparse attention patterns that we propose for our new layer are designed using a novel information theoretic criterion that uses information flow graphs.
> We also present a novel way to invert Generative Adversarial Networks with attention. Our method extracts from the attention layer of the discriminator a saliency map, which we use to construct a new loss function for the inversion. This allows us to visualize the newly introduced attention heads and show that they indeed capture interesting aspects of two-dimensional geometry of real images.

You can read the full paper [here](https://arxiv.org/abs/1911.12287).
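For intuition about what "two-dimensional locality" means for an attention mask, here is a minimal NumPy sketch (ours, not the repository's code): each query pixel is restricted to keys inside a small square window around it, so the sparsity pattern respects image geometry instead of the 1D band a naive sparse mask would give. The actual YLG patterns are designed with the paper's information-theoretic criterion; the window mask below is only an illustration.

```python
# Illustrative 2D-local sparse attention mask; NOT the YLG patterns themselves,
# which are derived from the paper's information flow criterion.
import numpy as np

def local_2d_attention_mask(height, width, radius=2):
    """Boolean [H*W, H*W] mask: entry (i, j) is True if query pixel i may attend to key pixel j."""
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)   # [H*W, 2] pixel coordinates
    dy = np.abs(coords[:, None, 0] - coords[None, :, 0])  # row distances between all pixel pairs
    dx = np.abs(coords[:, None, 1] - coords[None, :, 1])  # column distances between all pixel pairs
    return np.maximum(dy, dx) <= radius                   # square window (Chebyshev distance)

mask = local_2d_attention_mask(8, 8, radius=2)
print(mask.shape)     # (64, 64)
print(mask[0].sum())  # corner pixel (0, 0) attends to a 3x3 = 9 neighborhood
```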
### Teasers
#### Generated Images
![Teaser](./generated/collage.jpg)
*Images generated by YLG-SAGAN after 1M steps of training on ImageNet.*

#### Inverted images
![Teaser_inversion](./inversions/inverted.jpg)
*Left: Real image, Right: Inverted image using our technique.*

#### Latent space interpolation
![gif_teaser](maltese.gif)

*GIF generated by interpolating latent variables for Maltese dogs.*
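The interpolation behind this gif is simple latent-space blending. Here is a minimal sketch; the `generator` call in the comment is a hypothetical stand-in, not this repository's API:

```python
import numpy as np

def interpolate_latents(z0, z1, num_frames=16):
    """Return latents linearly interpolated between z0 and z1, one per frame."""
    return [(1.0 - t) * z0 + t * z1 for t in np.linspace(0.0, 1.0, num_frames)]

# Hypothetical usage: decode each interpolated latent with the generator
# and stitch the resulting images into a gif.
# frames = [generator(z, category="maltese_dog") for z in interpolate_latents(z0, z1)]
```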
## Explore our model
Probably the easiest way to explore our model is to play with it directly in [this Colab notebook](https://colab.research.google.com/drive/10MO4dVoQIhS1ZpeplWTnA4KVqkvqN4Jd).
However, trying it locally should also be easy if you follow the instructions below.
### Installation
We recommend installing YLG using an Anaconda virtual environment.
For installing Anaconda, refer to the [official docs](https://docs.anaconda.com/anaconda/install/).

First, create a new virtual environment with Python 3.6:
```
conda create -n ylg python=3.6
conda activate ylg
```

Next, install the project requirements:
```
pip install -r requirements.txt
```

### Pre-trained models
We make available a pre-trained model for YLG SAGAN, after 1M steps of training on ImageNet. If you want to try the model, download it from [here](https://drive.google.com/open?id=1Nikmw2WLcSnN_Yv0FbvwrZcjgu-HPkJH).
We recommend saving the pre-trained model under the `ylg/` folder, but you can also choose another location and set `pretrained_path` appropriately.

### Generate images
Generating images for any category of the ImageNet dataset is one command away. Just run `python generate_images.py --category=valley` to generate valleys! For a complete list of category names, please check the `categories.py` file.
There are several parameters that you can control, such as the number of generated images. You can discover them by running `python generate_images.py --help`.
As you can see, the model is able to generate some really good-looking images, but not all generated images are photo-realistic. We expect that training bigger architectures, such as BigGAN, with our 2-d local sparse attention layers will significantly improve the quality of the generated images.
### Invert your own images
In our paper, we present a new inversion technique: we extract a saliency map for the real image from the attention layer of the discriminator, and we use it to weight a novel loss function in the discriminator's embedding space. To the best of our knowledge, this is the first time that inversion of big models with attention has been achieved to a satisfying degree.
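As a rough sketch of this inversion loop in TensorFlow 2 (matching the repo's framework), assuming hypothetical helpers `generator`, `disc_features`, and `attention_saliency` that are stand-ins rather than this repository's actual API:

```python
import tensorflow as tf

def invert(real_image, generator, disc_features, attention_saliency, steps=500):
    """Optimize a latent z so that generator(z) matches real_image in the
    discriminator's embedding space, weighted by an attention saliency map."""
    z = tf.Variable(tf.random.normal([1, 128]))  # latent to optimize (dimension assumed)
    optimizer = tf.keras.optimizers.Adam(1e-2)
    saliency = attention_saliency(real_image)    # spatial weights from the attention layer
    target = disc_features(real_image)           # embedding of the real image (shapes assumed compatible)
    for _ in range(steps):
        with tf.GradientTape() as tape:
            recon = disc_features(generator(z))
            # Saliency-weighted squared distance in the discriminator's feature space.
            loss = tf.reduce_mean(saliency * tf.square(recon - target))
        grads = tape.gradient(loss, [z])
        optimizer.apply_gradients(zip(grads, [z]))
    return z
```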
You are one command away from trying it out! Just run `python inverse_image.py` to invert a cute Maltese dog saved in the `real_images/` folder. You can run it with your own images as well: `python inverse_image.py --image_path= --category=` is the command to run.
### Train from scratch
We totally understand that you might want to train your own model for a variety of reasons: experimentation with new modules, different datasets, etc. For that reason, we have created the branch `train`, which slightly changes the API of the Generator and Discriminator for training. You can check out this branch and then use the `train_experiment_main.py` script to train YLG from scratch. Please refer to the [instructions](https://github.com/tensorflow/gan/tree/master/tensorflow_gan/examples/self_attention_estimator) of the tensorflow-gan library for setting up your training environment (host VM, TPUs/GPUs, bucket, etc.), and feel free to open an issue if you encounter any problem so we can look into it.

## Acknowledgments
We would like to wholeheartedly thank the **TensorFlow Research Cloud (TFRC)** program that gave us access to v3-8 Cloud TPUs and GCP credits to train our models on ImageNet.
The code in this repository is heavily based on the [tensorflow-gan](https://github.com/tensorflow/gan) library. We add the library as a dependency and only re-implement the parts that need modification for YLG. Every file modified from tensorflow-gan has a header indicating that it is subject to the license of the tensorflow-gan library.