Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/giovanni-gatti/dcgan-celeba
Notebook to train a DCGAN model on the CelebA dataset using PyTorch.
- Host: GitHub
- URL: https://github.com/giovanni-gatti/dcgan-celeba
- Owner: giovanni-gatti
- Created: 2023-06-13T14:55:40.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-04T23:40:01.000Z (about 1 year ago)
- Last Synced: 2024-12-08T10:11:31.879Z (about 2 months ago)
- Topics: deep-learning, generative-adversarial-network, pytorch
- Language: Jupyter Notebook
- Homepage:
- Size: 622 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# DCGAN CelebA
## Project Description
The notebook shows how to implement and train a Deep Convolutional GAN (DCGAN) to perform image synthesis on the CelebA (CelebFaces Attributes) dataset, which contains just over 200 thousand face images of more than 10 thousand celebrities with varied poses and backgrounds. The code is developed entirely with the PyTorch library.
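As a rough illustration (not the notebook's exact code), the dataset can be loaded with torchvision's built-in `CelebA` class together with the preprocessing described later in this README; the image size, normalization values, and loader settings below are assumptions:

```python
import torch
from torchvision import datasets, transforms

IMG_SIZE = 64  # assumption: matches the 64 x 64 resolution used in the training example

transform = transforms.Compose([
    transforms.Resize(IMG_SIZE),
    transforms.CenterCrop(IMG_SIZE),
    transforms.RandomHorizontalFlip(),           # the only augmentation mentioned below
    transforms.ToTensor(),                       # scales pixel values to [0, 1]
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # rescales to [-1, 1], matching a tanh generator output
])

# torchvision ships a ready-made CelebA dataset; the images can also be
# downloaded manually and loaded with datasets.ImageFolder instead.
celeba = datasets.CelebA(root="data", split="train", transform=transform, download=True)
loader = torch.utils.data.DataLoader(celeba, batch_size=128, shuffle=True, num_workers=2)
```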
### Model Architecture
The proposed implementation tries to follow the guidelines set out in the original DCGAN paper, but also includes some more advanced regularization techniques. In detail, the generator is composed of fractional-strided (transposed) convolution layers that upsample the latent vector, while the discriminator consists of plain strided convolutions. Fully connected layers are avoided, and batch normalization is applied after every convolution layer in both the generator and the discriminator. ReLU activations are used in the generator (only the final layer uses a hyperbolic tangent), while LeakyReLU activations are used in the discriminator. The optimizer of choice is Adam, the loss function is a modified binary cross-entropy (BCE), and all weights are initialized from a Normal distribution.
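As an illustration, here is a minimal PyTorch sketch of such a generator and discriminator, assuming the canonical DCGAN channel progression for 64 × 64 images and the latent size of 128 used in the training example below; the notebook's exact layer counts and channel widths may differ:

```python
import torch
import torch.nn as nn

Z_DIM = 128  # assumption: latent vector size from the training example

class Generator(nn.Module):
    def __init__(self, z_dim=Z_DIM, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            # latent vector z: (z_dim, 1, 1) -> (ch*8, 4, 4)
            nn.ConvTranspose2d(z_dim, ch * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ch * 8), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 8, ch * 4, 4, 2, 1, bias=False),  # -> (ch*4, 8, 8)
            nn.BatchNorm2d(ch * 4), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 4, ch * 2, 4, 2, 1, bias=False),  # -> (ch*2, 16, 16)
            nn.BatchNorm2d(ch * 2), nn.ReLU(True),
            nn.ConvTranspose2d(ch * 2, ch, 4, 2, 1, bias=False),      # -> (ch, 32, 32)
            nn.BatchNorm2d(ch), nn.ReLU(True),
            nn.ConvTranspose2d(ch, 3, 4, 2, 1, bias=False),           # -> (3, 64, 64)
            nn.Tanh(),  # final layer uses tanh, matching the [-1, 1] input range
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    # the canonical layout skips batch norm on the first layer and the output layer
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 4, 2, 1, bias=False),           # (3, 64, 64) -> (ch, 32, 32)
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch * 2, 4, 2, 1, bias=False),      # -> (ch*2, 16, 16)
            nn.BatchNorm2d(ch * 2), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch * 2, ch * 4, 4, 2, 1, bias=False),  # -> (ch*4, 8, 8)
            nn.BatchNorm2d(ch * 4), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch * 4, ch * 8, 4, 2, 1, bias=False),  # -> (ch*8, 4, 4)
            nn.BatchNorm2d(ch * 8), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch * 8, 1, 4, 1, 0, bias=False),       # -> (1, 1, 1) real/fake logit
        )

    def forward(self, x):
        return self.net(x).view(-1)

def weights_init(m):
    # the DCGAN paper draws weights from a zero-centered Normal with std 0.02
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, 0.0, 0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, 1.0, 0.02)
        nn.init.zeros_(m.bias)
```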
On top of that, since early experiments with this configuration showed signs of mode collapse and non-convergence, the following regularization techniques are also adopted in the code:

- Spectral normalization
- One-sided label smoothing
- Gaussian noise
- Spatial dropout
- Separate mini-batches
- Different learning rates for the generator and discriminator (a hedged sketch of some of these tricks follows the plots below)

The following plots show the architecture components with the output shape of every convolutional layer:

Generator network.

Discriminator network.
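To make the list above concrete, here is a sketch of how some of these tricks are typically wired up in PyTorch; the noise standard deviation, dropout probability, and smoothing value are assumptions, not values taken from the notebook:

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise to discriminator inputs during training only."""
    def __init__(self, std=0.05):  # assumed value
        super().__init__()
        self.std = std

    def forward(self, x):
        return x + torch.randn_like(x) * self.std if self.training else x

# Spectral normalization wraps a layer's weight to constrain its Lipschitz constant:
conv = spectral_norm(nn.Conv2d(3, 64, 4, 2, 1, bias=False))

# Spatial dropout zeroes whole feature maps rather than individual activations:
drop = nn.Dropout2d(p=0.25)  # assumed probability

# One-sided label smoothing: real targets become 0.9 instead of 1.0,
# while fake targets stay at 0.0:
real_labels = torch.full((128,), 0.9)
fake_labels = torch.zeros(128)
```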
### Training Example
To provide an example of the model's capabilities, the DCGAN described above and coded in the notebook was trained for 60 epochs on a Tesla T4 cloud GPU, with a batch size of 128, a latent vector of size 128, and an image size of 64 × 64. No extra data augmentation was applied to the training data beyond rescaling the pixel range and applying a random horizontal flip.
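A sketch of what one training iteration with this setup might look like, reusing `Generator`, `Discriminator`, and `loader` from the earlier sketches; the two learning rates are placeholders, since the notebook's exact values are not stated here:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
netG, netD = Generator().to(device), Discriminator().to(device)
criterion = nn.BCEWithLogitsLoss()

# Different learning rates for the two networks (assumed values):
optD = torch.optim.Adam(netD.parameters(), lr=1e-4, betas=(0.5, 0.999))
optG = torch.optim.Adam(netG.parameters(), lr=2e-4, betas=(0.5, 0.999))

for epoch in range(60):
    for real, _ in loader:
        real = real.to(device)
        b = real.size(0)
        noise = torch.randn(b, 128, 1, 1, device=device)
        fake = netG(noise)

        # Discriminator step: real and fake images processed as separate
        # mini-batches, with one-sided label smoothing on the real targets.
        optD.zero_grad()
        loss_real = criterion(netD(real), torch.full((b,), 0.9, device=device))
        loss_fake = criterion(netD(fake.detach()), torch.zeros(b, device=device))
        (loss_real + loss_fake).backward()
        optD.step()

        # Generator step: presumably the "modified" BCE is the non-saturating
        # objective, i.e. maximize log D(G(z)) by labelling fakes as real.
        optG.zero_grad()
        loss_g = criterion(netD(fake), torch.ones(b, device=device))
        loss_g.backward()
        optG.step()
```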
The images below compare generated outputs after the final epoch with a sample of real training data:
Real data.
Generated data.

While still not perfect and with room for improvement, the results are impressive considering the small size of the model (below 25 million parameters), the low resolution of the images, and the limited training data and resources available. Moreover, proper regularization avoided the issues of non-convergence and mode collapse.
## References
- Radford, Metz, and Chintala, "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks": [https://arxiv.org/abs/1511.06434](https://arxiv.org/abs/1511.06434)
- Salimans et al., "Improved Techniques for Training GANs": [https://arxiv.org/abs/1606.03498](https://arxiv.org/abs/1606.03498)
- Arjovsky and Bottou, "Towards Principled Methods for Training Generative Adversarial Networks": [https://arxiv.org/abs/1701.04862](https://arxiv.org/abs/1701.04862)
- Liu et al., "Deep Learning Face Attributes in the Wild" (the CelebA paper): [https://arxiv.org/abs/1411.7766](https://arxiv.org/abs/1411.7766)