Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lord3008/gans
See the readme file for details, code etc.
https://github.com/lord3008/gans
Last synced: 4 days ago
JSON representation
See the readme file for details, code etc.
- Host: GitHub
- URL: https://github.com/lord3008/gans
- Owner: Lord3008
- Created: 2024-07-03T17:09:00.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-07-06T15:14:38.000Z (5 months ago)
- Last Synced: 2024-07-06T16:24:52.082Z (5 months ago)
- Homepage:
- Size: 13.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# GANs
Generative Adversarial Networks (GANs) are a class of machine learning frameworks invented by Ian Goodfellow and his colleagues in 2014. They are composed of two neural networks, a generator and a discriminator, that contest with each other in a game-theoretic framework.
1. **Generator**: This network generates new data instances that resemble the training data. It starts with random noise and tries to transform this noise into data that can fool the discriminator.
2. **Discriminator**: This network evaluates the authenticity of data. It distinguishes between real data from the training set and fake data produced by the generator.
The training process involves both networks improving together:
- The generator improves its capability to create more realistic data as it receives feedback from the discriminator on how to improve.
- The discriminator becomes better at identifying real vs. fake data.This adversarial process continues until the discriminator can no longer distinguish between real and generated data, indicating the generator is producing high-quality data.
**Applications of GANs**:
1. **Image Generation**: GANs can create realistic images from textual descriptions (e.g., generating images of people, animals, or objects that do not exist).
2. **Video Generation**: GANs can be used to generate video frames, leading to applications in video synthesis and editing.
3. **Data Augmentation**: They are used to generate additional training data for other machine learning models, improving performance where data is limited.
4. **Super-Resolution**: GANs can enhance the resolution of images, producing high-quality images from low-resolution inputs.
5. **Style Transfer**: GANs can transfer artistic styles from one image to another.**Challenges with GANs**:
1. **Training Instability**: The training process can be unstable, with the generator and discriminator not converging properly.
2. **Mode Collapse**: The generator might produce a limited variety of outputs, ignoring the diversity of the real data.
3. **Evaluation**: Evaluating the performance of GANs can be difficult since traditional metrics are not always applicable.Despite these challenges, GANs represent a powerful and versatile tool in the field of generative models, revolutionizing tasks involving data generation and manipulation.
Creating a Generative Adversarial Network (GAN) involves defining two models: a generator and a discriminator. The generator creates new data instances, while the discriminator evaluates them. Both models are trained together in a way that the generator improves its ability to create realistic data, while the discriminator gets better at distinguishing between real and fake data.
Here's a simple example of constructing a GAN using TensorFlow and Keras:
### Step-by-Step Code Example
1. **Install Dependencies**: Ensure you have TensorFlow and Keras installed. You can install them using pip if you haven't already.
```bash
pip install tensorflow
```2. **Import Libraries**:
```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Reshape, LeakyReLU, BatchNormalization
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
import numpy as np
```3. **Define the Generator**:
```python
def build_generator():
model = Sequential()
model.add(Dense(256, input_dim=100))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(1024))
model.add(LeakyReLU(alpha=0.2))
model.add(BatchNormalization(momentum=0.8))
model.add(Dense(28*28*1, activation='tanh'))
model.add(Reshape((28, 28, 1)))
return model
```4. **Define the Discriminator**:
```python
def build_discriminator():
model = Sequential()
model.add(Flatten(input_shape=(28, 28, 1)))
model.add(Dense(512))
model.add(LeakyReLU(alpha=0.2))
model.add(Dense(256))
model.add(LeakyReLU(alpha=0.2))
model.add(Dense(1, activation='sigmoid'))
return model
```5. **Compile the Models**:
```python
def compile_models(generator, discriminator):
optimizer = Adam(0.0002, 0.5)discriminator.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
discriminator.trainable = False
z = tf.keras.Input(shape=(100,))
img = generator(z)
validity = discriminator(img)combined = tf.keras.Model(z, validity)
combined.compile(loss='binary_crossentropy', optimizer=optimizer)return combined
```6. **Training the GAN**:
```python
def train(generator, discriminator, combined, epochs, batch_size=128, save_interval=50):
(X_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
X_train = X_train / 127.5 - 1.0
X_train = np.expand_dims(X_train, axis=3)valid = np.ones((batch_size, 1))
fake = np.zeros((batch_size, 1))for epoch in range(epochs):
idx = np.random.randint(0, X_train.shape[0], batch_size)
imgs = X_train[idx]noise = np.random.normal(0, 1, (batch_size, 100))
gen_imgs = generator.predict(noise)d_loss_real = discriminator.train_on_batch(imgs, valid)
d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)noise = np.random.normal(0, 1, (batch_size, 100))
g_loss = combined.train_on_batch(noise, valid)print(f"{epoch} [D loss: {d_loss[0]} | D accuracy: {100 * d_loss[1]}] [G loss: {g_loss}]")
if epoch % save_interval == 0:
save_imgs(generator, epoch)def save_imgs(generator, epoch, num_imgs=10):
noise = np.random.normal(0, 1, (num_imgs, 100))
gen_imgs = generator.predict(noise)gen_imgs = 0.5 * gen_imgs + 0.5
import matplotlib.pyplot as pltplt.figure(figsize=(10, 1))
for i in range(num_imgs):
plt.subplot(1, num_imgs, i + 1)
plt.imshow(gen_imgs[i, :, :, 0], cmap='gray')
plt.axis('off')
plt.show()
```7. **Main Script**:
```python
if __name__ == "__main__":
generator = build_generator()
discriminator = build_discriminator()
combined = compile_models(generator, discriminator)
train(generator, discriminator, combined, epochs=10000, batch_size=64, save_interval=1000)
```This example demonstrates a basic GAN trained on the MNIST dataset. The generator learns to create images of handwritten digits, while the discriminator learns to distinguish between real and generated images. Adjusting the architecture, parameters, and training data can extend this framework to more complex applications.
Generative Adversarial Networks (GANs) have evolved significantly since their inception, leading to various types designed to address specific challenges or to achieve different objectives. Here are some of the main types of GANs:
### 1. **Vanilla GAN (Basic GAN)**
The original GAN proposed by Ian Goodfellow in 2014 consists of two neural networks: a generator and a discriminator, which compete against each other. The generator tries to produce realistic data, while the discriminator attempts to distinguish between real and fake data.### 2. **Conditional GAN (cGAN)**
Conditional GANs add extra information to both the generator and discriminator, conditioning the generation process on this additional data. This can be labels or other attributes. This allows for more controlled generation of outputs.
- **Example**: Generating images of specific classes (e.g., digits from the MNIST dataset).### 3. **Deep Convolutional GAN (DCGAN)**
DCGANs employ convolutional layers in the generator and discriminator instead of fully connected layers. This architecture is particularly effective for image data, as it can capture spatial hierarchies and features more efficiently.
- **Key Features**: Use of strided convolutions, batch normalization, and ReLU activations.### 4. **Wasserstein GAN (WGAN)**
Wasserstein GANs modify the loss function to improve training stability. Instead of using the Jensen-Shannon divergence, they use the Wasserstein distance (Earth Mover’s Distance), which provides smoother gradients and alleviates the problem of vanishing gradients.
- **Variant**: WGAN-GP (Wasserstein GAN with Gradient Penalty) introduces a gradient penalty term to enforce the Lipschitz constraint more effectively.### 5. **Least Squares GAN (LSGAN)**
LSGANs use a least-squares objective function for the discriminator, which helps in stabilizing training and generating higher quality images. This loss function penalizes samples based on their distance from the decision boundary, providing more stable gradients.### 6. **Pix2Pix**
Pix2Pix is an example of an image-to-image translation GAN, where the goal is to learn a mapping from input images to output images. It uses a conditional GAN framework to enforce the generated image to be conditioned on the input image.
- **Example**: Converting sketches to photos, day to night scenes.### 7. **CycleGAN**
CycleGANs enable image-to-image translation without needing paired examples. They introduce a cycle consistency loss to ensure that translating an image to another domain and back results in the original image.
- **Example**: Translating photos of horses to zebras and vice versa.### 8. **Progressive GAN (ProGAN)**
Progressive GANs train by gradually increasing the resolution of the generated images. The model starts with low-resolution images and progressively adds layers to increase the resolution, which helps in stabilizing the training process and producing high-quality images.
- **Example**: Generating high-resolution images of faces.### 9. **StyleGAN**
StyleGANs introduce a novel generator architecture that allows for more control over the generated images' style at different levels. It separates the generation process into different stages, enabling style mixing and more detailed control over the attributes of generated images.
- **Key Feature**: Style transfer capabilities and fine-grained control over image features.### 10. **BigGAN**
BigGANs scale up the GAN architecture and training process using large models and large batches, which significantly improves the quality of generated images. They are particularly effective at generating high-resolution and diverse images.
- **Key Features**: Use of large-scale datasets and models, increased batch sizes.### 11. **SRGAN (Super-Resolution GAN)**
SRGANs are designed for image super-resolution tasks. They aim to generate high-resolution images from low-resolution inputs, enhancing the details and quality of the images.
- **Application**: Enhancing the resolution of photos and videos.### 12. **InfoGAN**
InfoGANs extend GANs by maximizing the mutual information between a subset of the latent variables and the observations, enabling the model to learn interpretable and disentangled representations in an unsupervised manner.
- **Key Feature**: Learning meaningful and interpretable features.### Summary
Each type of GAN addresses different challenges and applications, ranging from improving training stability to enhancing control over the generated data. These variations have significantly expanded the applicability and effectiveness of GANs in various fields such as image generation, style transfer, and data augmentation.