{"id":13936465,"url":"https://github.com/khanrc/tf.gans-comparison","last_synced_at":"2025-04-06T09:10:36.352Z","repository":{"id":67212314,"uuid":"102315245","full_name":"khanrc/tf.gans-comparison","owner":"khanrc","description":"Implementations of (theoretical) generative adversarial networks and comparison without cherry-picking","archived":false,"fork":false,"pushed_at":"2018-03-09T16:52:59.000Z","size":8099,"stargazers_count":466,"open_issues_count":3,"forks_count":73,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-03-30T08:11:12.487Z","etag":null,"topics":["began","celeba","dcgan","dragan","ebgan","gan","gans","lsgan","lsun","tensorflow","wgan","wgan-gp"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/khanrc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-04T03:31:59.000Z","updated_at":"2024-12-11T11:08:11.000Z","dependencies_parsed_at":"2023-02-21T07:45:49.381Z","dependency_job_id":null,"html_url":"https://github.com/khanrc/tf.gans-comparison","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khanrc%2Ftf.gans-comparison","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khanrc%2Ftf.gans-comparison/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khanrc%2Ftf.gans-comparison/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khanrc
%2Ftf.gans-comparison/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/khanrc","download_url":"https://codeload.github.com/khanrc/tf.gans-comparison/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247457803,"owners_count":20941906,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["began","celeba","dcgan","dragan","ebgan","gan","gans","lsgan","lsun","tensorflow","wgan","wgan-gp"],"created_at":"2024-08-07T23:02:41.829Z","updated_at":"2025-04-06T09:10:36.334Z","avatar_url":"https://github.com/khanrc.png","language":"Python","funding_links":[],"categories":["Python","GAN, VAE"],"sub_categories":["Korean"],"readme":"# GANs comparison without cherry-picking\n\nImplementations of some theoretical generative adversarial nets: DCGAN, EBGAN, LSGAN, WGAN, WGAN-GP, BEGAN, DRAGAN and CoulombGAN. 
\n\nI implemented each model with the same structure as in its paper and compared them on the CelebA and LSUN datasets without cherry-picking.\n\n\n## Table of Contents\n\n* [Features](#features)\n* [Models](#models)\n* [Dataset](#dataset)\n   * [CelebA](#celeba)\n   * [LSUN](#lsun-bedroom)\n* [Results](#results)\n   * [DCGAN](#dcgan)\n   * [EBGAN](#ebgan)\n   * [LSGAN](#lsgan)\n   * [WGAN](#wgan)\n   * [WGAN-GP](#wgan-gp)\n   * [BEGAN](#began)\n   * [DRAGAN](#dragan)\n   * [CoulombGAN](#coulombgan)\n* [Conclusion](#conclusion)\n* [Usage](#usage)\n   * [Requirements](#requirements)\n* [Similar works](#similar-works)\n\n\n## Features\n\n- Model architectures are the same as those proposed in each paper\n- The models were not tuned much, so the results can be improved\n- Well-structured (this was my goal at the start, but I don't know whether it succeeded!)\n    - TensorFlow queue runners are used for the input pipeline\n    - Single trainer (and single evaluator) - multi model structure\n    - Training logs and configuration are recorded in TensorBoard\n\n## Models\n\n- DCGAN\n- LSGAN\n- WGAN\n- WGAN-GP\n- EBGAN\n- BEGAN\n- DRAGAN\n- CoulombGAN\n\nThe family of conditional GANs is excluded (CGAN, acGAN, and so on). \n\n## Dataset \n\n### CelebA\n\nhttp://mmlab.ie.cuhk.edu.hk/projects/CelebA.html\n\n- All experiments were performed on the 64x64 CelebA dataset\n- The dataset has 202599 images\n- 1 epoch consists of about 1.58k iterations for batch size 128\n\n### LSUN bedroom\n\nhttp://lsun.cs.princeton.edu/2017/\n\n- The dataset has 3033042 images\n- 1 epoch consists of about 23.7k iterations for batch size 128\n\nThis dataset is provided in [LMDB](http://www.lmdb.tech/) format. https://github.com/fyu/lsun provides documentation and demo code for using it. 
\n\n## Results\n\n- I implemented each model as proposed in its paper, but ignored some details (or the paper did not describe them)\n  - Granted, small details can make a big difference in the results because GAN training is very unstable\n  - So if you get better results, let me know the settings 🙂\n- Default batch_size=128 and z_dim=100 (from DCGAN)\n\n### DCGAN\n\nRadford, Alec, Luke Metz, and Soumith Chintala. \"Unsupervised representation learning with deep convolutional generative adversarial networks.\" arXiv preprint arXiv:1511.06434 (2015).\n\n- Relatively simple networks\n- The learning rate for the discriminator (D_lr) is 2e-4; for the generator (G_lr), both 2e-4 (proposed in the paper) and 1e-3 were tried\n\n|                G_lr=2e-4                 |                G_lr=1e-3                 |\n| :--------------------------------------: | :--------------------------------------: |\n|                   50k                    |                   30k                    |\n| ![dcgan.G2e-4.50k](assets/celeba/dcgan.G2e-4.50k.png) | ![dcgan.G1e-3.30k](assets/celeba/dcgan.G1e-3.30k.png) |\n\nThe second row (50k, 30k) shows the number of training iterations for each sample.\n\nThe higher generator learning rate (1e-3) gave better results. In this case, however, the generator sometimes collapsed due to the large learning rate. Lowering both learning rates may bring stability, as in https://ajolicoeur.wordpress.com/cats/, which suggests D_lr=5e-5 and G_lr=2e-4.\n\n|                   LSUN                   |\n| :--------------------------------------: |\n|                   100k                   |\n| ![dcgan.100k](assets/lsun/dcgan.100k.png) |\n\n\n\n### EBGAN\n\nZhao, Junbo, Michael Mathieu, and Yann LeCun. 
\"Energy-based generative adversarial network.\" arXiv preprint arXiv:1609.03126 (2016).\n\n- I like the energy concept, so this paper is very interesting to me :)\n  - But there is criticism: [Are Energy-Based GANs any more energy-based than normal GANs?](http://www.inference.vc/are-energy-based-gans-actually-energy-based/)\n- Anyway, the energy concept and the autoencoder-based loss function are impressive, and the results are also fine\n- But I have a question about the Pulling-away Term (PT), which theoretically prevents mode collapse. This is the same idea as minibatch discrimination (T. Salimans et al).\n\n\n|             pt weight = 0.1              |                No pt loss                |\n| :--------------------------------------: | :--------------------------------------: |\n|                   30k                    |                   30k                    |\n| ![ebgan.pt.30k](assets/celeba/ebgan.pt.30k.png) | ![ebgan.nopt.30k](assets/celeba/ebgan.nopt.30k.png) |\n\nThe model using PT generates slightly better-looking samples. However, it is not clear from these results whether PT prevents mode collapse. Furthermore, I could not tell which setting is better from repeated experiments.\n\n\n|             pt weight = 0.1              |                No pt loss                |\n| :--------------------------------------: | :--------------------------------------: |\n| ![ebgan.pt.graph](assets/celeba/ebgan.pt.graph.png) | ![ebgan.nopt.graph](assets/celeba/ebgan.nopt.graph.png) |\n\npt_loss decreases a little faster on the left, which used pt_weight=0.1, but there is no big difference, and at the end the right, which used no pt_loss, even showed a lower pt_loss. 
So I wonder: does the PT loss really prevent mode collapse as described in the paper?\n\n|                  LSUN                   |\n| :-------------------------------------: |\n|                   80k                   |\n| ![ebgan.80k](assets/lsun/ebgan.80k.png) |\n\n### LSGAN\n\nMao, Xudong, et al. \"Least squares generative adversarial networks.\" arXiv preprint arXiv:1611.04076 (2016).\n\n- Unusually, LSGAN uses a large latent space dimension (z_dim=1024)\n- But in my experiments, z_dim=100 gave better results than z_dim=1024, which was used in the paper\n\n|                z_dim=100                 |                z_dim=1024                |\n| :--------------------------------------: | :--------------------------------------: |\n|                   30k                    |                   30k                    |\n| ![lsgan.100.30k](assets/celeba/lsgan.100.30k.png) | ![lsgan.1024.30k](assets/celeba/lsgan.1024.30k.png) |\n\n|                   LSUN                   |\n| :--------------------------------------: |\n|                   150k                   |\n| ![lsgan.150k](assets/lsun/lsgan.150k.png) |\n\n### WGAN\n\nArjovsky, Martin, Soumith Chintala, and Léon Bottou. 
\"Wasserstein gan.\" arXiv preprint arXiv:1701.07875 (2017).\n\n- The samples from WGAN are not that impressive - compared to the very impressive theory\n- Also, no specific network structure was proposed, so the DCGAN architecture was used for the experiments\n- In the [author's implementation](https://github.com/martinarjovsky/WassersteinGAN), a higher n_critic is used in the early stage of training and every 500 iterations\n\n|                   30k                   |                W distance                |\n| :-------------------------------------: | :--------------------------------------: |\n| ![wgan.30k](assets/celeba/wgan.30k.png) | ![wgan.w_dist](assets/celeba/wgan.w_dist.png) |\n\n|                  LSUN                   |\n| :-------------------------------------: |\n|                  230k                   |\n| ![wgan.230k](assets/lsun/wgan.230k.png) |\n\n### WGAN-GP\n\nGulrajani, Ishaan, et al. \"Improved training of wasserstein gans.\" arXiv preprint arXiv:1704.00028 (2017).\n\n- I tried two network architectures: the DCGAN architecture and the ResNet architecture from appendix C\n- The ResNet architecture is more complicated and performs better than the DCGAN architecture\n- Interestingly, the visual quality of samples improves very quickly (ResNet WGAN-GP has its best samples at 7k iterations) and gets worse with continued training\n- According to DRAGAN, the constraints of WGAN are too restrictive to learn a good generator\n\n|            DCGAN architecture            |           ResNet architecture            |\n| :--------------------------------------: | :--------------------------------------: |\n|                   30k                    |           7k, batch size = 64            |\n| ![wgan-gp.dcgan.30k](assets/celeba/wgan-gp.dcgan.30k.png) | ![wgan-gp.good.7k](assets/celeba/wgan-gp.good.7k.png) |\n\n|                   LSUN                   |\n| :--------------------------------------: |\n|        100k, ResNet architecture         |\n| 
![wgan-gp.150k](assets/lsun/wgan-gp.150k.png) |\n\n#### Face collapse phenomenon\n\nWGAN-GP collapsed more than the other models as the number of iterations increased.\n\n**DCGAN architecture**\n\n|                   10k                    |                   20k                    |                   30k                    |\n| :--------------------------------------: | :--------------------------------------: | :--------------------------------------: |\n| ![wgan-gp.dcgan.10k](assets/celeba/wgan-gp.dcgan.10k.png) | ![wgan-gp.dcgan.20k](assets/celeba/wgan-gp.dcgan.20k.png) | ![wgan-gp.dcgan.30k](assets/celeba/wgan-gp.dcgan.30k.png) |\n\n\n**ResNet architecture**\n\nThe ResNet architecture produced its best-looking samples very early, at around 7k iterations by my criteria. This may be due to the residual architecture.\n\nbatch_size=64.\n\n|                    5k                    |                    7k                    |                   10k                    |                   15k                    |\n| :--------------------------------------: | :--------------------------------------: | :--------------------------------------: | :--------------------------------------: |\n| ![wgan-gp.good.5k](assets/celeba/wgan-gp.good.5k.png) | ![wgan-gp.good.7k](assets/celeba/wgan-gp.good.7k.png) | ![wgan-gp.good.10k](assets/celeba/wgan-gp.good.10k.png) | ![wgan-gp.good.15k](assets/celeba/wgan-gp.good.15k.png) |\n|                   20k                    |                   25k                    |                   30k                    |                   40k                    |\n| ![wgan-gp.good.20k](assets/celeba/wgan-gp.good.20k.png) | ![wgan-gp.good.25k](assets/celeba/wgan-gp.good.25k.png) | ![wgan-gp.good.30k](assets/celeba/wgan-gp.good.30k.png) | ![wgan-gp.good.40k](assets/celeba/wgan-gp.good.40k.png) |\n\nRegardless of the face collapse phenomenon, the Wasserstein distance decreased steadily. 
This is presumably because the critic (discriminator) network failed to find the K-Lipschitz function attaining the supremum.\n\n|            DCGAN architecture            |           ResNet architecture            |\n| :--------------------------------------: | :--------------------------------------: |\n| ![wgan-gp.dcgan.w_dist](assets/celeba/wgan-gp.dcgan.w_dist.png) | ![wgan-gp.good.w_dist](assets/celeba/wgan-gp.good.w_dist.png) |\n| ![wgan-gp.dcgan.w_dist.expand](assets/celeba/wgan-gp.dcgan.w_dist.expand.png) | ![wgan-gp.good.w_dist.expand](assets/celeba/wgan-gp.good.w_dist.expand.png) |\n\nThe plots in the last row of the table are just expanded versions of the plots in the second row.\n\nIt is interesting that W_dist \u003c 0 at the end of training. This indicates that E[fake] \u003e E[real]; from the viewpoint of the original GAN, it means the generator dominates the discriminator. \n\n### BEGAN\n\nBerthelot, David, Tom Schumm, and Luke Metz. \"Began: Boundary equilibrium generative adversarial networks.\" arXiv preprint arXiv:1703.10717 (2017).\n\n- As far as I know, the model that generates samples with the best visual quality\n- It also showed the best performance in this project\n  - Even though the optional improvements were not implemented (section 3.5.1 in the paper)\n- However, the samples generated by BEGAN have a slightly different feel from those of other models - details seem to disappear.\n- So I wonder what the results would be for different datasets\n\nbatch_size=16, z_dim=64, gamma=0.5.\n\n|                   30k                    |                   50k                    |                   75k                    |\n| :--------------------------------------: | :--------------------------------------: | :--------------------------------------: |\n| ![began.30k](assets/celeba/began.30k.png) | ![began.50k](assets/celeba/began.50k.png) | ![began.75k](assets/celeba/began.75k.png) |\n\n|         Convergence measure M         |\n| 
:-----------------------------------: |\n| ![began.M](assets/celeba/began.M.png) |\n\nI also tried to reduce speck-like artifacts as suggested in [Heumi/BEGAN-tensorflow](https://github.com/Heumi/BEGAN-tensorflow/), but they did not go away. \n\n\u003c!-- #### Speck-like artifacts phenomenon\n\nAs you can see above results, the samples of BEGAN has speckle artifacts. It can be reduced by adjusting gamma.\n\n| gamma=0.3 | gam  |\n| --------- | ---- |\n|           |      | --\u003e\n\n\n\nBEGAN works terribly on the LSUN dataset. Not only was severe mode collapse observed, but the generated images were also unrealistic.\n\n|                   LSUN                   |                   LSUN                   |\n| :--------------------------------------: | :--------------------------------------: |\n|                   100k                   |                   150k                   |\n| ![began.100k](assets/lsun/began.100k.png) | ![began.150k](assets/lsun/began.150k.png) |\n|                   200k                   |                   250k                   |\n| ![began.200k](assets/lsun/began.200k.png) | ![began.250k](assets/lsun/began.250k.png) |\n\n\n### DRAGAN\n\nKodali, Naveen, et al. \"How to Train Your DRAGAN.\" arXiv preprint arXiv:1705.07215 (2017).\n\n- Unlike other papers, DRAGAN is motivated by game theory as a way to improve GAN performance\n- This game-theoretic approach is unique and interesting\n- But, IMHO, there is not much real contribution. The algorithm is similar to WGAN-GP\n\n|            DCGAN architecture            |\n| :--------------------------------------: |\n|                   120k                   |\n| ![dragan.30k](assets/celeba/dragan.fixed.120k.png) |\n\nThe original paper has some bugs. One of them is that the image x is perturbed only in the positive direction. 
I applied two-sided perturbation, as the author acknowledged this bug on [GitHub](https://github.com/kodalinaveen3/DRAGAN).\n\n|                   LSUN                   |\n| :--------------------------------------: |\n|                   200k                   |\n| ![dragan.200k](assets/lsun/dragan.200k.png) |\n\n### CoulombGAN\n\nUnterthiner, Thomas, et al. \"Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields.\" arXiv preprint arXiv:1708.08819 (2017).\n\n- CoulombGAN also has a very interesting perspective - \"Coulomb potential\".\n- It is very interesting, but I don't know whether it is a GAN.\n- CoulombGAN tries to solve the diversity problem (mode collapse)\n\n\nG_lr=5e-4, D_lr=25e-5, z_dim=32.\n\n|            DCGAN architecture            |\n| :--------------------------------------: |\n|                   200k                   |\n| ![coulombgan.200k](assets/celeba/coulombgan.200k.png) |\n\nThe disadvantage of this model is that it takes a very long time to train despite the simplicity of the network architecture. Further, like the original GAN, it has no convergence measure. I thought that the potentials of fake samples might serve as a convergence measure, but they did not.\n\n\n\n\u003c!--## Conclusion\n\n- BEGAN showed the best performance for CelebA\n  - But it works terrible for LSUN dataset\n  - I wonder if it works great only for face datasets and why\n- WGAN-GP showed the best performance for LSUN\n- BEGAN and WGAN-GP have the most complex network structure\n- It is difficult to rank models except BEGAN due to the lack of quantitative measure. 
The visual quality of generated samples from each model seemed similar.\n- Conversely speaking, there have been a lot of GANs since DCGAN, but there is not a lot of significant improvement in visual quality 🤔🤔\n  --\u003e\n\n## Usage\n\nDownload the CelebA and LSUN datasets:\n\n```\n$ python download.py celebA\n$ python download.py lsun\n```\n\nConvert images to tfrecords format:   \nThe conversion options are hard-coded, so be sure to modify them before running `convert.py`. In particular, note that the LSUN dataset is provided in LMDB format.\n\n```\n$ python convert.py\n```\n\nTrain:   \nIf you want to change the settings of each model, you must also modify the code directly.\n\n```\n$ python train.py --help\nusage: train.py [-h] [--num_epochs NUM_EPOCHS] [--batch_size BATCH_SIZE]\n                [--num_threads NUM_THREADS] --model MODEL [--name NAME]\n                --dataset DATASET [--ckpt_step CKPT_STEP] [--renew]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --num_epochs NUM_EPOCHS\n                        default: 20\n  --batch_size BATCH_SIZE\n                        default: 128\n  --num_threads NUM_THREADS\n                        # of data read threads (default: 4)\n  --model MODEL         DCGAN / LSGAN / WGAN / WGAN-GP / EBGAN / BEGAN /\n                        DRAGAN / CoulombGAN\n  --name NAME           default: name=model\n  --dataset DATASET, -D DATASET\n                        CelebA / LSUN\n  --ckpt_step CKPT_STEP\n                        # of steps for saving checkpoint (default: 5000)\n  --renew               train model from scratch - clean saved checkpoints and\n                        summaries\n```\n\nMonitor through TensorBoard:\n\n```\n$ tensorboard --logdir=summary/dataset/name\n```\n\nEvaluate (generate fake samples):\n\n```\n$ python eval.py --help\nusage: eval.py [-h] --model MODEL [--name NAME] --dataset DATASET\n               [--sample_size SAMPLE_SIZE]\n\noptional arguments:\n  -h, --help            show this help message and 
exit\n  --model MODEL         DCGAN / LSGAN / WGAN / WGAN-GP / EBGAN / BEGAN /\n                        DRAGAN / CoulombGAN\n  --name NAME           default: name=model\n  --dataset DATASET, -D DATASET\n                        CelebA / LSUN\n  --sample_size SAMPLE_SIZE, -N SAMPLE_SIZE\n                        # of samples. It should be a square number. (default:\n                        16)\n```\n\n\n### Requirements\n\n- python 2.7\n- tensorflow \u003e= 1.2 (verified on 1.2 and 1.3)\n- tqdm\n- (optional) pynvml - for automatic gpu selection\n\n## Similar works\n\n- https://ajolicoeur.wordpress.com/cats/\n- [wiseodd/generative-models](https://github.com/wiseodd/generative-models)\n- [hwalsuklee/tensorflow-generative-model-collections](https://github.com/hwalsuklee/tensorflow-generative-model-collections)\n- [sanghoon/tf-exercise-gan](https://github.com/sanghoon/tf-exercise-gan)\n- [YadiraF/GAN_Theories](https://github.com/YadiraF/GAN_Theories)\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhanrc%2Ftf.gans-comparison","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkhanrc%2Ftf.gans-comparison","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhanrc%2Ftf.gans-comparison/lists"}