(this list is no longer maintained, and I am not sure how relevant it is in 2020)

# How to Train a GAN? Tips and tricks to make GANs work

While research in Generative Adversarial Networks (GANs) continues to improve the fundamental stability of these models, we still rely on a number of tricks to train them and keep them stable day to day.

Here is a summary of some of those tricks.

[Here's a link to the authors of this document](#authors)

If you find a trick that is particularly useful in practice, please open a Pull Request to add it to the document. If we find it reasonable and verified, we will merge it in.
## 1: Normalize the inputs

- Normalize the images between -1 and 1
- Use Tanh as the last layer of the generator output

## 2: A modified loss function

In GAN papers, the loss function to optimize G is `min log(1 - D)`, but in practice folks use `max log D`
  - because the first formulation has vanishing gradients early on
  - Goodfellow et al. (2014)

In practice, this also works well:
  - Flip labels when training the generator: real = fake, fake = real

## 3: Use a spherical Z
- Don't sample from a uniform distribution

![cube](images/cube.png "Cube")

- Sample from a Gaussian distribution

![sphere](images/sphere.png "Sphere")

- When doing interpolations, interpolate along a great circle rather than a straight line from point A to point B
- Tom White's [Sampling Generative Networks](https://arxiv.org/abs/1609.04468) ref code https://github.com/dribnet/plat has more details

## 4: BatchNorm

- Construct different mini-batches for real and fake images, i.e. each mini-batch should contain only real images or only generated images.
- When batchnorm is not an option, use instance normalization (for each sample, subtract the mean and divide by the standard deviation).

![batchmix](images/batchmix.png "BatchMix")

## 5: Avoid Sparse Gradients: ReLU, MaxPool
- The stability of the GAN game suffers if you have sparse gradients
- LeakyReLU = good (in both G and D)
- For downsampling, use: average pooling, Conv2d + stride
- For upsampling, use: PixelShuffle, ConvTranspose2d + stride
  - PixelShuffle: https://arxiv.org/abs/1609.05158
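A quick numerical sketch of why sparse gradients hurt: with a ReLU, roughly half of zero-centered pre-activations get exactly zero gradient, while a LeakyReLU always passes some gradient through. NumPy stands in here for whatever framework you use, and the 0.2 slope is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)  # pre-activations, roughly zero-centered

# Derivative of each activation at x
relu_grad = np.where(x > 0, 1.0, 0.0)   # exactly zero for every negative input
leaky_grad = np.where(x > 0, 1.0, 0.2)  # small but nonzero slope when x <= 0

print("dead ReLU gradients:     ", np.mean(relu_grad == 0))   # ~0.5
print("dead LeakyReLU gradients:", np.mean(leaky_grad == 0))  # 0.0
```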
## 6: Use Soft and Noisy Labels

- Label smoothing: if you have two target labels, Real=1 and Fake=0, then for each incoming sample, if it is real, replace the label with a random number between 0.7 and 1.2, and if it is fake, replace the label with a random number between 0.0 and 0.3 (for example).
  - Salimans et al. 2016
- Make the labels noisy for the discriminator: occasionally flip the labels when training the discriminator

## 7: DCGAN / Hybrid Models

- Use DCGAN when you can. It works!
- If you can't use DCGANs and no model is stable, use a hybrid model: KL + GAN or VAE + GAN

## 8: Use stability tricks from RL

- Experience replay
  - Keep a replay buffer of past generations and occasionally show them
  - Keep checkpoints from the past of G and D and occasionally swap them out for a few iterations
- All stability tricks that work for deep deterministic policy gradients
- See Pfau & Vinyals (2016)

## 9: Use the ADAM Optimizer

- optim.Adam rules!
  - See Radford et al. 2015
- Use SGD for the discriminator and ADAM for the generator

## 10: Track failures early

- D loss goes to 0: failure mode
- Check the norms of the gradients: if they are over 100, things are screwing up
- When things are working, D loss has low variance and goes down over time, vs. having huge variance and spiking
- If the loss of the generator steadily decreases, it's fooling D with garbage (says Martin)

## 11: Don't balance loss via statistics (unless you have a good reason to)

- Don't try to find a (number of G / number of D) schedule to uncollapse training
- It's hard and we've all tried it.
- If you do try it, have a principled approach to it, rather than intuition

For example
```
while lossD > A:
  train D
while lossG > B:
  train G
```

## 12: If you have labels, use them

- If you have labels available, train the discriminator to also classify the samples: auxiliary GANs

## 13: Add noise to inputs, decay over time

- Add some artificial noise to the inputs of D (Arjovsky et al., Huszar, 2016)
  - http://www.inference.vc/instance-noise-a-trick-for-stabilising-gan-training/
  - https://openreview.net/forum?id=Hk4_qw5xe
- Add Gaussian noise to every layer of the generator (Zhao et al., EBGAN)
  - Improved GANs: the OpenAI code also has it (commented out)
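The instance-noise trick above can be sketched as a small helper that adds Gaussian noise to D's inputs and anneals it linearly to zero over training. This is a hypothetical `instance_noise` helper in NumPy; the initial sigma and the linear schedule are assumptions, not prescriptions from the references:

```python
import numpy as np

def instance_noise(images, step, total_steps, sigma0=0.1, rng=None):
    """Add Gaussian noise to the discriminator's inputs, decayed linearly to zero."""
    rng = rng if rng is not None else np.random.default_rng()
    sigma = sigma0 * max(0.0, 1.0 - step / total_steps)
    if sigma == 0.0:
        return images
    return images + sigma * rng.standard_normal(images.shape)

# By the end of training the noise has fully decayed, so D sees clean inputs again
batch = np.zeros((4, 32, 32))
assert np.array_equal(instance_noise(batch, step=1000, total_steps=1000), batch)
```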
## 14: [notsure] Train the discriminator more (sometimes)

- Especially when you have noise
- It is hard to find a schedule of number of D iterations vs. G iterations

## 15: [notsure] Batch Discrimination

- Mixed results

## 16: Discrete variables in Conditional GANs

- Use an Embedding layer
- Add as additional channels to images
- Keep the embedding dimensionality low and upsample to match the image channel size

## 17: Use Dropouts in G in both train and test phase

- Provide noise in the form of dropout (50%).
- Apply it on several layers of the generator at both training and test time
- https://arxiv.org/pdf/1611.07004v1.pdf

## Authors
- Soumith Chintala
- Emily Denton
- Martin Arjovsky
- Michael Mathieu
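As a worked example of tip 6, a minimal sketch of soft, occasionally flipped labels. The helper name, the 5% flip rate, and the use of NumPy are assumptions for illustration; the [0.7, 1.2] and [0.0, 0.3] ranges come from the tip itself:

```python
import numpy as np

def soft_noisy_labels(n, real, flip_p=0.05, rng=None):
    """Tip 6: soft labels (real in [0.7, 1.2], fake in [0.0, 0.3]), flipped with prob flip_p."""
    rng = rng if rng is not None else np.random.default_rng()
    labels = rng.uniform(0.7, 1.2, n) if real else rng.uniform(0.0, 0.3, n)
    # Occasionally swap a label into the opposite range (noisy labels for D)
    flip = rng.random(n) < flip_p
    opposite = rng.uniform(0.0, 0.3, n) if real else rng.uniform(0.7, 1.2, n)
    labels[flip] = opposite[flip]
    return labels

# Targets for a mini-batch of 64 real images, a few of them deliberately mislabeled
real_targets = soft_noisy_labels(64, real=True)
```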