{"id":28100718,"url":"https://github.com/mit-han-lab/anycost-gan","last_synced_at":"2025-05-13T18:38:29.281Z","repository":{"id":47456384,"uuid":"344362642","full_name":"mit-han-lab/anycost-gan","owner":"mit-han-lab","description":"[CVPR 2021] Anycost GANs for Interactive Image Synthesis and Editing","archived":false,"fork":false,"pushed_at":"2023-10-03T15:20:42.000Z","size":17526,"stargazers_count":760,"open_issues_count":5,"forks_count":97,"subscribers_count":24,"default_branch":"master","last_synced_at":"2023-11-07T21:36:44.758Z","etag":null,"topics":["computer-graphics","computer-vision","deep-learning","gan","gans","generative-adversarial-network","image-editing","image-generation","image-manipulation","pytorch","stylegan2"],"latest_commit_sha":null,"homepage":"https://hanlab.mit.edu/projects/anycost-gan/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mit-han-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-04T05:39:06.000Z","updated_at":"2023-11-07T06:38:00.000Z","dependencies_parsed_at":"2022-08-27T03:23:14.315Z","dependency_job_id":null,"html_url":"https://github.com/mit-han-lab/anycost-gan","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mit-han-lab%2Fanycost-gan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mit-han-lab%2Fanycost-gan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mit-han-lab%2Fanycost-gan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mit-han-l
ab%2Fanycost-gan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mit-han-lab","download_url":"https://codeload.github.com/mit-han-lab/anycost-gan/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254004845,"owners_count":21998138,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-graphics","computer-vision","deep-learning","gan","gans","generative-adversarial-network","image-editing","image-generation","image-manipulation","pytorch","stylegan2"],"created_at":"2025-05-13T18:38:28.635Z","updated_at":"2025-05-13T18:38:29.258Z","avatar_url":"https://github.com/mit-han-lab.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Anycost GAN\n\n### [video](https://youtu.be/_yEziPl9AkM) | [paper](https://arxiv.org/abs/2103.03243) | [website](https://hanlab18.mit.edu/projects/anycost-gan/) [![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mit-han-lab/anycost-gan/blob/master/notebooks/intro_colab.ipynb)\n\n[Anycost GANs for Interactive Image Synthesis and Editing](https://arxiv.org/abs/2103.03243)\n\n[Ji Lin](http://linji.me/), [Richard Zhang](https://richzhang.github.io/), Frieder Ganz, [Song Han](https://songhan.mit.edu/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/)\n\nMIT, Adobe Research, CMU\n\nIn CVPR 2021\n\n![flexible](https://hanlab18.mit.edu/projects/anycost-gan/images/flexible.gif)\n\nAnycost GAN generates consistent outputs under various computational budgets.\n\n\n\n## 
Demo\n\n\u003ca href=\"https://youtu.be/_yEziPl9AkM?t=90\"\u003e\u003cimg src='assets/figures/demo.gif' width=600\u003e\u003c/a\u003e\n\nHere, we can use the Anycost generator for **interactive image editing**. A full generator takes **~3s** to render an image, which is too slow for editing. With the Anycost generator, we can provide a visually similar preview at **5x faster speed**. After adjustment, we hit the \"Finalize\" button to synthesize the high-quality final output. Check [here](https://youtu.be/_yEziPl9AkM?t=90) for the full demo.\n\n\n\n## Overview\n\nAnycost generators can be run at *diverse computation costs* by using different *channel* and *resolution* configurations. Sub-generators achieve high output consistency compared to the full generator, providing a fast preview.\n\n![overview](https://hanlab18.mit.edu/projects/anycost-gan/images/overall.jpg)\n\n\n\nWith (1) sampling-based multi-resolution training, (2) adaptive-channel training, and (3) a generator-conditioned discriminator, we achieve high image quality and consistency at different resolutions and channels.\n\n![method](https://hanlab18.mit.edu/projects/anycost-gan/images/method_pad.gif)\n\n## Results\n\nAnycost GAN (uniform channel version) supports 4 resolutions and 4 channel ratios, producing visually consistent images with different image fidelity.\n\n![uniform](https://hanlab18.mit.edu/projects/anycost-gan/images/uniform.gif)\n\n\n\nThe consistency is retained during image projection and editing:\n\n![](https://hanlab18.mit.edu/projects/anycost-gan/images/teaser.jpg)\n\n![](https://hanlab18.mit.edu/projects/anycost-gan/images/editing.jpg)\n\n\n\n## Usage\n\n### Getting Started\n\n- Clone this repo:\n\n```bash\ngit clone https://github.com/mit-han-lab/anycost-gan.git\ncd anycost-gan\n```\n\n- Install PyTorch 1.7 and other dependencies.\n\nWe recommend setting up the environment using Anaconda: `conda env create -f environment.yml`\n\n\n\n### Introduction Notebook\n\nWe provide a Jupyter 
notebook example to show how to use the anycost generator for image synthesis at diverse costs: `notebooks/intro.ipynb`.\n\nWe also provide a colab version of the notebook: [![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mit-han-lab/anycost-gan/blob/master/notebooks/intro_colab.ipynb). Be sure to select the GPU as the accelerator in runtime options.\n\n\n\n### Interactive Demo\n\nWe provide an interactive demo showing how we can use anycost GAN to enable interactive image editing. To run the demo:\n\n```bash\npython demo.py\n```\n\nIf your computer contains a CUDA GPU, try running with:\n```bash\nFORCE_NATIVE=1 python demo.py\n```\n\nYou can find a video recording of the demo [here](https://youtu.be/_yEziPl9AkM?t=90).\n\n\n\n### Using Pre-trained Models\n\nTo get the pre-trained generator, encoder, and editing directions, run:\n\n```python\nimport models\n\npretrained_type = 'generator'  # choosing from ['generator', 'encoder', 'boundary']\nconfig_name = 'anycost-ffhq-config-f'  # replace the config name for other models\nmodels.get_pretrained(pretrained_type, config=config_name)\n```\n\nWe also provide the face attribute classifier (which is general for different generators) for computing the editing directions. You can get it by running:\n\n```python\nmodels.get_pretrained('attribute-predictor')\n```\n\nThe attribute classifier takes in the face images in FFHQ format.\n\n\n\nAfter loading the Anycost generator, we can run it at a wide range of computational costs. For example:\n\n```python\nfrom models.dynamic_channel import set_uniform_channel_ratio, reset_generator\n\ng = models.get_pretrained('generator', config='anycost-ffhq-config-f')  # anycost uniform\nset_uniform_channel_ratio(g, 0.5)  # set channel\ng.target_res = 512  # set resolution\nout, _ = g(...)  
# generate image\nreset_generator(g)  # restore the generator\n```\n\nFor detailed usage and the *flexible-channel* Anycost generator, please refer to `notebooks/intro.ipynb`.\n\n\n\n### Model Zoo\n\nCurrently, we provide the following pre-trained generators, encoders, and editing directions. We will add more in the future.\n\nFor Anycost generators, by default, we refer to the uniform setting.\n\n| config name                    | generator          | encoder            | edit direction     |\n| ------------------------------ | ------------------ | ------------------ | ------------------ |\n| anycost-ffhq-config-f          | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |\n| anycost-ffhq-config-f-flexible | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |\n| anycost-car-config-f           | :heavy_check_mark: |                    |                    |\n| stylegan2-ffhq-config-f        | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |\n\n`stylegan2-ffhq-config-f` refers to the official StyleGAN2 generator converted from the [repo](https://github.com/NVlabs/stylegan2).\n\n\n\n### Datasets\n\nWe prepare the [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA-HQ](https://github.com/switchablenorms/CelebAMask-HQ), and [LSUN Car](https://github.com/fyu/lsun) datasets into a directory of images, so that they can be easily used with `ImageFolder` from `torchvision`. The dataset layout looks like:\n\n```\n├── PATH_TO_DATASET\n│   ├── images\n│   │   ├── 00000.png\n│   │   ├── 00001.png\n│   │   ├── ...\n```\n\nDue to copyright restrictions, you need to download the datasets from their official sites and process them accordingly.\n\n\n\n### Evaluation\n\nWe provide the code to evaluate some metrics presented in the paper. Some of the code is written with [`horovod`](https://github.com/horovod/horovod) to support distributed evaluation and reduce the cost of inter-GPU communication, which greatly improves the speed. 
Check its website for a proper installation.\n\n#### Fréchet Inception Distance (FID)\n\nBefore evaluating the FIDs, you need to compute the inception features of the real images using scripts like:\n\n```bash\npython tools/calc_inception.py \\\n    --resolution 1024 --batch_size 64 -j 16 --n_sample 50000 \\\n    --save_name assets/inceptions/inception_ffhq_res1024_50k.pkl \\\n    PATH_TO_FFHQ\n```\n\nor you can download the pre-computed inceptions from [here](https://www.dropbox.com/sh/bc8a7ewlvcxa2cf/AAD8NFzDWKmBDpbLef-gGhRZa?dl=0) and put them under `assets/inceptions`.\n\nThen, you can evaluate the FIDs by running:\n\n```bash\nhorovodrun -np N_GPU \\\n    python metrics/fid.py \\\n    --config anycost-ffhq-config-f \\\n    --batch_size 16 --n_sample 50000 \\\n    --inception assets/inceptions/inception_ffhq_res1024_50k.pkl\n    # --channel_ratio 0.5 --target_res 512  # optionally using a smaller resolution/channel\n```\n\n#### Perceptual Path Length (PPL)\n\nSimilarly, evaluate the PPL with:\n\n```bash\nhorovodrun -np N_GPU \\\n    python metrics/ppl.py \\\n    --config anycost-ffhq-config-f\n```\n\n#### Attribute Consistency\n\nEvaluate the attribute consistency by running:\n\n```bash\nhorovodrun -np N_GPU \\\n    python metrics/attribute_consistency.py \\\n    --config anycost-ffhq-config-f \\\n    --channel_ratio 0.5 --target_res 512  # config for the sub-generator; necessary\n```\n\n#### Encoder Evaluation\n\nTo evaluate the performance of the encoder, run:\n\n```bash\npython metrics/eval_encoder.py \\\n    --config anycost-ffhq-config-f \\\n    --data_path PATH_TO_CELEBA_HQ\n```\n\n\n\n### Training\n\nWe provide the scripts to train Anycost GAN on the FFHQ dataset.\n\n- Training the original StyleGAN2 on FFHQ\n\n```\nhorovodrun -np 8 bash scripts/train_stylegan2_ffhq.sh\n```\n\nTraining the original StyleGAN2 is time-consuming. 
We recommend downloading the converted checkpoints from [here](https://www.dropbox.com/sh/l8g9amoduz99kjh/AAAY9LYZk2CnsO43ywDrLZpEa?dl=0) and placing them under `checkpoint/`.\n\n- Training Anycost GAN: multi-resolution\n\n```\nhorovodrun -np 8 bash scripts/train_stylegan2_multires_ffhq.sh\n```\n\nNote that after each epoch, we evaluate the FIDs of two resolutions (1024\u0026512) to better monitor the training progress. We also apply distillation to accelerate the convergence, which is not used for the ablation in the paper.\n\n- Training Anycost GAN: adaptive-channel\n\n```\nhorovodrun -np 8 bash scripts/train_stylegan2_multires_adach_ffhq.sh\n```\n\nHere we set a longer training epoch for a more stable reproduction, which might not be necessary (depending on the randomness).\n\n\n\n**Note**: We trained our models on Titan RTX GPUs with 24GB memory. For GPUs with smaller memory, you may need to reduce the resolution/model size/batch size/etc. and adjust other hyper-parameters accordingly.\n\n\n\n## Citation\n\nIf you use this code for your research, please cite our paper.\n\n```\n@inproceedings{lin2021anycost,\n  author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},\n  title     = {Anycost GANs for Interactive Image Synthesis and Editing},\n  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},\n  year      = {2021},\n}\n```\n\n\n\n## Related Projects\n\n**[GAN Compression](https://github.com/mit-han-lab/gan-compression) | [Once for All](https://github.com/mit-han-lab/once-for-all) | [iGAN](https://github.com/junyanz/iGAN) | [StyleGAN2](https://github.com/NVlabs/stylegan2)**\n\n\n\n## Acknowledgement\n\nWe thank Taesung Park, Zhixin Shu, Muyang Li, and Han Cai for the helpful discussion. 
Part of the work is supported by NSF CAREER Award #1943349, Adobe, SONY, Naver Corporation, and MIT-IBM Watson AI Lab.\n\nThe codebase is built upon a PyTorch implementation of StyleGAN2: [rosinality/stylegan2-pytorch](https://github.com/rosinality/stylegan2-pytorch). For editing direction extraction, we refer to [InterFaceGAN](https://github.com/genforce/interfacegan).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmit-han-lab%2Fanycost-gan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmit-han-lab%2Fanycost-gan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmit-han-lab%2Fanycost-gan/lists"}