{"id":19401088,"url":"https://github.com/google-research/hit-gan","last_synced_at":"2025-10-12T15:31:09.031Z","repository":{"id":41067287,"uuid":"437712724","full_name":"google-research/hit-gan","owner":"google-research","description":"Tensorflow implementation for \"Improved Transformer for High-Resolution GANs\" (NeurIPS 2021).","archived":false,"fork":false,"pushed_at":"2024-07-30T21:38:41.000Z","size":45,"stargazers_count":92,"open_issues_count":5,"forks_count":9,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-01-21T20:08:14.342Z","etag":null,"topics":["generative-adversarial-network","tensorflow","vision-transformer"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2106.07631","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-12-13T02:35:09.000Z","updated_at":"2024-10-13T02:55:30.000Z","dependencies_parsed_at":"2024-12-24T19:02:29.221Z","dependency_job_id":null,"html_url":"https://github.com/google-research/hit-gan","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fhit-gan","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fhit-gan/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-research%2Fhit-gan/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/google-research%2Fhit-gan/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-research","download_url":"https://codeload.github.com/google-research/hit-gan/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":236239242,"owners_count":19117154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["generative-adversarial-network","tensorflow","vision-transformer"],"created_at":"2024-11-10T11:17:08.546Z","updated_at":"2025-10-12T15:31:03.612Z","avatar_url":"https://github.com/google-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [HiT-GAN](https://arxiv.org/pdf/2106.07631.pdf) Official TensorFlow Implementation\n\nHiT-GAN presents a Transformer-based generator that is trained based on Generative Adversarial Networks (GANs). It achieves state-of-the-art performance for high-resolution image synthesis. Please check our NeurIPS 2021 paper \"[Improved Transformer for High-Resolution GANs](https://arxiv.org/pdf/2106.07631.pdf)\" for more details.\n\nThis implementation is based on TensorFlow 2.x. We use `tf.keras` layers for building the model and use `tf.data` for our input pipeline. The model is trained using a custom training loop with `tf.distribute` on multiple TPUs/GPUs.\n\n## Environment setup\n\nIt is recommended to run distributed training to train our model with TPUs and evaluate it with GPUs. The code is compatible with TensorFlow 2.x. 
See `requirements.txt` for all prerequisites; you can install them with the following command.\n\n```\npip install -r requirements.txt\n```\n\n## ImageNet\n\nFirst, download ImageNet by following the `tensorflow_datasets` instructions in the [official guide](https://www.tensorflow.org/datasets/catalog/imagenet2012).\n\n### Train on ImageNet\n\nTo pretrain the model on ImageNet with Cloud TPUs, first check out the [Google Cloud TPU tutorial](https://cloud.google.com/tpu/docs/tutorials/mnist) for basic information on how to use Google Cloud TPUs.\n\nOnce you have created a virtual machine with Cloud TPUs and pre-downloaded the ImageNet data for [tensorflow_datasets](https://www.tensorflow.org/datasets/catalog/imagenet2012), please set the following environment variables:\n\n```\nTPU_NAME=\u003ctpu-name\u003e\nSTORAGE_BUCKET=gs://\u003cstorage-bucket\u003e\nDATA_DIR=$STORAGE_BUCKET/\u003cpath-to-tensorflow-dataset\u003e\nMODEL_DIR=$STORAGE_BUCKET/\u003cpath-to-store-checkpoints\u003e\n```\n\nThe following command can be used to train a model on ImageNet (which reflects the default hyperparameters in our paper) on TPUv2 4x4:\n\n```\npython run.py --mode=train --dataset=imagenet2012 \\\n  --train_batch_size=256 --train_steps=1000000 \\\n  --image_crop_size=128 --image_crop_proportion=0.875 \\\n  --save_every_n_steps=2000 \\\n  --latent_dim=256 --generator_lr=0.0001 \\\n  --discriminator_lr=0.0001 --channel_multiplier=1 \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=True --master=$TPU_NAME\n```\n\nTo train the model on ImageNet with multiple GPUs, try the following command:\n\n```\npython run.py --mode=train --dataset=imagenet2012 \\\n  --train_batch_size=256 --train_steps=1000000 \\\n  --image_crop_size=128 --image_crop_proportion=0.875 \\\n  --save_every_n_steps=2000 \\\n  --latent_dim=256 --generator_lr=0.0001 \\\n  --discriminator_lr=0.0001 --channel_multiplier=1 \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=False 
--use_ema_model=False\n```\n\nPlease set `train_batch_size` according to the number of GPUs for training. __Note that storing Exponential Moving Average (EMA) models is currently not supported on GPUs (`--use_ema_model=False`), so training with GPUs will lead to a slight performance drop.__\n\n### Evaluate on ImageNet\n\nRun the following command to evaluate the model on GPUs:\n\n```\npython run.py --mode=eval --dataset=imagenet2012 \\\n  --eval_batch_size=128 --train_steps=1000000 \\\n  --image_crop_size=128 --image_crop_proportion=0.875 \\\n  --latent_dim=256 --channel_multiplier=1 \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=False --use_ema_model=True\n```\n\nThis command runs models with 8 P100 GPUs. Please set `eval_batch_size` according to the number of GPUs for evaluation. Please also note that `train_steps` and `use_ema_model` should match the values used for training.\n\n## CelebA-HQ\n\nFirst, download CelebA-HQ by following the `tensorflow_datasets` instructions in the [official guide](https://www.tensorflow.org/datasets/catalog/celeb_a_hq).\n\n### Train on CelebA-HQ\n\nThe following command can be used to train a model on CelebA-HQ (which reflects the default hyperparameters used for the resolution of 256 in our paper) on TPUv2 4x4:\n\n```\npython run.py --mode=train --dataset=celeb_a_hq/256 \\\n  --train_batch_size=256 --train_steps=250000 \\\n  --image_crop_size=256 --image_crop_proportion=1.0 \\\n  --save_every_n_steps=1000 \\\n  --latent_dim=512 --generator_lr=0.00005 \\\n  --discriminator_lr=0.00005 --channel_multiplier=2 \\\n  --use_consistency_regularization=True \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=True --master=$TPU_NAME\n```\n\n### Evaluate on CelebA-HQ\n\nRun the following command to evaluate the model on 8 P100 GPUs:\n\n```\npython run.py --mode=eval --dataset=celeb_a_hq/256 \\\n  --eval_batch_size=128 --train_steps=250000 \\\n  --image_crop_size=256 
--image_crop_proportion=1.0 \\\n  --latent_dim=512 --channel_multiplier=2 \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=False --use_ema_model=True\n```\n\n## FFHQ\n\nFirst, download the FFHQ tfrecords from the [official site](https://github.com/NVlabs/ffhq-dataset) and put them into `$DATA_DIR`.\n\n### Train on FFHQ\n\nThe following command can be used to train a model on FFHQ (which reflects the default hyperparameters used for the resolution of 256 in our paper) on TPUv2 4x4:\n\n```\npython run.py --mode=train --dataset=ffhq/256 \\\n  --train_batch_size=256 --train_steps=500000 \\\n  --image_crop_size=256 --image_crop_proportion=1.0 \\\n  --save_every_n_steps=1000 \\\n  --latent_dim=512 --generator_lr=0.00005 \\\n  --discriminator_lr=0.00005 --channel_multiplier=2 \\\n  --use_consistency_regularization=True \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=True --master=$TPU_NAME\n```\n\n### Evaluate on FFHQ\n\nRun the following command to evaluate the model on 8 P100 GPUs:\n\n```\npython run.py --mode=eval --dataset=ffhq/256 \\\n  --eval_batch_size=128 --train_steps=500000 \\\n  --image_crop_size=256 --image_crop_proportion=1.0 \\\n  --latent_dim=512 --channel_multiplier=2 \\\n  --data_dir=$DATA_DIR --model_dir=$MODEL_DIR \\\n  --use_tpu=False --use_ema_model=True\n```\n\n## Cite\n\n```\n@inproceedings{zhao2021improved,\n  title = {Improved Transformer for High-Resolution {GANs}},\n  author = {Long Zhao and Zizhao Zhang and Ting Chen and Dimitris Metaxas and Han Zhang},\n  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},\n  year = {2021}\n}\n```\n\n## Disclaimer\n\nThis is not an officially supported Google 
product.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fhit-gan","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-research%2Fhit-gan","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-research%2Fhit-gan/lists"}