{"id":14959587,"url":"https://github.com/compvis/net2net","last_synced_at":"2025-06-25T03:34:45.894Z","repository":{"id":65983455,"uuid":"306027700","full_name":"CompVis/net2net","owner":"CompVis","description":"Network-to-Network Translation with Conditional Invertible Neural Networks","archived":false,"fork":false,"pushed_at":"2022-12-20T12:11:14.000Z","size":78844,"stargazers_count":226,"open_issues_count":7,"forks_count":21,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-24T00:27:31.362Z","etag":null,"topics":["autoencoders","gans","generative-model","inn","lightning","normalizing-flows","pytorch","pytorch-lightning","streamlit"],"latest_commit_sha":null,"homepage":"https://compvis.github.io/net2net/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CompVis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-21T13:07:40.000Z","updated_at":"2025-04-19T13:27:31.000Z","dependencies_parsed_at":"2023-02-19T19:31:25.851Z","dependency_job_id":null,"html_url":"https://github.com/CompVis/net2net","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CompVis/net2net","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fnet2net","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fnet2net/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fnet2net/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fnet2net/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CompVis","download_url":"https://codeload.github.com/CompVis/net2net/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CompVis%2Fnet2net/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261799256,"owners_count":23211362,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autoencoders","gans","generative-model","inn","lightning","normalizing-flows","pytorch","pytorch-lightning","streamlit"],"created_at":"2024-09-24T13:20:07.296Z","updated_at":"2025-06-25T03:34:45.870Z","avatar_url":"https://github.com/CompVis.png","language":"Python","readme":"# Net2Net\nCode accompanying the NeurIPS 2020 oral paper\n\n[**Network-to-Network Translation with Conditional Invertible Neural Networks**](https://compvis.github.io/net2net/)\u003cbr/\u003e\n[Robin Rombach](https://github.com/rromb)\\*,\n[Patrick Esser](https://github.com/pesser)\\*,\n[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)\u003cbr/\u003e\n\\* equal contribution\n\n**tl;dr** Our approach distills the residual information of one model with respect to\nanother's and thereby enables 
translation between fixed off-the-shelf expert
models such as BERT and BigGAN without having to modify or finetune them.

![teaser](assets/teaser.png)
[arXiv](https://arxiv.org/abs/2005.13580) | [BibTeX](#bibtex) | [Project Page](https://compvis.github.io/net2net/)

**News Dec 19th, 2020**: added SBERT-to-BigGAN, SBERT-to-BigBiGAN and SBERT-to-AE (COCO)

## Requirements
A suitable [conda](https://conda.io/) environment named `net2net` can be created
and activated with:

```
conda env create -f environment.yaml
conda activate net2net
```

## Datasets
- **CelebA**: Create a symlink `data/CelebA` pointing to a folder which contains the following files:
  ```
  .
  ├── identity_CelebA.txt
  ├── img_align_celeba
  ├── list_attr_celeba.txt
  └── list_eval_partition.txt
  ```
  These files can be obtained [here](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html).
- **CelebA-HQ**: Create a symlink `data/celebahq` pointing to a folder containing
  the `.npy` files of CelebA-HQ (instructions to obtain them can be found in
  the [PGGAN repository](https://github.com/tkarras/progressive_growing_of_gans)).
- **FFHQ**: Create a symlink `data/ffhq` pointing to the `images1024x1024` folder
  obtained from the [FFHQ repository](https://github.com/NVlabs/ffhq-dataset).
- **Anime Faces**: First download the face images from the [Anime Crop dataset](https://www.gwern.net/Crops) and then apply
  the preprocessing of [FFHQ](https://github.com/NVlabs/ffhq-dataset) to those images. We only keep images
  where the underlying [dlib face recognition model](http://dlib.net/face_landmark_detection.py.html) recognizes
  a face. Finally, create a symlink `data/anime` which contains the processed anime face images.
- **Oil Portraits**: [Download here.](https://heibox.uni-heidelberg.de/f/4f35bdc16eea4158aa47/?dl=1)
  Unpack the content and place the files in `data/portraits`. The set consists of
  18k oil portraits, which were obtained by running [dlib](http://dlib.net/face_landmark_detection.py.html) on a subset of the [WikiArt dataset](https://www.wikiart.org/),
  kindly provided by [A Style-Aware Content Loss for Real-time HD Style Transfer](https://github.com/CompVis/adaptive-style-transfer).
- **COCO**: Create a symlink `data/coco` containing the images from the 2017
  split in `train2017` and `val2017`, and their annotations in `annotations`.
  Files can be obtained from the [COCO webpage](https://cocodataset.org).

## ML4Creativity Demo
We include a [streamlit](https://www.streamlit.io/) demo, which utilizes our
approach to demonstrate biases of datasets and their creative applications.
More information can be found in our paper [A Note on Data Biases in Generative
Models](https://drive.google.com/file/d/1PGhBTEAgj2A_FnYMk_1VU-uOxcWY076B/view?usp=sharing) from the [Machine Learning for Creativity and Design](https://neurips2020creativity.github.io/) workshop at [NeurIPS 2020](https://nips.cc/Conferences/2020). Download the models from

- [2020-11-30T23-32-28_celeba_celebahq_ffhq_256](https://k00.fr/lro927bu)
- [2020-12-02T13-58-19_anime_photography_256](https://heibox.uni-heidelberg.de/d/075e81e16de948aea7a1/)
- [2020-12-02T16-19-39_portraits_photography_256](https://k00.fr/y3rvnl3j)

and place them into `logs`. Run the demo with

```
streamlit run ml4cad.py
```

## Training
Our code uses [Pytorch-Lightning](https://www.pytorchlightning.ai/) and thus natively supports
things like 16-bit precision, multi-GPU training and gradient accumulation.
Training details for any model need to be specified in a dedicated `.yaml` file.
In general, such a config file is structured as follows:
```
model:
  base_learning_rate: 4.5e-6
  target: <path/to/lightning/module>
  params:
    ...
data:
  target: translation.DataModuleFromConfig
  params:
    batch_size: ...
    num_workers: ...
    train:
      target: <path/to/train/dataset>
      params:
        ...
    validation:
      target: <path/to/validation/dataset>
      params:
        ...
```
Any Pytorch-Lightning model specified under `model.target` is then trained on the specified data
by running the command:
```
python translation.py --base <path/to/yaml> -t --gpus 0,
```
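Each `target` entry is a dotted import path, and the accompanying `params` are passed as keyword arguments to that class's constructor. The sketch below illustrates this pattern with a plain dict; the helper names are illustrative and may not match the repository's internals.

```python
# Minimal sketch (hypothetical helper names) of how a `target`/`params`
# block from the yaml can be turned into a live object.
import importlib


def get_obj_from_str(path: str):
    """Resolve 'package.module.ClassName' to the class object."""
    module, cls = path.rsplit(".", 1)
    return getattr(importlib.import_module(module), cls)


def instantiate_from_config(config: dict):
    """Build the object described by a {'target': ..., 'params': {...}} block."""
    return get_obj_from_str(config["target"])(**config.get("params", {}))


# Example: the `data` block of a config, written here as a plain dict.
data_config = {
    "target": "translation.DataModuleFromConfig",
    "params": {"batch_size": 8, "num_workers": 4},
}
# data_module = instantiate_from_config(data_config)  # requires the repo on PYTHONPATH
```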
All available Pytorch-Lightning [trainer](https://pytorch-lightning.readthedocs.io/en/stable/trainer.html) arguments can be added via the command line, e.g. run
```
python translation.py --base <path/to/yaml> -t --gpus 0,1,2,3 --precision 16 --accumulate_grad_batches 2
```
to train a model on 4 GPUs using 16-bit precision and 2-step gradient accumulation.
More details are provided in the examples below.

### Training a cINN
Training a cINN for network-to-network translation usually utilizes the Lightning module `net2net.models.flows.flow.Net2NetFlow`
and makes a few further assumptions on the configuration file and model interface:
```
model:
  base_learning_rate: 4.5e-6
  target: net2net.models.flows.flow.Net2NetFlow
  params:
    flow_config:
      target: <path/to/cinn>
      params:
        ...

    cond_stage_config:
      target: <path/to/network1>
      params:
        ...

    first_stage_config:
      target: <path/to/network2>
      params:
        ...
```
Here, the entries under `flow_config` specify the architecture and parameters of the conditional INN;
`cond_stage_config` specifies the first network, whose representation is to be translated into another network
specified by `first_stage_config`. Our model `net2net.models.flows.flow.Net2NetFlow` expects that the first
network has an `encode()` method which produces the representation of interest, while the second network should
have an `encode()` and a `decode()` method, such that both of them applied sequentially produce the network's output. This allows for a modular combination of arbitrary models of interest. For more details, see the examples below.
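As a reference for plugging in your own experts, here is a minimal, hypothetical sketch of two wrapper modules exposing exactly this interface; the class names are illustrative and not part of the repository.

```python
# Sketch of the interface the flow expects from its two experts.
import torch
import torch.nn as nn


class CondStageExpert(nn.Module):
    """Expert 1: only needs an encode() producing the conditioning representation."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone

    @torch.no_grad()
    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)


class FirstStageExpert(nn.Module):
    """Expert 2: encode() to the latent the cINN operates on, decode() back to outputs."""

    def __init__(self, encoder: nn.Module, decoder: nn.Module):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)

    def decode(self, z: torch.Tensor) -> torch.Tensor:
        return self.decoder(z)


# decode(encode(x)) should reproduce the second network's output, e.g.:
expert2 = FirstStageExpert(nn.Identity(), nn.Identity())
assert torch.equal(expert2.decode(expert2.encode(torch.ones(2, 3))), torch.ones(2, 3))
```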
### Training a cINN - Superresolution
![superres](assets/superresolutionfigure.png)
Training details for a cINN that concatenates two autoencoders from different image scales for stochastic
superresolution are specified in `configs/translation/faces32-to-faces256.yaml`.

To train a model for translating from 32 x 32 images to 256 x 256 images on GPU 0, run
```
python translation.py --base configs/translation/faces32-to-faces256.yaml -t --gpus 0,
```
and specify any additional training arguments as described above. Note that this setup requires two
pretrained autoencoder models, one on 32 x 32 images and the other on 256 x 256 images. If you want to
train them yourself on a combination of FFHQ and CelebA-HQ, run
```
python translation.py --base configs/autoencoder/faces32.yaml -t --gpus <n>,
```
for the 32 x 32 images, and
```
python translation.py --base configs/autoencoder/faces256.yaml -t --gpus <n>,
```
for the model on 256 x 256 images. After training, adopt the corresponding model paths in `configs/translation/faces32-to-faces256.yaml`. Additionally, we provide weights of pretrained autoencoders for both settings:
[Weights 32x32](https://heibox.uni-heidelberg.de/f/b0b103af8406467abe48/); [Weights 256x256](https://k00.fr/94lw2vlg).
To run the training as described above, put them into
`logs/2020-10-16T17-11-42_FacesFQ32x32/checkpoints/last.ckpt` and
`logs/2020-09-16T16-23-39_FacesXL256z128/checkpoints/last.ckpt`, respectively.

### Training a cINN - Unpaired Translation
![unpaired translation](assets/unpairedtranslationfigure.png)
All training scenarios for unpaired translation are specified in the configs in `configs/creativity`.
We provide code and pretrained autoencoder models for three different translation tasks:
- **Anime** ⟷ **Photography**; see `configs/creativity/anime_photography_256.yaml`.
  Download the autoencoder checkpoint ([Download Anime+Photography](https://heibox.uni-heidelberg.de/f/315c628c8b0e40238132/)) and place it into `logs/2020-09-30T21-40-22_AnimeAndFHQ/checkpoints/epoch=000007.ckpt`.
- **Oil-Portrait** ⟷ **Photography**; see `configs/creativity/portraits_photography_256.yaml`.
  Download the autoencoder checkpoint ([Download Portrait+Photography](https://heibox.uni-heidelberg.de/f/4f9449418a2e4025bb5f/)) and place it into `logs/2020-09-29T23-47-10_PortraitsAndFFHQ/checkpoints/epoch=000004.ckpt`.
- **FFHQ** ⟷ **CelebA-HQ** ⟷ **CelebA**; see `configs/creativity/celeba_celebahq_ffhq_256.yaml`.
  Download the autoencoder checkpoint ([Download FFHQ+CelebAHQ+CelebA](https://k00.fr/94lw2vlg)) and place it into `logs/2020-09-16T16-23-39_FacesXL256z128/checkpoints/last.ckpt`.
  Note that this is the same autoencoder checkpoint as for the stochastic superresolution experiment.

To train a cINN on one of these unpaired transfer tasks using the first GPU, simply run
```
python translation.py --base configs/creativity/<task-of-interest>.yaml -t --gpus 0,
```
where `<task-of-interest>.yaml` is one of `portraits_photography_256.yaml`, `celeba_celebahq_ffhq_256.yaml`
or `anime_photography_256.yaml`. Providing additional arguments to the Pytorch-Lightning
trainer object is also possible as described above.

In our framework, unpaired translation between domains is formulated as a
translation between expert 1, a model which can infer the domain a given image
belongs to, and expert 2, a model which can synthesize images of each domain.
In the examples provided, we assume that the domain label comes with the
dataset and provide the `net2net.modules.labels.model.Labelator` module, which
simply returns a one-hot encoding of this label. However, one could also use a
classification model which infers the domain label from the image itself.
For expert 2, our examples use an autoencoder trained jointly on all domains,
which is easily achieved by concatenating the datasets. The provided
`net2net.data.base.ConcatDatasetWithIndex` concatenates datasets and returns
the corresponding dataset label for each example, which can then be used by the
`Labelator` class for the translation; see the sketch below. The training configurations for the
autoencoders used in the creativity experiments are included in
`configs/autoencoder/anime_photography_256.yaml`,
`configs/autoencoder/celeba_celebahq_ffhq_256.yaml` and
`configs/autoencoder/portraits_photography_256.yaml`.
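The sketch below illustrates both ideas with simplified, self-contained stand-ins; the actual implementations in `net2net.data.base.ConcatDatasetWithIndex` and `net2net.modules.labels.model.Labelator` may differ in detail.

```python
# Simplified stand-ins for the dataset-concatenation and one-hot labeling idea.
import bisect

import torch
from torch.utils.data import Dataset


class ConcatWithIndex(Dataset):
    """Concatenate datasets and return (example, dataset_index) pairs."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        self.offsets = []
        total = 0
        for d in self.datasets:
            total += len(d)
            self.offsets.append(total)  # cumulative end index of each dataset

    def __len__(self):
        return self.offsets[-1]

    def __getitem__(self, idx):
        d_idx = bisect.bisect_right(self.offsets, idx)        # which dataset
        prev = 0 if d_idx == 0 else self.offsets[d_idx - 1]   # its start offset
        return self.datasets[d_idx][idx - prev], d_idx


class OneHotLabelator:
    """Turn the integer domain label into a one-hot conditioning vector."""

    def __init__(self, num_domains: int):
        self.num_domains = num_domains

    def encode(self, label):
        return torch.nn.functional.one_hot(
            torch.as_tensor(label), num_classes=self.num_domains
        ).float()
```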
#### Unpaired Translation on Custom Datasets
Create pytorch datasets for each
of your domains, create a concatenated dataset with `ConcatDatasetWithIndex`
(follow the example in `net2net.data.faces.CCFQTrain`), train an
autoencoder on the concatenated dataset (adjust the `data` section in
`configs/autoencoder/celeba_celebahq_ffhq_256.yaml`) and finally train a
net2net translation model between a `Labelator` and your autoencoder (adjust
the sections `data` and `first_stage_config` in
`configs/creativity/celeba_celebahq_ffhq_256.yaml`). You can then also add your
new model to the available modes in the `ml4cad.py` demo to visualize the
results.

### Training a cINN - Text-to-Image
![texttoimage](assets/texttoimage.jpg)
We provide code to obtain a text-to-image model by translating between a text
model ([SBERT](https://www.sbert.net/)) and an image decoder. To show the
flexibility of our approach, we include code for three different
decoders: BigGAN, as described in the paper,
[BigBiGAN](https://deepmind.com/research/open-source/BigBiGAN-Large-Scale-Adversarial-Representation-Learning),
which is only available as a [tensorflow](https://www.tensorflow.org/) model
and thus nicely shows how our approach can work with black-box experts, and an
autoencoder.

#### SBERT-to-BigGAN
Train with
```
python translation.py --base configs/translation/sbert-to-biggan256.yaml -t --gpus 0,
```
When running it for the first time, the required models will be downloaded
automatically.

#### SBERT-to-BigBiGAN
Since BigBiGAN is only available on
[tensorflow-hub](https://tfhub.dev/s?q=bigbigan), this example has an
additional dependency on tensorflow. A suitable environment is provided in
`env_bigbigan.yaml`, and you will need COCO for training. You can then start
training with
```
python translation.py --base configs/translation/sbert-to-bigbigan.yaml -t --gpus 0,
```
Note that the `BigBiGAN` class is just a naive wrapper: it converts pytorch
tensors to numpy arrays, feeds them to the tensorflow graph and converts
the result back to pytorch tensors. It does not require gradients of the expert
model and serves as a good example of how to use black-box experts.
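The bridging idea generalizes beyond BigBiGAN. Below is a minimal, hypothetical sketch of such a gradient-free wrapper; the repository's `BigBiGAN` class handles the tensorflow specifics differently, and the names here are illustrative only.

```python
# Generic torch <-> numpy bridge for a black-box expert (no gradients needed).
import numpy as np
import torch
import torch.nn as nn


class BlackBoxDecoder(nn.Module):
    """Wrap a numpy-in/numpy-out generator so it looks like a torch decoder."""

    def __init__(self, numpy_generator, device="cpu"):
        super().__init__()
        self.numpy_generator = numpy_generator  # e.g. a call into a TF-Hub model
        self.device = device

    @torch.no_grad()
    def decode(self, z: torch.Tensor) -> torch.Tensor:
        z_np = z.detach().cpu().numpy()       # leave the autograd graph
        out_np = self.numpy_generator(z_np)   # black-box forward pass
        return torch.from_numpy(np.asarray(out_np)).to(self.device)


# Usage with a dummy stand-in for the external model:
if __name__ == "__main__":
    dummy = lambda z: np.tanh(z @ np.random.randn(z.shape[-1], 3 * 8 * 8))
    decoder = BlackBoxDecoder(dummy)
    fake_images = decoder.decode(torch.randn(4, 120))  # torch tensor of shape (4, 192)
```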
#### SBERT-to-AE
Similarly to the other examples, you can also train your own autoencoder on
COCO with
```
python translation.py --base configs/autoencoder/coco256.yaml -t --gpus 0,
```
or [download a pre-trained one](https://k00.fr/fbti4058), and translate
to it by running
```
python translation.py --base configs/translation/sbert-to-ae-coco256.yaml -t --gpus 0,
```

## Shout-outs
Thanks to everyone who makes their code and models available.

- BigGAN code and weights from: [LoreGoetschalckx/GANalyze](https://github.com/LoreGoetschalckx/GANalyze)
- Code and weights for the captioning model: [https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning](https://github.com/sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning)

## BibTeX

```
@misc{rombach2020networktonetwork,
      title={Network-to-Network Translation with Conditional Invertible Neural Networks},
      author={Robin Rombach and Patrick Esser and Björn Ommer},
      year={2020},
      eprint={2005.13580},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

```
@misc{esser2020note,
      title={A Note on Data Biases in Generative Models},
      author={Patrick Esser and Robin Rombach and Björn Ommer},
      year={2020},
      eprint={2012.02516},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```