{"id":13561862,"url":"https://github.com/danielroich/PTI","last_synced_at":"2025-04-03T17:31:58.976Z","repository":{"id":37609774,"uuid":"374950504","full_name":"danielroich/PTI","owner":"danielroich","description":"Official Implementation for \"Pivotal Tuning for Latent-based editing of Real Images\" (ACM TOG 2022) https://arxiv.org/abs/2106.05744","archived":false,"fork":false,"pushed_at":"2024-08-01T08:59:46.000Z","size":8628,"stargazers_count":905,"open_issues_count":22,"forks_count":114,"subscribers_count":24,"default_branch":"main","last_synced_at":"2024-11-04T13:38:09.863Z","etag":null,"topics":["gan-inversion","generative-adversarial-network","image-editing","stylegan","stylegan2-ada-pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/danielroich.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-08T09:11:10.000Z","updated_at":"2024-10-24T09:08:38.000Z","dependencies_parsed_at":"2022-08-08T21:01:04.310Z","dependency_job_id":"6ba94d58-69e4-4035-b31b-93061fad98eb","html_url":"https://github.com/danielroich/PTI","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danielroich%2FPTI","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danielroich%2FPTI/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danielroich%2FPTI/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/danielroich%2FPTI/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/danielroich","download_url":"https://codeload.github.com/danielroich/PTI/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247047086,"owners_count":20874768,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gan-inversion","generative-adversarial-network","image-editing","stylegan","stylegan2-ada-pytorch"],"created_at":"2024-08-01T13:01:02.036Z","updated_at":"2025-04-03T17:31:53.961Z","avatar_url":"https://github.com/danielroich.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","Learning"],"sub_categories":["GAN inversion (and editing)"],"readme":"# PTI: Pivotal Tuning for Latent-based editing of Real Images     (ACM TOG 2022)\n\n\u003c!-- \u003e Recently, a surge of advanced facial editing techniques have been proposed\nthat leverage the generative power of a pre-trained StyleGAN. To successfully\nedit an image this way, one must first project (or invert) the image into\nthe pre-trained generator’s domain. As it turns out, however, StyleGAN’s\nlatent space induces an inherent tradeoff between distortion and editability,\ni.e. between maintaining the original appearance and convincingly altering\nsome of its attributes. Practically, this means it is still challenging to\napply ID-preserving facial latent-space editing to faces which are out of the\ngenerator’s domain. In this paper, we present an approach to bridge this\ngap. 
Our technique slightly alters the generator, so that an out-of-domain\nimage is faithfully mapped into an in-domain latent code. The key idea is\npivotal tuning — a brief training process that preserves the editing quality\nof an in-domain latent region, while changing its portrayed identity and\nappearance. In Pivotal Tuning Inversion (PTI), an initial inverted latent code\nserves as a pivot, around which the generator is fined-tuned. At the same\ntime, a regularization term keeps nearby identities intact, to locally contain\nthe effect. This surgical training process ends up altering appearance features\nthat represent mostly identity, without affecting editing capabilities.\nTo supplement this, we further show that pivotal tuning can also adjust the\ngenerator to accommodate a multitude of faces, while introducing negligible\ndistortion on the rest of the domain. We validate our technique through\ninversion and editing metrics, and show preferable scores to state-of-the-art\nmethods. We further qualitatively demonstrate our technique by applying\nadvanced edits (such as pose, age, or expression) to numerous images of\nwell-known and recognizable identities. Finally, we demonstrate resilience\nto harder cases, including heavy make-up, elaborate hairstyles and/or headwear,\nwhich otherwise could not have been successfully inverted and edited\nby state-of-the-art methods. 
--\u003e\n\n\u003ca href=\"https://arxiv.org/abs/2106.05744\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2008.00951-b31b1b.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://opensource.org/licenses/MIT\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-MIT-yellow.svg\"\u003e\u003c/a\u003e  \nInference Notebook: \u003ca href=\"https://colab.research.google.com/github/danielroich/PTI/blob/main/notebooks/inference_playground.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=20\u003e\u003c/a\u003e  \n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/teaser.jpg\"/\u003e  \n\u003cbr\u003e\nPivotal Tuning Inversion (PTI) enables employing off-the-shelf latent based\nsemantic editing techniques on real images using StyleGAN. \nPTI excels in identity preserving edits, portrayed through recognizable figures —\nSerena Williams and Robert Downey Jr. (top), and in handling faces which\nare clearly out-of-domain, e.g., due to heavy makeup (bottom).\n\u003c/br\u003e\n\u003c/p\u003e\n\n## Description   \nOfficial implementation of our PTI paper, plus code for evaluation metrics. PTI introduces an optimization mechanism for solving the StyleGAN inversion task,\nproviding near-perfect reconstruction results while maintaining the high editing abilities of the native StyleGAN latent space W. For more details, see \u003ca href=\"https://arxiv.org/abs/2106.05744\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2008.00951-b31b1b.svg\"\u003e\u003c/a\u003e\n\n## Recent Updates\n**2021.07.01**: Fixed the file download phase in the inference notebook, which might have caused the notebook not to run smoothly.\n\n**2021.06.29**: Added support for CPU. 
To run PTI on CPU, change the `device` parameter in `configs/global_config.py` to \"cpu\" instead of \"cuda\".\n\n**2021.06.25**: Added the mohawk edit using StyleCLIP+PTI to the inference notebook.\n\t      Updated the documentation in the inference notebook because the Google Drive rate limit was reached.\n\t      Currently, Google Drive does not allow the pretrained models to be downloaded automatically from Colab. Manual intervention might be needed.\n\n## Getting Started\n### Prerequisites\n- Linux or macOS\n- NVIDIA GPU + CUDA CuDNN (not mandatory but recommended)\n- Python 3\n\n### Installation\n- Dependencies:  \n\t1. lpips\n\t2. wandb\n\t3. pytorch (pip package name: `torch`)\n\t4. torchvision\n\t5. matplotlib\n\t6. dlib\n- All dependencies can be installed using *pip install* and the package name\n\n## Pretrained Models\nPlease download the pretrained models from the following links.\n\n### Auxiliary Models\nWe provide various auxiliary models needed for the PTI inversion task.  \nThis includes the StyleGAN generator and pre-trained models used for loss computation.\n| Path | Description\n| :--- | :----------\n|[FFHQ StyleGAN](https://nvlabs-fi-cdn.nvidia.com/stylegan2-ada-pytorch/pretrained/ffhq.pkl) | StyleGAN2-ada model trained on FFHQ with 1024x1024 output resolution.\n|[Dlib alignment](https://drive.google.com/file/d/1HKmjg6iXsWr4aFPuU0gBXPGR83wqMzq7/view?usp=sharing) | Dlib alignment used for image preprocessing.\n|[FFHQ e4e encoder](https://drive.google.com/file/d/1ALC5CLA89Ouw40TwvxcwebhzWXM5YSCm/view?usp=sharing) | Pretrained e4e encoder. Used for StyleCLIP editing.\n\nNote: The StyleGAN model is used directly from the official [stylegan2-ada-pytorch implementation](https://github.com/NVlabs/stylegan2-ada-pytorch).\nFor StyleCLIP pretrained mappers, please see [StyleCLIP's official routes](https://github.com/orpatashnik/StyleCLIP/blob/main/utils.py)\n\n\nBy default, we assume that all auxiliary models are downloaded and saved to the directory `pretrained_models`. 
\nHowever, you may use your own paths by changing the necessary values in `configs/paths_config.py`. \n\n\n## Inversion\n### Preparing your Data\nTo invert a real image and edit it, you should first align and crop it to the correct size. To do so, perform *one* of the following steps: \n1. Change the \"images_path\" variable in `notebooks/align_data.ipynb` to the raw images path and run the notebook\n2. Change the \"images_path\" variable in `utils/align_data.py` to the raw images path and run the script\n\n\n### Weights And Biases\nThe project supports the [Weights And Biases](https://wandb.ai/home) framework for experiment tracking. For the inversion task it enables visualization of the loss progression and the generator's intermediate results during the initial inversion and the *Pivotal Tuning* (PT) procedure.\n\nThe log frequency can be adjusted using the parameters defined in `configs/global_config.py` under the \"Logs\" subsection.\n\nThere is no need to have an account to run PTI. However, in order to use the features provided by Weights and Biases you first have to register on their site.\n\n\n### Running PTI\nThe main training script is `scripts/run_pti.py`. The script receives aligned and cropped images from paths configured in the \"Input info\" subsection in\n `configs/paths_config.py`. \nResults are saved to the directories listed under \"Dirs for output files\" in `configs/paths_config.py`. This includes inversion latent codes and tuned generators. \nThe hyperparameters for the inversion task can be found in `configs/hyperparameters.py`. They are initialized to the default values used in the paper. \n\n## Editing\nBy default, we assume that all auxiliary edit directions are downloaded and saved to the directory `editings`. 
\nHowever, you may use your own paths by changing the necessary values in `configs/paths_config.py` under the \"Edit directions\" subsection.\n\nAn example of editing code can be found at `scripts/latent_editor_wrapper.py`.\n\n## Inference Notebooks\nTo help visualize the results of PTI we provide a Jupyter notebook found in `notebooks/inference_playground.ipynb`.   \nThe notebook will download the pretrained models and run inference on a sample image found online or \non images of your choosing. It is recommended to run this in [Google Colab](https://colab.research.google.com/github/danielroich/PTI/blob/main/notebooks/inference_playground.ipynb).\n\nThe notebook demonstrates how to:\n- Invert an image using PTI\n- Visualise the inversion and use the PTI output\n- Edit the image after PTI using InterfaceGAN and StyleCLIP\n- Compare to other inversion methods\n\n## Evaluation\nCurrently, the repository supports qualitative evaluation of reconstruction for PTI, SG2 (*W Space*), e4e, and SG2Plus (*W+ Space*), \nas well as of editing using InterfaceGAN and GANSpace for the same inversion methods.\nTo run the evaluation please see `evaluation/qualitative_edit_comparison.py`. Examples of the evaluation script's outputs are:\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/model_rec.jpg\"/\u003e  \n\u003cbr\u003e\nReconstruction comparison between different methods. The image order is: Original image, W+ inversion, e4e inversion, W inversion, PTI inversion\n\u003c/br\u003e  \n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/stern_rotation.jpg\"/\u003e  \n\u003cbr\u003e\nInterfaceGAN pose edit comparison between different methods. 
The image order is: Original, W+, e4e, W, PTI\n\u003c/br\u003e  \n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"docs/tyron_original.jpg\" width=\"220\" height=\"220\"/\u003e  \n\u003cimg src=\"docs/tyron_edit.jpg\" width=\"220\" height=\"220\"/\u003e\n\u003cbr\u003e\nImage per edit or several edits without comparison\n\u003c/br\u003e  \n\u003c/p\u003e\n\n###  Coming Soon - Quantitative evaluation and StyleCLIP qualitative evaluation\n\n## Repository structure\n| Path | Description \u003cimg width=200\u003e\n| :--- | :---\n| \u0026boxvr;\u0026nbsp; configs | Folder containing configs defining hyperparameters, paths and logging\n| \u0026boxvr;\u0026nbsp; criteria | Folder containing various loss and regularization criteria for the optimization\n| \u0026boxvr;\u0026nbsp; dnnlib | Folder containing internal utils for StyleGAN2-ada\n| \u0026boxvr;\u0026nbsp; docs | Folder containing images displayed in the README\n| \u0026boxvr;\u0026nbsp; editings | Folder containing the latent space edit directions\n| \u0026boxvr;\u0026nbsp; environment | Folder containing the Anaconda environment used in our experiments\n| \u0026boxvr;\u0026nbsp; licenses | Folder containing licenses of the open source projects used in this repository\n| \u0026boxvr;\u0026nbsp; models | Folder containing models used in different editing techniques and first phase inversion\n| \u0026boxvr;\u0026nbsp; notebooks | Folder with Jupyter notebooks to demonstrate the usage of PTI end-to-end\n| \u0026boxvr;\u0026nbsp; scripts | Folder with running scripts for inversion, editing and metric computations\n| \u0026boxvr;\u0026nbsp; torch_utils | Folder containing internal utils for StyleGAN2-ada\n| \u0026boxvr;\u0026nbsp; training | Folder containing the core training logic of PTI\n| \u0026boxvr;\u0026nbsp; utils | Folder with various utility functions\n\n\n## Credits\n**StyleGAN2-ada model and implementation:**  \nhttps://github.com/NVlabs/stylegan2-ada-pytorch\nCopyright © 2021, NVIDIA 
Corporation.  \nNvidia Source Code License https://nvlabs.github.io/stylegan2-ada-pytorch/license.html\n\n**LPIPS model and implementation:**  \nhttps://github.com/richzhang/PerceptualSimilarity  \nCopyright (c) 2020, Sou Uchida  \nLicense (BSD 2-Clause) https://github.com/richzhang/PerceptualSimilarity/blob/master/LICENSE\n\n**e4e model and implementation:**   \nhttps://github.com/omertov/encoder4editing\nCopyright (c) 2021 omertov  \nLicense (MIT) https://github.com/omertov/encoder4editing/blob/main/LICENSE\n\n**StyleCLIP model and implementation:**   \nhttps://github.com/orpatashnik/StyleCLIP\nCopyright (c) 2021 orpatashnik  \nLicense (MIT) https://github.com/orpatashnik/StyleCLIP/blob/main/LICENSE\n\n**InterfaceGAN implementation:**   \nhttps://github.com/genforce/interfacegan\nCopyright (c) 2020 genforce  \nLicense (MIT) https://github.com/genforce/interfacegan/blob/master/LICENSE\n\n**GANSpace implementation:**   \nhttps://github.com/harskish/ganspace\nCopyright (c) 2020 harkish  \nLicense (Apache License 2.0) https://github.com/harskish/ganspace/blob/master/LICENSE\n\n\n## Acknowledgments\nThis repository structure is based on [encoder4editing](https://github.com/omertov/encoder4editing) and [ReStyle](https://github.com/yuval-alaluf/restyle-encoder) repositories\n\n## Contact\nFor any inquiry please contact us at our email addresses: danielroich@gmail.com or ron.mokady@gmail.com\n\n\n## Citation\nIf you use this code for your research, please cite:\n```\n@article{roich2021pivotal,\n  title={Pivotal Tuning for Latent-based Editing of Real Images},\n  author={Roich, Daniel and Mokady, Ron and Bermano, Amit H and Cohen-Or, Daniel},\n  publisher = {Association for Computing Machinery},\n  journal={ACM Trans. 
Graph.},\n  year={2021}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanielroich%2FPTI","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanielroich%2FPTI","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanielroich%2FPTI/lists"}