{"id":20213597,"url":"https://github.com/adobe-research/sam_inversion","last_synced_at":"2025-08-14T12:37:24.072Z","repository":{"id":37318109,"uuid":"503934557","full_name":"adobe-research/sam_inversion","owner":"adobe-research","description":"[CVPR 2022] GAN inversion and editing with spatially-adaptive multiple latent layers ","archived":false,"fork":false,"pushed_at":"2023-01-21T18:08:00.000Z","size":46733,"stargazers_count":173,"open_issues_count":8,"forks_count":10,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-06-02T22:03:12.290Z","etag":null,"topics":["computer-graphics","computer-vision","deep-learning","gan","generative-adversarial-network","image-manipulation","machine-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adobe-research.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-06-15T22:01:13.000Z","updated_at":"2025-04-04T06:32:43.000Z","dependencies_parsed_at":"2023-02-12T12:01:17.469Z","dependency_job_id":null,"html_url":"https://github.com/adobe-research/sam_inversion","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/adobe-research/sam_inversion","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adobe-research%2Fsam_inversion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adobe-research%2Fsam_inversion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adobe-research%2Fsam_inversion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/adobe-research%2Fsam_inversion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/adobe-research","download_url":"https://codeload.github.com/adobe-research/sam_inversion/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adobe-research%2Fsam_inversion/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270421749,"owners_count":24580814,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-14T02:00:10.309Z","response_time":75,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-graphics","computer-vision","deep-learning","gan","generative-adversarial-network","image-manipulation","machine-learning","pytorch"],"created_at":"2024-11-14T06:10:17.838Z","updated_at":"2025-08-14T12:37:24.010Z","avatar_url":"https://github.com/adobe-research.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spatially-Adaptive Multilayer (SAM) GAN Inversion\n\n[**Project Page**](https://www.cs.cmu.edu/~SAMInversion/) | [**Paper**](https://arxiv.org/abs/2206.08357)\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"resources/teaser_prepped.gif\" /\u003e\n\u003c/p\u003e\n\nWe provide a PyTorch implementation of GANs projection using multilayer latent codes. 
Choosing a single latent layer for GAN inversion leads to a dilemma between obtaining a faithful reconstruction\nof the input image and being able to perform downstream edits (1st and 2nd row).\nIn contrast, our proposed method automatically selects the latent space tailored for each region to balance the reconstruction\nquality and editability (3rd row). \u003cbr\u003e\n\n[**Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing**](https://arxiv.org/abs/2206.08357) \u003cbr\u003e\n[Gaurav Parmar](https://gauravparmar.com/), [Yijun Li](https://yijunmaverick.github.io/), [Jingwan Lu](https://research.adobe.com/person/jingwan-lu/), [Richard Zhang](http://richzhang.github.io/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/), [Krishna Kumar Singh](http://krsingh.cs.ucdavis.edu/) \u003cbr\u003e\nCMU, Adobe Research \u003cbr\u003e\nCVPR 2022 \u003cbr\u003e\n\n\n### Image Formation with Multiple Latent Codes\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"resources/image_formation.png\" /\u003e\n\u003c/p\u003e\nWe use the predicted invertibility map in conjunction with multiple latent codes to generate the final image.\nFirst, the StyleBlocks of the pretrained StyleGAN2 model are modulated by W+ directly.\nSubsequently, for intermediate feature space Fi, we predict the change in the layer’s feature value\n∆Fi and add it to the feature block after masking with the corresponding binary mask mi.\n\n\n### Predicting the Invertibility Map\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"resources/invertibility_map.png\" style=\"width:75%;\"/\u003e\n\u003c/p\u003e\nWe begin with predicting how difficult each region of the image is to invert for every latent layer using our trained\ninvertibility network. Subsequently, we refine the predicted map using a semantic segmentation network and\ncombine them using a user-specified threshold. 
This combined invertibility map is shown on the right and\nused to determine the latent layer to be used for inverting each segment in the image.\n\n\n### Qualitative Results\nBelow we show image inversion and editing results obtained using the proposed method.\n**Please see the project website for more results.**\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"resources/results_github.png\" style=\"width:95%;\"/\u003e\n\u003c/p\u003e\n\n\n## Getting Started\nClone this repo:\n```bash\ngit clone --recurse-submodules https://github.com/adobe-research/sam_inversion\ncd sam_inversion\n```\n### Environment Setup\nSee [environment.yml](environment.yml) for a full list of library dependencies.\nThe following commands can be used to install all the dependencies in a new conda environment.\n```bash\nconda env create -f environment.yml\nconda activate inversion\n```\n\n### Inversion\nAn example command for inverting a given target image is shown below. The `--image_category` flag should be one of {\"cars\", \"faces\", \"cats\"}. The `--sweep_thresholds` flag performs inversion for a range of different threshold values. 
See [src/sam_inv_optimization.py](src/sam_inv_optimization.py) for other optional flags.\n```bash\npython src/sam_inv_optimization.py \\\n    --image_category \"cars\" --image_path test_images/cars/b.png \\\n    --output_path \"output/cars/\" --sweep_thresholds --generate_edits\n```\n\n### Using a Custom Dataset\nTo perform SAM Inversion on a custom dataset, we need to train a corresponding invertibility network.\nFirst, perform a single-layer inversion with each candidate latent space for all images in the training set, as shown in the command below.\n```bash\nfor latent_name in \"W+\" \"F4\" \"F6\" \"F8\" \"F10\"; do\n    python src/single_latent_inv.py \\\n        --image_category \"cats\" --image_folder_path datasets/custom_images/train \\\n        --num_opt_steps 501 --output_path \"output/custom_ds/train/${latent_name}\" --target_H 256 --target_W 256 \\\n        --latent_name \"${latent_name}\"\ndone\n```\nNext, repeat the above for the validation and test splits.\nFinally, train the invertibility network as shown in the example command below.\n```bash\npython src/train_invertibility.py \\\n    --dataset_folder_train output/custom_ds/train \\\n    --dataset_folder_val output/custom_ds/val \\\n    --output_folder output/invertibility/custom_ds \\\n    --gpu-ids \"0\" --batch-size 16 --lr 0.0001\n```\n\n## Reference\nIf you find this code useful for your research, please cite our [paper](https://arxiv.org/abs/2206.08357).\n```\n@inproceedings{\nparmar2022sam,\ntitle={Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing},\nauthor={Gaurav Parmar and Yijun Li and Jingwan Lu and Richard Zhang and Jun-Yan Zhu and Krishna Kumar Singh},\nbooktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\nyear={2022}\n}\n```\n\n## Related Projects\nPlease check out our past GAN inversion projects:\u003cbr\u003e\n[iGAN](https://github.com/junyanz/iGAN) (ECCV 2016), [GANPaint](https://ganpaint.io/) (SIGGRAPH 2019), [GANSeeing](https://github.com/davidbau/ganseeing) 
(ICCV 2019), [pix2latent](https://github.com/minyoungg/pix2latent) (ECCV 2020)\n\n## Acknowledgment\nOur work builds partly on the following repos:\n - [e4e](https://github.com/omertov/encoder4editing) - Encoder used for the `W+` inversions.\n - [StyleGAN](https://github.com/NVlabs/stylegan3) - The generative model used for the inversion.\n - [Deeplab3-xception](https://github.com/jfzhang95/pytorch-deeplab-xception) - Used for the base architecture of the invertibility prediction network.\n - [HRNet](https://github.com/CSAILVision/semantic-segmentation-pytorch), [Detectron](https://github.com/facebookresearch/detectron2) - Used for segmenting images (except faces).\n - [Face Parsing](https://github.com/zllrunning/face-parsing.PyTorch) - Used for segmenting face images.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadobe-research%2Fsam_inversion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadobe-research%2Fsam_inversion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadobe-research%2Fsam_inversion/lists"}