https://github.com/martinduartemore/awesome-text-based-image-manipulation

A curated list of text-based image manipulation methods.
Host: GitHub
URL: https://github.com/martinduartemore/awesome-text-based-image-manipulation
Owner: martinduartemore
License: cc0-1.0
Created: 2020-08-19T21:00:27.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2023-02-05T18:36:08.000Z (almost 3 years ago)
Last Synced: 2024-05-23T07:08:26.654Z (over 1 year ago)
Topics: awesome, awesome-list, image-editing, image-manipulation, language-based-image-editing, language-based-image-manipulation, text-based-image-editing, text-based-image-manipulation, text-driven-editing, text-driven-image-editing, text-driven-image-manipulation, text-driven-manipulation, text-guided-image-editing, text-guided-image-manipulation
Language: Python
Homepage:
Size: 94.7 KB
Stars: 70
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project

ultimate-awesome - awesome-text-based-image-manipulation - A curated list of text-based image manipulation methods. (Other Lists / TeX Lists)

# Awesome Text-based Image Manipulation [![Awesome](https://awesome.re/badge-flat.svg)](https://awesome.re)

A curated list of text-based image manipulation methods

[![GitHub - LICENSE](https://img.shields.io/github/license/martinduartemore/awesome-text-based-image-manipulation)](./LICENSE)

## Table of Contents

* [Datasets](#datasets)

* [Papers](#papers)

* [Contributing](#contributing)

## Datasets

|Name|Links|
|:---:|:---:|
|102 Category Flower|[Images](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html)<br>[Captions](https://github.com/reedscot/icml2016)|
|Caltech-UCSD Birds-200-2011|[Images](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)<br>[Captions](https://github.com/reedscot/icml2016)|
|CoDraw|[Data](https://github.com/facebookresearch/CoDraw)|
|DeepFashion|[Images](http://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/FashionSynthesis.html)<br>[Captions](http://mmlab.ie.cuhk.edu.hk/projects/FashionGAN)|
|i-CLEVR|[Data](https://www.microsoft.com/en-us/research/project/generative-neural-visual-artist-geneva/)|
|Multi-Modal-CelebA-HQ|[Data](https://github.com/IIGROUP/Multi-Modal-CelebA-HQ-Dataset)|

## Papers

Note on the date column: if a paper was posted to a preprint server (such as arXiv) before being accepted at a conference, the date shown is that of its first preprint release.

|Date|Title|Venue|Citations|Paper|Code|
|:---:|:---:|:---:|:---:|:---:|:---:|
|2023/01|Muse: Text-To-Image Generation via Masked Generative Transformers|-|N/A|[arXiv](https://arxiv.org/abs/2301.00704)<br>[Project Page](https://muse-model.github.io/)|[PyTorch](https://github.com/lucidrains/muse-maskgit-pytorch)<br>![](https://img.shields.io/github/stars/lucidrains/muse-maskgit-pytorch.svg?style=social)|
|2023/01|FICE: Text-Conditioned Fashion Image Editing With Guided GAN Inversion|-|N/A|[arXiv](https://arxiv.org/abs/2301.02110)|-|
|2023/01|SEGA: Instructing Diffusion using Semantic Dimensions|-|N/A|[arXiv](https://arxiv.org/abs/2301.12247)|-|
|2022/12|CLIPVG: Text-Guided Image Manipulation Using Differentiable Vector Graphics|AAAI|N/A|[arXiv](https://arxiv.org/abs/2212.02122)|[PyTorch (Official)](https://github.com/NetEase-GameAI/clipvg)<br>![](https://img.shields.io/github/stars/NetEase-GameAI/clipvg.svg?style=social)|
|2022/12|SINE: SINgle Image Editing with Text-to-Image Diffusion Models|-|N/A|[arXiv](https://arxiv.org/abs/2212.04489)|[PyTorch (Official)](https://github.com/zhang-zx/SINE)<br>![](https://img.shields.io/github/stars/zhang-zx/SINE.svg?style=social)|
|2022/12|The Stable Artist: Steering Semantics in Diffusion Latent Space|-|N/A|[arXiv](https://arxiv.org/abs/2212.06013)|-|
|2022/11|Null-text Inversion for Editing Real Images using Guided Diffusion Models|-|N/A|[arXiv](https://arxiv.org/abs/2211.09794)<br>[Official Page](https://null-text-inversion.github.io/)|[PyTorch (Official)](https://github.com/google/prompt-to-prompt/#null-text-inversion-for-editing-real-images)<br>![](https://img.shields.io/github/stars/google/prompt-to-prompt.svg?style=social)|
|2022/11|InstructPix2Pix: Learning to Follow Image Editing Instructions|-|N/A|[arXiv](https://arxiv.org/abs/2211.09800)<br>[Official Page](https://www.timothybrooks.com/instruct-pix2pix)|[PyTorch (Official)](https://github.com/timothybrooks/instruct-pix2pix)<br>![](https://img.shields.io/github/stars/timothybrooks/instruct-pix2pix.svg?style=social)|
|2022/11|DiffStyler: Controllable Dual Diffusion for Text-Driven Image Stylization|-|N/A|[arXiv](https://arxiv.org/abs/2211.10682)|-|
|2022/11|Interactive Image Manipulation with Complex Text Instructions|WACV|N/A|[arXiv](https://arxiv.org/abs/2211.15352)|-|
|2022/11|Target-Free Text-guided Image Manipulation|AAAI|N/A|[arXiv](https://arxiv.org/abs/2211.14544)|-|
|2022/11|Direct Inversion: Optimization-Free Text-Driven Real Image Editing with Diffusion Models|-|N/A|[arXiv](https://arxiv.org/abs/2211.07825)|-|
|2022/11|Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation|-|N/A|[arXiv](https://arxiv.org/abs/2211.12572)<br>[Project Page](https://pnp-diffusion.github.io/)|-|
|2022/10|ManiCLIP: Multi-Attribute Face Manipulation from Text|-|N/A|[arXiv](https://arxiv.org/abs/2210.00445)|-|
|2022/10|Towards Open-Ended Text-to-Face Generation, Combination and Manipulation|ACMMM|N/A|[ACM](https://dl.acm.org/doi/abs/10.1145/3503161.3547758)|-|
|2022/10|CLIP-PAE: Projection-Augmentation Embedding to Extract Relevant Features for a Disentangled, Interpretable, and Controllable Text-Guided Image Manipulation|-|N/A|[arXiv](https://arxiv.org/abs/2210.03919)|-|
|2022/10|One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulation|NeurIPS|N/A|[arXiv](https://arxiv.org/abs/2210.07883)|[- (Official)](https://github.com/KumapowerLIU/FFCLIP)<br>![](https://img.shields.io/github/stars/KumapowerLIU/FFCLIP.svg?style=social)|
|2022/10|Imagic: Text-Based Real Image Editing with Diffusion Models|-|N/A|[arXiv](https://arxiv.org/abs/2210.09276)<br>[Official Page](https://imagic-editing.github.io/)|-|
|2022/10|Assessment of Image Manipulation Using Natural Language Description: Quantification of Manipulation Direction|ICIP|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9897900)|-|
|2022/10|LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models|BMVC|N/A|[arXiv](https://arxiv.org/abs/2210.02249)|-|
|2022/10|UniTune: Text-Driven Image Editing by Fine Tuning an Image Generation Model on a Single Image|-|N/A|[arXiv](https://arxiv.org/abs/2210.09477)|[PyTorch](https://github.com/xuduo35/UniTune)<br>![](https://img.shields.io/github/stars/xuduo35/UniTune.svg?style=social)|
|2022/10|DiffEdit: Diffusion-based semantic image editing with mask guidance|-|N/A|[arXiv](https://arxiv.org/abs/2210.11427)|-|
|2022/10|Bridging CLIP and StyleGAN through Latent Alignment for Image Editing|-|N/A|[arXiv](https://arxiv.org/abs/2210.04506)|-|
|2022/09|Language-based Image Manipulation Built on Language-Guided Ranking|IEEE Transactions on Multimedia|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9893379)|-|
|2022/09|StyleGAN-based CLIP-guided Image Shape Manipulation|CBMI|N/A|[ACM](https://dl.acm.org/doi/abs/10.1145/3549555.3549556)|-|
|2022/08|Prompt-to-Prompt Image Editing with Cross Attention Control|-|N/A|[arXiv](https://arxiv.org/abs/2208.01626)<br>[Official Page](https://prompt-to-prompt.github.io/)|[PyTorch (Official)](https://github.com/google/prompt-to-prompt/)<br>![](https://img.shields.io/github/stars/google/prompt-to-prompt.svg?style=social)|
|2022/08|DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation|CVPR (2023)|N/A|[arXiv](https://arxiv.org/abs/2208.12242)<br>[Project Page](https://dreambooth.github.io/)|[Dataset (Official)](https://github.com/google/dreambooth)<br>![](https://img.shields.io/github/stars/google/dreambooth.svg?style=social)<br>[PyTorch](https://github.com/XavierXiao/Dreambooth-Stable-Diffusion)<br>![](https://img.shields.io/github/stars/XavierXiao/Dreambooth-Stable-Diffusion.svg?style=social)|
|2022/07|Towards Counterfactual Image Manipulation via CLIP|ACMMM|N/A|[arXiv](https://arxiv.org/abs/2207.02812)|[PyTorch (Official)](https://github.com/yingchen001/CF-CLIP)<br>![](https://img.shields.io/github/stars/yingchen001/CF-CLIP.svg?style=social)|
|2022/07|Cross-modal Representation Learning and Relation Reasoning for Bidirectional Adaptive Manipulation|IJCAI|N/A|[IJCAI](https://www.ijcai.org/proceedings/2022/447)|-|
|2022/06|DE-Net: Dynamic Text-guided Image Editing Adversarial Networks|AAAI|N/A|[arXiv](https://arxiv.org/abs/2206.01160)|[PyTorch (Official)](https://github.com/tobran/DE-Net)<br>![](https://img.shields.io/github/stars/tobran/DE-Net.svg?style=social)|
|2022/05|Generative Adversarial Network Including Referring Image Segmentation For Text-Guided Image Manipulation|ICASSP|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9746970)|-|
|2022/04|Paired-D++ GAN for image manipulation with text|Machine Vision and Applications|N/A|[Springer](https://link.springer.com/article/10.1007/s00138-022-01298-7)|-|
|2022/04|ManiTrans: Entity-Level Text-Guided Image Manipulation via Token-wise Semantic Alignment and Generation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2204.04428)|[- (Official)](https://jawang19.github.io/manitrans/)|
|2022/04|VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance|ECCV|N/A|[arXiv](https://arxiv.org/abs/2204.08583)|[PyTorch (Official)](https://github.com/EleutherAI/vqgan-clip)<br>![](https://img.shields.io/github/stars/EleutherAI/vqgan-clip.svg?style=social)|
|2022/04|Referring Object Manipulation of Natural Images with Conditional Classifier-free Guidance|ECCV|N/A|[ECVA](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136960619.pdf)|[PyTorch (Official)](https://github.com/google/referring-manipulation)<br>![](https://img.shields.io/github/stars/google/referring-manipulation.svg?style=social)|
|2022/04|Text2LIVE: Text-Driven Layered Image and Video Editing|ECCV|N/A|[arXiv](https://arxiv.org/abs/2204.02491)|[PyTorch (Official)](https://github.com/omerbt/Text2LIVE)<br>![](https://img.shields.io/github/stars/omerbt/Text2LIVE.svg?style=social)|
|2022/03|FlexIT: Towards Flexible Semantic Image Translation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2203.04705)|[PyTorch (Official)](https://github.com/facebookresearch/SemanticImageTranslation)<br>![](https://img.shields.io/github/stars/facebookresearch/SemanticImageTranslation.svg?style=social)|
|2022/03|EnvEdit: Environment Editing for Vision-and-Language Navigation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2203.15685)|[PyTorch (Official)](https://github.com/jialuli-luka/EnvEdit)<br>![](https://img.shields.io/github/stars/jialuli-luka/EnvEdit.svg?style=social)|
|2022/03|AnyFace: Free-style Text-to-Face Synthesis and Manipulation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2203.15334)|-|
|2022/02|FEAT: Face Editing with Attention|-|N/A|[arXiv](https://arxiv.org/abs/2202.02713)|[PyTorch](https://github.com/Psarpei/GanVinci)<br>![](https://img.shields.io/github/stars/Psarpei/GanVinci.svg?style=social)|
|2022/02|Learning by Imagination: A Joint Framework for Text-based Image Manipulation and Change Captioning|IEEE Transactions on Multimedia|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9720958)|-|
|2022/02|Name Your Style: An Arbitrary Artist-aware Image Style Transfer|CVPR (2023, Workshop)|N/A|[arXiv](https://arxiv.org/abs/2202.13562)<br>[CvF](https://openaccess.thecvf.com/content/CVPR2023W/CVFAD/papers/Liu_Name_Your_Style_Text-Guided_Artistic_Style_Transfer_CVPRW_2023_paper.pdf)|[PyTorch (Official)](https://github.com/Holmes-Alan/TxST)<br>![](https://img.shields.io/github/stars/Holmes-Alan/TxST.svg?style=social)|
|2022/02|Interactive Image Generation with Natural-Language Feedback|AAAI|N/A|[AAAI](https://aaai-2022.virtualchair.net/poster_aaai7081)|-|
|2021/12|CLIPstyler: Image Style Transfer with a Single Text Condition|CVPR|N/A|[arXiv](https://arxiv.org/abs/2112.00374)|[PyTorch (Official)](https://github.com/cyclomon/CLIPstyler)<br>![](https://img.shields.io/github/stars/cyclomon/CLIPstyler.svg?style=social)|
|2021/12|Embedding Arithmetic for Text-driven Image Transformation|CVPR (2022, O-DRUM Workshop)|N/A|[arXiv](https://arxiv.org/abs/2112.03162)<br>[CvF](https://openaccess.thecvf.com/content/CVPR2022W/ODRUM/papers/Couairon_Embedding_Arithmetic_of_Multimodal_Queries_for_Image_Retrieval_CVPRW_2022_paper.pdf)|[PyTorch (Official)](https://github.com/facebookresearch/SIMAT)<br>![](https://img.shields.io/github/stars/facebookresearch/SIMAT.svg?style=social)|
|2021/12|CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields|CVPR|N/A|[arXiv](https://arxiv.org/abs/2112.05139)|[PyTorch (Official)](https://github.com/cassiePython/CLIPNeRF)<br>![](https://img.shields.io/github/stars/cassiePython/CLIPNeRF.svg?style=social)|
|2021/12|HairCLIP: Design Your Hair by Text and Reference Image|CVPR|N/A|[arXiv](https://arxiv.org/abs/2112.05142)|[PyTorch (Official)](https://github.com/wty-ustc/HairCLIP)<br>![](https://img.shields.io/github/stars/wty-ustc/HairCLIP.svg?style=social)|
|2021/12|More Control for Free! Image Synthesis with Semantic Diffusion Guidance|WACV (2023)|N/A|[arXiv](https://arxiv.org/abs/2112.05744)|[PyTorch (Official)](https://github.com/xh-liu/SDG_code/)<br>![](https://img.shields.io/github/stars/xh-liu/SDG_code.svg?style=social)|
|2021/12|StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation|WACV|N/A|[arXiv](https://arxiv.org/abs/2112.08493)|[PyTorch (Official)](https://github.com/catlab-team/stylemc)<br>![](https://img.shields.io/github/stars/catlab-team/stylemc.svg?style=social)|
|2021/12|GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models|ICML (2022)|N/A|[arXiv](https://arxiv.org/abs/2112.10741)<br>[MLR](https://proceedings.mlr.press/v162/nichol22a/nichol22a.pdf)|[PyTorch (Official)](https://github.com/openai/glide-text2im)<br>![](https://img.shields.io/github/stars/openai/glide-text2im.svg?style=social)|
|2021/12|Generative Adversarial Network for Text-to-Face Synthesis and Manipulation with Pretrained BERT Model|IEEE FG|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9666791)|-|
|2021/11|LatteGAN: Visually Guided Language Attention for Multi-Turn Text-Conditioned Image Manipulation|IEEE Access|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9620071)|[PyTorch (Official)](https://github.com/smatsumori/LatteGAN)<br>![](https://img.shields.io/github/stars/smatsumori/LatteGAN.svg?style=social)|
|2021/11|Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model|CVPR (2022)|N/A|[arXiv](https://arxiv.org/abs/2111.13333)|[PaddlePaddle (Official)](https://github.com/zipengxuc/ppe)<br>![](https://img.shields.io/github/stars/zipengxuc/ppe.svg?style=social)<br>[PyTorch (Official)](https://github.com/zipengxuc/PPE-Pytorch)<br>![](https://img.shields.io/github/stars/zipengxuc/PPE-Pytorch.svg?style=social)|
|2021/11|Blended Diffusion for Text-driven Editing of Natural Images|CVPR (2022)|N/A|[arXiv](https://arxiv.org/abs/2111.14818)<br>[CvF](https://openaccess.thecvf.com/content/CVPR2022/papers/Avrahami_Blended_Diffusion_for_Text-Driven_Editing_of_Natural_Images_CVPR_2022_paper.pdf)|[PyTorch (Official)](https://github.com/omriav/blended-diffusion)<br>![](https://img.shields.io/github/stars/omriav/blended-diffusion.svg?style=social)|
|2021/11|SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing|CVPR|N/A|[arXiv](https://arxiv.org/abs/2112.00180)|-|
|2021/10|DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2110.02711)|[PyTorch (Official)](https://github.com/gwang-kim/DiffusionCLIP)<br>![](https://img.shields.io/github/stars/gwang-kim/DiffusionCLIP.svg?style=social)|
|2021/10|Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism|ICCV|N/A|[CvF](https://openaccess.thecvf.com/content/ICCV2021/papers/Jiang_Language-Guided_Global_Image_Editing_via_Cross-Modal_Cyclic_Mechanism_ICCV_2021_paper.pdf)|[PyTorch (Official)](https://github.com/wtjiang98/Language_Guided_Image_Editing)<br>![](https://img.shields.io/github/stars/wtjiang98/Language_Guided_Image_Editing.svg?style=social)|
|2021/10|Each Attribute Matters: Contrastive Attention for Sentence-based Image Editing|BMVC|N/A|[arXiv](https://arxiv.org/abs/2110.11159)|[(Official)](https://github.com/Zlq2021/CA-GAN)<br>![](https://img.shields.io/github/stars/Zlq2021/CA-GAN.svg?style=social)|
|2021/09|Segmentation-Aware Text-Guided Image Manipulation|ICIP|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9506601)|-|
|2021/09|Talk-to-Edit: Fine-Grained Facial Editing via Dialog|ICCV|N/A|[arXiv](https://arxiv.org/abs/2109.04425)|[PyTorch (Official)](https://github.com/yumingj/Talk-to-Edit)<br>![](https://img.shields.io/github/stars/yumingj/Talk-to-Edit.svg?style=social)|
|2021/06|Text-Guided Human Image Manipulation via Image-Text Shared Space|IEEE TPAMI|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9444850)|-|
|2021/06|Grounded, Controllable and Debiased Image Completion with Lexical Semantics|CVPR (Causality in Vision Workshop)|N/A|[CvF](https://openaccess.thecvf.com/content/CVPR2021W/CiV/papers/Zhang_Grounded_Controllable_and_Debiased_Image_Completion_With_Lexical_Semantics_CVPRW_2021_paper.pdf)|-|
|2021/06|Learning by Planning: Language-Guided Global Image Editing|CVPR|N/A|[arXiv](https://arxiv.org/abs/2106.13156)|[PyTorch (Official)](https://github.com/jshi31/T2ONet)<br>![](https://img.shields.io/github/stars/jshi31/T2ONet.svg?style=social)|
|2021/03|StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery|ICCV|N/A|[arXiv](https://arxiv.org/abs/2103.17249)|[PyTorch (Official)](https://github.com/orpatashnik/StyleCLIP)<br>![](https://img.shields.io/github/stars/orpatashnik/StyleCLIP.svg?style=social)|
|2021/03|Text-Guided Style Transfer-Based Image Manipulation Using Multimodal Generative Models|IEEE TPAMI|N/A|[IEEE](https://ieeexplore.ieee.org/abstract/document/9389731)|-|
|2021/02|Zero-Shot Text-to-Image Generation|ICML|N/A|[arXiv](https://arxiv.org/abs/2102.12092)<br>[MLR](https://proceedings.mlr.press/v139/ramesh21a.html)|[PyTorch (Official)](https://github.com/openai/DALL-E)<br>![](https://img.shields.io/github/stars/openai/DALL-E.svg?style=social)|
|2020/12|TediGAN: Text-Guided Diverse Face Image Generation and Manipulation|CVPR|N/A|[arXiv](https://arxiv.org/abs/2012.03308)|[PyTorch (Official)](https://github.com/IIGROUP/TediGAN)<br>![](https://img.shields.io/github/stars/IIGROUP/TediGAN.svg?style=social)|
|2020/10|Learning Cross-Modal Representations for Language-Based Image Manipulation|ICIP|N/A|[ResearchGate](https://www.researchgate.net/publication/341984406)|-|
|2020/10|Text-Guided Image Inpainting|ACMMM|N/A|[ACM](https://dl.acm.org/doi/abs/10.1145/3394171.3413939)|-|
|2020/10|Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation|NeurIPS|N/A|[arXiv](https://arxiv.org/abs/2010.12136)|[(Official)](https://github.com/mrlibw/Lightweight-Manipulation)<br>![](https://img.shields.io/github/stars/mrlibw/Lightweight-Manipulation.svg?style=social)|
|2020/10|A Benchmark and Baseline for Language-Driven Image Editing|ACCV|N/A|[arXiv](https://arxiv.org/abs/2010.02330)|[- (Official)](https://github.com/jshi31/LDIE_ACCV)<br>![](https://img.shields.io/github/stars/jshi31/LDIE_ACCV.svg?style=social)|
|2020/09|SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning|EMNLP|N/A|[arXiv](https://arxiv.org/abs/2009.09566)|[PyTorch (Official)](https://github.com/tsujuifu/pytorch_sscr)<br>![](https://img.shields.io/github/stars/tsujuifu/pytorch_sscr.svg?style=social)|
|2020/08|Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach|ACMMM|N/A|[arXiv](https://arxiv.org/abs/2008.04200)|[PyTorch (Official)](https://github.com/yhlleo/DWC-GAN)<br>![](https://img.shields.io/github/stars/yhlleo/DWC-GAN.svg?style=social)|
|2020/08|Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions|ECCV|N/A|[arXiv](https://arxiv.org/abs/2008.01576)|[PyTorch (Official)](https://github.com/xh-liu/Open-Edit)<br>![](https://img.shields.io/github/stars/xh-liu/Open-Edit.svg?style=social)|
|2020/08|Text as Neural Operator: Image Manipulation by Text Instruction|ACMMM (2021)|N/A|[arXiv](https://arxiv.org/abs/2008.04556)<br>[ACM](https://dl.acm.org/doi/10.1145/3474085.3475343)|[PyTorch (Official)](https://github.com/google/tim-gan)<br>![](https://img.shields.io/github/stars/google/tim-gan.svg?style=social)|
|2020/08|IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning|ACMMM|N/A|[ACM](https://dl.acm.org/doi/10.1145/3394171.3413777)|[PyTorch (Official)](https://github.com/Victarry/IR-GAN-Code)<br>![](https://img.shields.io/github/stars/Victarry/IR-GAN-Code.svg?style=social)|
|2020/06|Customizable GAN: A Method for Image Synthesis of Human Controllable|IEEE Access|N/A|[IEEE](https://ieeexplore.ieee.org/document/9112217)|-|
|2020/05|Scones: Towards Conversational Authoring of Sketches|IUI|N/A|[arXiv](https://arxiv.org/abs/2005.07781)|[TensorFlow (Official)](https://github.com/CannyLab/scones)<br>![](https://img.shields.io/github/stars/CannyLab/scones.svg?style=social)|
|2020/04|Text-Guided Neural Image Inpainting|ACMMM|N/A|[arXiv](https://arxiv.org/abs/2004.03212)|[PyTorch (Official)](https://github.com/idealwhite/TDANet)<br>![](https://img.shields.io/github/stars/idealwhite/TDANet.svg?style=social)|
|2020/02|FACT: Fused Attention for Clothing Transfer with Generative Adversarial Networks|AAAI|N/A|[AAAI](https://aaai.org/ojs/index.php/AAAI/article/view/6987/6841)|-|
|2020/02|Grounded and Controllable Image Completion by Incorporating Lexical Semantics|-|N/A|[arXiv](https://arxiv.org/abs/2003.00303)|-|
|2020/02|Image-to-Image Translation with Text Guidance|-|N/A|[arXiv](https://arxiv.org/abs/2002.05235)|-|
|2020/01|Progressive Semantic Image Synthesis via Generative Adversarial Network|VCIP|N/A|[IEEE](https://ieeexplore.ieee.org/document/8966069)|-|
|2019/12|Image Manipulation with Natural Language using Two-sided Attentive Conditional Generative Adversarial Network|Neural Networks (2021)|N/A|[arXiv](https://arxiv.org/abs/1912.07478)|-|
|2019/12|ManiGAN: Text-Guided Image Manipulation|CVPR|N/A|[arXiv](https://arxiv.org/abs/1912.06203)|[PyTorch (Official)](https://github.com/mrlibw/ManiGAN)<br>![](https://img.shields.io/github/stars/mrlibw/ManiGAN.svg?style=social)|
|2019/12|Controlling Style and Semantics in Weakly-Supervised Image Generation|ECCV|N/A|[arXiv](https://arxiv.org/abs/1912.03161)|[PyTorch (Official)](https://github.com/dariopavllo/style-semantics)<br>![](https://img.shields.io/github/stars/dariopavllo/style-semantics.svg?style=social)|
|2019/09|Controllable Text-to-Image Generation|NeurIPS|N/A|[arXiv](https://arxiv.org/abs/1909.07083)|[PyTorch (Official)](https://github.com/mrlibw/ControlGAN)<br>![](https://img.shields.io/github/stars/mrlibw/ControlGAN.svg?style=social)<br>[TensorFlow](https://github.com/taki0112/ControlGAN-Tensorflow)<br>![](https://img.shields.io/github/stars/taki0112/ControlGAN-Tensorflow.svg?style=social)|
|2019/09|Multi-mapping Image-to-Image Translation via Learning Disentanglement|NeurIPS|N/A|[arXiv](https://arxiv.org/abs/1909.07877)|[PyTorch (Official)](https://github.com/Xiaoming-Yu/DMIT)<br>![](https://img.shields.io/github/stars/Xiaoming-Yu/DMIT.svg?style=social)|
|2019/08|SIMGAN: Photo-Realistic Semantic Image Manipulation Using Generative Adversarial Networks|ICIP|N/A|[IEEE](https://ieeexplore.ieee.org/document/8804285)<br>[Author](https://zsdonghao.github.io/paper/2019icip_simgan.pdf)|-|
|2019/05|Eevee: Transforming Images by Bridging High-level Goals and Low-level Edit Operations|CHI|N/A|[ACM](https://dl.acm.org/doi/10.1145/3290607.3312929)|-|
|2019/04|Text Guided Person Image Synthesis|CVPR|N/A|[arXiv](https://arxiv.org/abs/1904.05118)|-|
|2019/03|Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks|ICASSP|N/A|[arXiv](https://arxiv.org/abs/1903.07499)|[PyTorch (Official)](https://github.com/vtddggg/BilinearGAN_for_LBIE)<br>![](https://img.shields.io/github/stars/vtddggg/BilinearGAN_for_LBIE.svg?style=social)|
|2019/03|Language-based Colorization of Scene Sketches|SIGGRAPH Asia|N/A|[ACM](https://dl.acm.org/doi/10.1145/3355089.3356561)<br>[Author](http://mo-haoran.com/files/SIGA19/SketchColorization_paper_SA2019.pdf)|[TensorFlow (Official)](https://github.com/SketchyScene/SketchySceneColorization)<br>![](https://img.shields.io/github/stars/SketchyScene/SketchySceneColorization.svg?style=social)|
|2018/12|Paired-D GAN for Semantic Image Synthesis|ACCV|N/A|[Springer](https://link.springer.com/chapter/10.1007/978-3-030-20870-7_29)<br>[Author](http://www.dgcv.nii.ac.jp/Publications/Papers/2018/ACCV2018a.pdf)|[PyTorch (Official)](https://github.com/vominhduc/Paied-D-GAN)<br>![](https://img.shields.io/github/stars/vominhduc/Paied-D-GAN.svg?style=social)|
|2018/12|Sequential Attention GAN for Interactive Image Editing via Dialogue|ACMMM|N/A|[arXiv](https://arxiv.org/abs/1812.08352)|-|
|2018/11|Keep Drawing It: Iterative Language-based Image Generation and Editing|NeurIPS (ViGIL Workshop)|N/A|[arXiv](https://arxiv.org/abs/1811.09845v1)|-|
|2018/11|Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction|ICCV|N/A|[arXiv](https://arxiv.org/abs/1811.09845)|[PyTorch (Official)](https://github.com/Maluuba/GeNeVA)<br>![](https://img.shields.io/github/stars/Maluuba/GeNeVA.svg?style=social)|
|2018/10|Cross-Modal Style Transfer|ICIP|N/A|[IEEE](https://ieeexplore.ieee.org/document/8451734)|[PyTorch (Official)](https://github.com/SahilC/Cross-Modal-Style)<br>![](https://img.shields.io/github/stars/SahilC/Cross-Modal-Style.svg?style=social)|
|2018/10|Learning to Globally Edit Images with Textual Description|-|N/A|[arXiv](https://arxiv.org/abs/1810.05786)|[PyTorch (Official)](https://github.com/sohuren/Img_edit_with_text)<br>![](https://img.shields.io/github/stars/sohuren/Img_edit_with_text.svg?style=social)|
|2018/10|Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language|NeurIPS|N/A|[arXiv](https://arxiv.org/abs/1810.11919)|[PyTorch (Official)](https://github.com/woozzu/tagan)<br>![](https://img.shields.io/github/stars/woozzu/tagan.svg?style=social)|
|2018/08|Language Guided Fashion Image Manipulation with Feature-wise Transformations|ECCV (Workshop)|N/A|[arXiv](https://arxiv.org/abs/1808.04000)|-|
|2018/08|LUCSS: Language-based User-customized Colourization of Scene Sketches|-|N/A|[arXiv](https://arxiv.org/abs/1808.10544)|-|
|2018/07|Semantic Image Synthesis via Conditional Cycle-Generative Adversarial Networks|ICPR|N/A|[IEEE](https://ieeexplore.ieee.org/document/8545383)<br>[ResearchGate](https://www.researchgate.net/publication/329317460)|-|
|2018/07|Semantics Images Synthesis and Resolution Refinement Using Generative Adversarial Networks|CSPS|N/A|[Springer](https://link.springer.com/chapter/10.1007/978-981-13-6504-1_74)|-|
|2018/05|MC-GAN: Multi-conditional Generative Adversarial Network for Image Synthesis|BMVC|N/A|[arXiv](https://arxiv.org/abs/1805.01123)|[PyTorch (Official)](https://github.com/HYOJINPARK/MC_GAN)<br>![](https://img.shields.io/github/stars/HYOJINPARK/MC_GAN.svg?style=social)|
|2018/04|Coloring with Words: Guiding Image Colorization Through Text-based Palette Generation|ECCV|N/A|[arXiv](https://arxiv.org/abs/1804.04128)|[PyTorch (Official)](https://github.com/awesome-davian/Text2Colors)<br>![](https://img.shields.io/github/stars/awesome-davian/Text2Colors.svg?style=social)|
|2018/04|Learning to Color from Language|NAACL|N/A|[arXiv](https://arxiv.org/abs/1804.06026)|[PyTorch (Official)](https://github.com/superhans/colorfromlanguage)<br>![](https://img.shields.io/github/stars/superhans/colorfromlanguage.svg?style=social)|
|2017/12|CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication|ACL|N/A|[arXiv](https://arxiv.org/abs/1712.05558)|[PyTorch (Official)](https://github.com/facebookresearch/codraw-models)<br>![](https://img.shields.io/github/stars/facebookresearch/codraw-models.svg?style=social)|
|2017/12|Interactive Image Manipulation with Natural Language Instruction Commands|NeurIPS (ViGIL Workshop)|N/A|[arXiv](https://arxiv.org/abs/1802.08645)|-|
|2017/11|Language-Based Image Editing with Recurrent Attentive Models|CVPR|N/A|[arXiv](https://arxiv.org/abs/1711.06288)|[TensorFlow (Official)](https://github.com/Jianbo-Lab/LBIE)<br>![](https://img.shields.io/github/stars/Jianbo-Lab/LBIE.svg?style=social)|
|2017/10|Be Your Own Prada: Fashion Synthesis with Structural Coherence|ICCV|N/A|[arXiv](https://arxiv.org/abs/1710.07346)|[Torch (Official)](https://github.com/zhusz/ICCV17-fashionGAN)<br>![](https://img.shields.io/github/stars/zhusz/ICCV17-fashionGAN.svg?style=social)|
|2017/07|Semantic Image Synthesis via Adversarial Learning|ICCV|N/A|[arXiv](https://arxiv.org/abs/1707.06873)|[PyTorch](https://github.com/woozzu/dong_iccv_2017)<br>![](https://img.shields.io/github/stars/woozzu/dong_iccv_2017.svg?style=social)|
|2016/05|Generative Adversarial Text to Image Synthesis|ICML|N/A|[arXiv](https://arxiv.org/abs/1605.05396)|[Torch (Official)](https://github.com/reedscot/icml2016)<br>![](https://img.shields.io/github/stars/reedscot/icml2016.svg?style=social)|

## Contributing

Feel free to send me [pull requests](https://github.com/martinduartemore/awesome-text-based-image-manipulation) to add resources.