Awesome-Text-to-Image
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
https://github.com/Yutong-Zhou-cv/Awesome-Text-to-Image
- 23/05/26 - Add the [[CVPRW 2023🎈] Best Collection](https://github.com/Yutong-Zhou-cv/Awesome-Text-to-Image/blob/main/%5BCVPRW%202023%F0%9F%8E%88%5D%20%20Best%20Collection.md) list!
- 💬 3D
- [Paper - cvpr2024.github.io/)]
- [Paper - Zhangjl/E3-FaceNet)]
- [Paper - Oboyob)]
- 💬 3D
- 💬 3D - nju/describe3d)] [[Project](https://mhwu2017.github.io/)]
- [Paper - Diffusion)] [[Project](https://ziqihuangg.github.io/projects/collaborative-diffusion.html)]
- [Paper - Brother-Pikachu/Where2edit)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper - Ic7LeFlP/view)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper - UR/StyleT2I)]
- [Paper
- [Paper
- [Paper
- [Paper - Modal-CelebA-HQ-Dataset)] [[Colab](https://colab.research.google.com/github/weihaox/TediGAN/blob/main/playground.ipynb)] [[Video](https://www.youtube.com/watch?v=L8Na2f5viAM)]
- [Paper
- [Paper - sjx/SEA-T2F)]
- [Paper
- [Paper
- [Paper
- 💬 Unauthorized Data
- 💬 Open-set Bias Detection
- 💬 Spatial Consistency - t2i.github.io/)] [[Code](https://github.com/SPRIGHT-T2I/SPRIGHT)] [[Dataset](https://huggingface.co/datasets/SPRIGHT-T2I/spright)]
- 💬 Safety - agnostic-governance)]
- 💬 Aesthetic - v2-5/)] [[HuggingFace](https://huggingface.co/playgroundai/playground-v2.5-1024px-aesthetic)]
- 💬 Text Visualness - visualness/)]
- 💬 Against Malicious Adaptation
- 💬 Principled Recaptioning
- 💬 Holistic Evaluation - crfm/helm)] [[Project](https://crfm.stanford.edu/heim/v1.1.0/)]
- 💬 Safety - the-Artist)]
- 💬 Natural Attack Capability
- 💬 Bias
- 💬 Demographic Stereotypes
- 💬 Robustness
- 💬 Adversarial Robustness Analysis
- 💬 Textual Inversion - research/DVAR)]
- 💬 Interpretable Intervention
- 💬 Ethical Image Manipulation
- 💬 Creativity Transfer
- 💬 Ambiguity
- 💬 Racial Politics
- 💬 Privacy Analysis
- 💬 Authenticity Evaluation for Fake Images
- 💬 Cultural Bias
- [Paper - t2i.github.io/Ranni/)] [[Code](https://github.com/ali-vilab/Ranni)]
- [Paper
- [Paper
- [Paper - t2i.vercel.app/)] [[Code](https://github.com/eclipse-t2i/eclipse-inference)] [[Hugging Face](https://huggingface.co/spaces/ECLIPSE-Community/ECLIPSE-Kandinsky-v2.2)]
- [Paper - t2i.github.io/)] [[Code](https://github.com/jialuli-luka/SELMA)]
- [Paper - alpha.github.io/)] [[Code](https://github.com/PixArt-alpha/PixArt-alpha)] [[Hugging Face](https://huggingface.co/spaces/PixArt-alpha/PixArt-LCM)]
- [Paper
- [Paper
- [Paper - t2i.github.io/)]
- [Paper - huang.github.io/realcustom/)]
- [Paper - Lightning)] [[Demo](https://fastsdxl.ai/)]
- [Paper
- [Paper
- [Paper
- [Paper - dresscode)]
- [Paper - 2.github.io/)] [[Code](https://github.com/microsoft/i-Code/tree/main/CoDi-2)]
- [Paper
- [Paper - official)] [[Demo](https://replicate.com/moayedhajiali/elasticdiffusion)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper - GEN)] [[Project](https://czhang0528.github.io/iti-gen)]
- [Paper - Lai/Mini-DALLE3)] [[Demo](http://139.224.23.16:10085/)] [[Project](https://minidalle3.github.io/)]
- 💬Evaluation
- [Paper - forever/Kandinsky-2)] [[Demo](https://fusionbrain.ai/en/editor/)] [[Demo Video](https://www.youtube.com/watch?v=c7zHPc59cWU)] [[Hugging Face](https://huggingface.co/kandinsky-community)]
- [Paper
- [Paper
- [Paper - Compositional-Concepts-Discovery)] [[Project](https://energy-based-model.github.io/unsupervised-concept-discovery/)]
- [Paper
- [Paper - Huang/T2I-CompBench)] [[Project](https://karine-h.github.io/T2I-CompBench/)]
- 💬Human Preference Evaluation
- [Paper - XJTU/APTM)] [[Project](https://www.zdzheng.xyz/publication/Towards-2023)]
- [Paper - cinemagraph)] [[Project](https://text2cinemagraph.github.io/website/)]
- [Paper
- [Paper
- 💬Evaluation
- [Paper
- [Paper - Labs/Prompt-Free-Diffusion)] [[Hugging Face](https://huggingface.co/spaces/shi-labs/Prompt-Free-Diffusion)]
- [Paper - latent-diffusion)] [[Project](https://omriavrahami.com/blended-latent-diffusion-page/)]
- 💬Controllable
- [Paper - chosen-one/)]
- 💬Stable Diffusion with Brain
- [Paper
- 💬Evaluation
- [Paper - plus.github.io/)]
- [Paper
- [Paper - labs/tr0n)] [[Hugging Face](https://huggingface.co/spaces/Layer6/TR0N)]
- 💬3D
- [Paper (arXiv) - free-structured-diffusion-guidance)]
- [Paper
- [Paper
- 💬 Textual Inversion
- [Paper - explainer/)]
- 💬 Multi-language-to-Image - [[Code-AltDiffusion](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion-m18)] [[Code-AltCLIP](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP-m18)] [[Hugging Face](https://huggingface.co/BAAI/AltDiffusion-m18)]
- 💬 Seed selection
- 💬 Audio/Sound/Multi-language-to-Image
- 💬Faithfulness Evaluation - benchmark.github.io/)] [[Code](https://github.com/Yushi-Hu/tifa)]
- [Paper
- [Paper - ICST-MIPL/LFR-GAN_TOMM2023)]
- [Paper - text-to-image)] [[Project](https://rich-text-to-image.github.io/)] [[Demo](https://huggingface.co/spaces/songweig/rich-text-to-image/discussions)]
- 💬Human Preferences
- [Paper - I/)]
- [Paper
- 💬Human Evaluation
- [Paper - to-room/)] [[Code](https://github.com/lukasHoel/text2room)] [[Video](https://www.youtube.com/watch?v=fjRnFL91EZc)]
- [Paper - diffusion.github.io/)] [[Code](https://github.com/bahjat-kawar/time-diffusion)]
- [Paper - chatgpt)]
- [Paper
- 💬Stable Diffusion with Brain - with-brain/)] [[Code](https://github.com/yu-takagi/StableDiffusionReconstruction)]
- [Paper - Guided-Diffusion)]
- [Paper - and-Excite/)] [[Code](https://github.com/AttendAndExcite/Attend-and-Excite)]
- [Paper - and-bind)] [[Code](https://github.com/boschresearch/Divide-and-Bind)]
- [Paper
- [Paper - diffusion/)] [[Code](https://github.com/adobe-research/custom-diffusion)] [[Hugging Face](https://huggingface.co/spaces/nupurkmr9/custom-diffusion)]
- [Paper
- [Paper
- [Paper - model.github.io/)]
- [Paper
- 💬Optimizing Prompts
- 💬Optimizing Prompts
- 💬Aesthetic Image Generation
- [Paper
- [Paper
- [Paper - ov-file)]
- [Paper
- [Paper - gen.github.io/)] [[Code](https://github.com/microsoft/i-Code/tree/main/i-Code-V3)]
- [Paper - six-modalities-binding-ai/)] [[Code](https://github.com/facebookresearch/ImageBind)]
- [Paper
- [Paper
- [Paper - lisa/RDM-Region-Aware-Diffusion-Model)]
- [Paper
- [Paper - model.github.io/)]
- [Paper - Labs/Versatile-Diffusion)] [[Hugging Face](https://huggingface.co/spaces/shi-labs/Versatile-Diffusion)]
- [Paper
- [Paper - infinity.microsoft.com/#/)]
- [Paper
- [Paper
- [Paper - YeZhu/CDCD)]
- [Paper
- [Paper - research/MMVID)] [[Project](https://snap-research.github.io/MMVID/)]
- [Paper - diffusion)] [[Stable Diffusion Code](https://github.com/CompVis/stable-diffusion)]
- [Paper - sys/ofa)] [[Hugging Face](https://huggingface.co/OFA-Sys)]
- [Paper - GAN/)]
- [Paper
- [Paper
- [Paper - GAN/)]
- [Paper - Verse)]
- 💬Semantic Diffusion Guidance - liu.github.io/sdg/)]
- 💬Multi-Concept Composition
- 💬3D Hairstyle Generation
- 💬Image Super-Resolution
- 💬Image Editing
- 💬LLMs
- 💬Segmentation
- 💬Text Editing
- 💬Text Character Generation
- 💬Open-Vocabulary Panoptic Segmentation
- 💬Chinese Text Character Generation - draw.github.io/)]
- 💬Grounded Generation - Diffusion)] [[Project](https://lipurple.github.io/Grounded_Diffusion/)]
- 💬Semantic segmentation - ES)]
- 💬Unsupervised semantic segmentation
- 💬Text+Speech → Gesture - ao/HumanBehaviorAnimation)]
- 💬Text+Image+Shape → Image - guided-diffusion.github.io/)]
- [Paper
- [Paper - imagen.github.io/)]
- 💬NERF - shahbazi.github.io/inserf/)]
- [Paper
- 💬Video Editing - stick-edit.github.io/)] [[Code](https://github.com/mayuelala/MagicStick)]
- [Paper
- 💬Style Transfer
- [Paper - diffusion)]
- 💬Multi-Subject Generation
- 💬Video Editing - igN4)]
- [Paper - a-scene/)] [[Code](https://github.com/google/break-a-scene)]
- [Paper
- 💬3D Shape Editing
- 💬Colorization
- 💬Video Editing - zero-edit.github.io/)] [[Hugging Face](https://huggingface.co/spaces/chenyangqi/FateZero)]
- 💬3D
- [Paper
- [Paper
- [Paper
- [Paper
- 💬Reject Human Instructions - alpha.github.io/)] [[Code](https://github.com/matrix-alpha/Accountable-Textual-Visual-Chat)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- 💬Image Editing
- 💬Image Editing
- [Paper
- [Paper - guided-diffusion/shape-guided-diffusion)] [[Project](https://shape-guided-diffusion.github.io/)] [[Hugging Face](https://huggingface.co/spaces/shape-guided-diffusion/shape-guided-diffusion)]
- 💬Image Editing
- 💬Person Re-identification
- [Paper
- 💬3D
- 💬Image Editing
- [Paper
- 💬Image Editing
- [Paper - denoising-score.github.io/)]
- 💬Image Editing - lisa/RDM-Region-Aware-Diffusion-Model)]
- 💬Text+Video → Video
- [Paper
- 💬Fashion Image Editing
- [Paper
- [Paper - Net)]
- [Paper - diffusion.github.io/)]
- 💬Text+Image → Video
- 💬Image Stylization - lisa/Diffstyler)]
- [Paper - pix2pix](https://null-text-inversion.github.io/))]
- [Paper - pix2pix)]
- 💬Style Transfer
- [Paper
- [Paper
- [Paper
- 💬Iterative Language-based Image Manipulation
- 💬Digital Art Synthesis - lisa/MGAD-multimodal-guided-artwork-diffusion)]
- 💬HDR Panorama Generation
- [Paper - cvlab.github.io/LANIT/)] [[Code](https://github.com/KU-CVLAB/LANIT)]
- 💬3D Semantic Style Transfer - DISCOVER/LASST)]
- 💬Face Animation - guided-animation)]
- 💬Fashion Design
- 💬Image Colorization
- 💬Animating Human Meshes - Kim/CLIP-Actor)]
- 💬Pose Synthesis
- 💬Person Re-identification - H/LGUR)]
- [Paper - CLIP)]
- 💬Monocular Depth Estimation - galaxy/DepthCLIP)]
- 💬Image Style Transfer
- 💬Image Segmentation
- 💬Video Segmentation - zhao/2022cvpr-mmmmtbvs)]
- 💬Image Matting
- 💬Stylizing Video Objects - Driven-Stylization-of-Video-Objects/)]
- [Paper
- 💬Pose-Guided Person Generation
- 💬3D Avatar Generation
- 💬Image & Video Editing
- [Paper
- 💬Hairstyle Transfer - ustc/HairCLIP)]
- 💬NeRF
- [Paper
- [Paper
- [Paper - diffusion)] [[Project](https://omriavrahami.com/blended-diffusion-page/)]
- [Paper - Pytorch)]
- 💬Style Transfer
- 💬Multi-person Image Generation
- 💬Image Style Transfer - with-style-evaluation-styleclipdraw)] [[Code](https://github.com/pschaldenbrand/StyleCLIPDraw)] [[Demo](https://replicate.com/pschaldenbrand/style-clip-draw)]
- 💬Image Style Transfer
- 💬3D Avatar Generation - team.github.io/latent3D/)]
- 💬Image Inpainting
- 💬Text+Image → Video
- 💬NeRF
- [Paper
- [Paper
- [Paper - ntu.com/project/talkedit/)] [[Code](https://github.com/yumingj/Talk-to-Edit)]
- [Paper
- [Paper
- [Paper
- [Paper - E)] [[Blog](https://openai.com/blog/dall-e/)] [[Model Card](https://github.com/openai/DALL-E/blob/master/model_card.md)] [[Colab](https://colab.research.google.com/drive/1KA2w8bA9Q1HDiZf5Ow_VNOrTaWW4lXXG?usp=sharing)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper - ai/DenseDiffusion)]
- [Paper - chen/layout-guidance)] [[Project](https://silent-chen.github.io/layout-guidance/)]
- 💬Skeleton/Sketch
- 💬Skeleton - research.github.io/HumanSD/)] [[Code](https://github.com/IDEA-Research/HumanSD)] [[Video](https://drive.google.com/file/d/1Djc2uJS5fmKnKeBnL34FnAAm3YSH20Bb/view)]
- 💬Sound+Speech→Robotic Painting
- 💬Sound
- 💬Instance Information+Text→Image - xwang/InstanceDiffusion)]
- 💬Text→Layout→Image
- 💬Mask+Text→Image - jR7h0OUrtLBeN7O4fEq8XkaWWJBhiLWWMELo2NUMjJYS0FDS0RISUVBUllMV0FRSzNCOTFTQy4u)]
- [Paper
- [Paper
- Survey
- [Paper
- [Paper
- AI for Content Creation Workshop - **High-Resolution Complex Scene Synthesis with Transformers**, Manuel Jahn et al. [[Paper](https://arxiv.org/pdf/2105.06458.pdf)]
- [Paper
- [Paper
- [Paper
- Extended Version👆
- [Paper - to-image-translation-without-text)] [[Project](https://smallflyingpig.github.io/speech-to-image/main)]
- [Paper
- [Paper
- [Paper - object-centric-vs-scene-centric-CMR)]
- [Paper
- [Paper
- [Paper
- RWS 2022 - **See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval**, Xiujun Shu et al. [[Paper](https://arxiv.org/abs/2208.08608)] [[Code](https://github.com/TencentYoutuResearch/PersonRetrieval-IVT)]
- 💬Text+Sketch→Visual Retrieval
- [Paper
- 💬Dataset - ml-dataset)]
- [Paper
- [Paper
- [Paper
- [Paper
- 💬Text → 3D
- 💬Text → 3D - ai/LATTE3D/)]
- 💬Text → Motion
- 💬Text → 4D
- 💬Text → 3D
- 💬Text → 3D - ai-3d.github.io/One2345plus_page/)]
- 💬Text → 3D - 2-3-45.github.io/)] [[Code](https://github.com/One-2-3-45/One-2-3-45)]
- 💬Text+Sketch → 3D
- 💬Text → 3D
- 💬Text → 3D
- 💬Text → Motion
- 💬Text → 3D - text-to-3D)]
- 💬Text → 3D
- 💬Text → 3D
- 💬Text → 3D
- 💬Text+Mesh → Mesh - xiaoma666.github.io/Projects/X-Mesh/)] [[Code](https://github.com/xmu-xiaoma666/X-Mesh)]
- 💬Text → Motion - zys.github.io/T2M-GPT/)] [[Code](https://github.com/Mael-zys/T2M-GPT)] [[Hugging Face](https://huggingface.co/vumichien/T2M-GPT)]
- 💬Text → 3D - human.github.io/)]
- 💬Text → 3D - ai/ATT3D/)]
- 💬Text → 3D
- 💬3D Generative Model - kim/DATID-3D)] [[Project](https://datid-3d.github.io/)]
- 💬Point Clouds - e)]
- 💬Text → 3D
- 💬Text → Shape - SDF)]
- 💬Mesh - 3d.github.io/tango/)] [[Code](https://github.com/Gorilla-Lab-SCUT/tango)]
- 💬Human Motion Generation - page/)] [[Code](https://github.com/GuyTevet/motion-diffusion-model)]
- 💬Human Motion Generation - zhang.github.io/projects/MotionDiffuse.html#)]
- 💬3D Shape
- 💬Virtual Humans
- 💬3D Shape - Implicit-Text-Guided-Shape-Generation)]
- 💬Object - research/google-research/tree/master/dreamfields)]
- 💬Mesh
- 💬Motion - to-motion/)] [[Code](https://github.com/EricGuo5513/text-to-motion)]
- 💬Shape - Forge)]
- 💬Motion
- [Homepage - generation-models-as-world-simulators)] [[Sora with Audio](https://x.com/elevenlabsio/status/1759240084342059260?s=20)]
- 💬Music Visualization
- [Paper
- [Paper - project/)] [[Code](https://github.com/Vchitect/LaVie)]
- [Paper - video.metademolab.com/)]
- [Paper - zero.github.io/)] [[Video](https://www.dropbox.com/s/uv90mi2z598olsq/Text2Video-Zero.MP4?dl=0)] [[Code](https://github.com/Picsart-AI-Research/Text2Video-Zero)] [[Hugging Face](https://huggingface.co/spaces/PAIR/Text2Video-Zero)]
- [Paper
- [Paper
- [Paper
- [Paper - A-Protagonist/Make-A-Protagonist)] [[Project](https://make-a-protagonist.github.io/)]
- [Paper - ai/VideoLDM/)]
- 💬Music Visualization
- [Paper - a-video3d.github.io/)]
- [Paper - A-Video)]
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper - a-video/)] [[Code](https://github.com/lucidrains/make-a-video-pytorch)]
- 💬Story Continuation
- 💬Story → Video - Level-Story-Visualization)]
- [Paper
- [Paper - research/MMVID)] [[Project](https://snap-research.github.io/MMVID/)]
- [Paper - diffusion.github.io/)]
- ❌Generation Task
- [Paper
- [Paper
- [Paper
- [Paper
- [Paper
- 💬Story → Video
- [Paper
- [Paper
- [Paper - research.github.io/seanet/musiclm/examples/)] [[MusicCaps](https://www.kaggle.com/datasets/googleai/musiccaps)]
- [Star History Chart](https://star-history.com/#Yutong-Zhou-cv/Awesome-Text-to-Image&Date)
- **Yutong**