Awesome-Text-to-3D
A growing curation of Text-to-3D and Diffusion-to-3D works.
https://github.com/yyeboah/Awesome-Text-to-3D
Papers :scroll:
- Zero-Shot Text-Guided Object Generation with Dream Fields - L6) | [site](https://ajayj.com/dreamfields) | [code](https://github.com/google-research/google-research/tree/master/dreamfields)
- CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation - L13) | [site]() | [code](https://github.com/AutodeskAILab/Clip-Forge)
- CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields - L20) | [site](https://cassiepython.github.io/clipnerf/) | [code](https://github.com/cassiePython/CLIPNeRF)
- CG-NeRF: Conditional Generative Neural Radiance Fields - L27) | [site]() | [code]()
- Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models - L69) | [site](https://bluestyle97.github.io/dream3d/) | [code]()
- PureCLIPNeRF: Understanding Pure CLIP Guidance for Voxel Grid NeRF Models - Han-Hung Lee et al., Arxiv 2022 | [citation](./references/citations.bib#L29-L34) | [site](https://hanhung.github.io/PureCLIPNeRF/) | [code](https://github.com/hanhung/PureCLIPNeRF)
- TANGO: Text-driven Photorealistic and Robust 3D Stylization via Lighting Decomposition - L41) | [site](https://cyw-3d.github.io/tango/) | [code](https://github.com/Gorilla-Lab-SCUT/tango)
- Point-E: A System for Generating 3D Point Clouds from Complex Prompts - L97) | [site]() | [code](https://github.com/openai/point-e)
- SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation - Yen-Chi Cheng et al., CVPR 2023 | [citation](./references/citations.bib#L43-L48) | [site](https://yccyenchicheng.github.io/SDFusion/) | [code](https://github.com/yccyenchicheng/SDFusion)
- 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with Text-guided Diffusion Models - L55) | [site](https://3ddesigner-diffusion.github.io/) | [code]()
- DreamFusion: Text-to-3D using 2D Diffusion - L62) | [site](https://dreamfusion3d.github.io/) | [code]()
- Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation - L146) | [site](https://pals.ttic.edu/p/score-jacobian-chaining) | [code](https://github.com/pals-ttic/sjc/)
- Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion - L202) | [site](https://3d-avatar-diffusion.microsoft.com/) | [code]()
- DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars - L671) | [site]() | [code]()
- Text-To-4D Dynamic Scene Generation - L230) | [site](https://make-a-video3d.github.io/) | [code]()
- AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control - L279) | [site](https://avatar-craft.github.io/) | [code](https://github.com/songrise/avatarcraft)
- TextDeformer: Geometry Manipulation using Text Guidance - L286) | [site]() | [code]()
- ATT3D: Amortized Text-to-3D Object Synthesis - L293) | [site]() | [code]()
- Diffusion-SDF: Conditional Generative Modeling of Signed Distance Functions - L307) | [site]() | [code]()
- LERF: Language Embedded Radiance Fields - L321) | [site]() | [code]()
- Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions - L328) | [site](https://instruct-nerf2nerf.github.io/) | [code](https://github.com/ayaanzhaque/instruct-nerf2nerf)
- MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion - L342) | [site]() | [code]()
- One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization - L349) | [site]() | [code]()
- TextMesh: Generation of Realistic 3D Meshes From Text Prompts - L356) | [site]() | [code]()
- Local 3D Editing via 3D Distillation of CLIP Knowledge - L377) | [site]() | [code]()
- PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion - L566) | [site]() | [code]()
- Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models - L573) | [site](https://snuvclab.github.io/chupa/) | [code](https://github.com/snuvclab/chupa)
- DreamSparse: Escaping from Plato's Cave with 2D Frozen Diffusion Model Given Sparse Views - L447) | [site]() | [code]()
- Fantasia3D: Disentangling Geometry and Appearance for High-quality Text-to-3D Content Creation - L454) | [site](https://fantasia3d.github.io/) | [code](https://github.com/Gorilla-Lab-SCUT/Fantasia3D)
- DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance - L461) | [site](https://sites.google.com/view/dreamface) | [code](https://huggingface.co/spaces/DEEMOSTECH/3D-Avatar-Generator)
- Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes - Cohen-Bar et al., Arxiv 2023 | [citation](./references/citations.bib#L463-L468) | [site]() | [code]()
- InstructP2P: Learning to Edit 3D Point Clouds with Text Instructions - L489) | [site]() | [code]()
- DreamHuman: Animatable 3D Avatars from Text - L559) | [site](https://dream-human.github.io/) | [code]()
- FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields - L496) | [site]() | [code]()
- 3D-LLM: Injecting the 3D World into Large Language Models - L503) | [site]() | [code]()
- AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose - L552) | [site](https://avatarverse3d.github.io/) | [code](https://github.com/bytedance/AvatarVerse)
- TeCH: Text-guided Reconstruction of Lifelike Clothed Humans - L580) | [site](https://huangyangyi.github.io/TeCH/) | [code](https://github.com/huangyangyi/TeCH)
- MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR - L601) | [site](https://sheldontsui.github.io/projects/Matlaber) | [code](https://github.com/SheldonTsui/Matlaber)
- SATR: Zero-Shot Semantic Segmentation of 3D Shapes - L615) | [site](https://samir55.github.io/SATR/) | [code](https://github.com/Samir55/SATR)
- Texture Generation on 3D Meshes with Point-UV Diffusion - L643) | [site](https://cvmi-lab.github.io/Point-UV-Diffusion/) | [code](https://github.com/CVMI-Lab/Point-UV-Diffusion)
- Text-Guided Generation and Editing of Compositional 3D Avatars - L664) | [site](https://yfeng95.github.io/teca/) | [code]()
- Large-Vocabulary 3D Diffusion Model with Transformer - L678) | [site]() | [code]()
- Progressive Text-to-3D Generation for Automatic 3D Prototyping - L685) | [site]() | [code]()
- SweetDreamer: Aligning Geometric Priors in 2D Diffusion for Consistent Text-to-3D - L706) | [site]() | [code]()
- HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation - L713) | [site](https://humannorm.github.io/) | [code](https://github.com/xhuangcv/humannorm)
- Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors - L720) | [site]() | [code]()
- GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors - L727) | [site]() | [code]()
- Text-to-3D using Gaussian Splatting - L734) | [site]() | [code]()
- CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models - L797) | [site]() | [code]()
- Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts - L741) | [site](https://cxh0519.github.io/projects/Progressive3D/) | [code](https://github.com/cxh0519/Progressive3D)
- 3D-GPT: Procedural 3D Modeling with Large Language Models - L748) | [site](https://chuny1.github.io/3DGPT/3dgpt.html) | [code]()
- Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model - L755) | [site]() | [code]()
- Consistent4D: Consistent 360 Degree Dynamic Object Generation from Monocular Video - L809) | [site](https://consistent4d.github.io/) | [code](https://github.com/yanqinJiang/Consistent4D)
- Decorate3D: Text-Driven High-Quality Texture Generation for Mesh Decoration in the Wild - L840) | [site](https://decorate3d.github.io/Decorate3D/) | [code](https://github.com/Decorate3D/Decorate3D)
- 4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling - L847) | [site](https://sherwinbahmani.github.io/4dfy/) | [code](https://github.com/sherwinbahmani/4dfy)
- HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting - L861) | [site](https://alvinliu0.github.io/projects/HumanGaussian) | [code](https://github.com/alvinliu0/HumanGaussian)
- X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation - L889) | [site]() | [code]()
- Text-Guided 3D Face Synthesis: From Generation to Editing - L896) | [site](https://faceg2e.github.io/) | [code]()
- StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D - L903) | [site]() | [code]()
- CAD: Photorealistic 3D Generation via Adversarial Distillation - L910) | [site]() | [code]()
- RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D - L917) | [site]() | [code]()
- MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers - L868) | [site]() | [code]()
- DreamPropeller: Supercharge Text-to-3D Generation with Parallel Sampling - L875) | [site]() | [code]()
- HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image - L882) | [site]() | [code]()
- Inpaint3D: 3D Scene Content Generation using 2D Inpainting Diffusion - L924) | [site]() | [code]()
- Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors - L931) | [site]() | [code]()
- SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance - L938) | [site](https://yoxu515.github.io/SEEAvatar/) | [code](https://github.com/yoxu515/SEEAvatar)
- Text2Immersion: Generative Immersive Scene with 3D Gaussians - L945) | [site]() | [code]()
- GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning - L973) | [site](https://nvlabs.github.io/GAvatar/) | [code]()
- Stable Score Distillation for High-Quality 3D Generation - L980) | [site]() | [code]()
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks - L987) | [site]() | [code]()
- Make-A-Character: High Quality Text-to-3D Character Generation within Minutes - L1008) | [site](https://human3daigc.github.io/MACH/) | [code](https://github.com/Human3DAIGC/Make-A-Character)
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data - L1015) | [site](https://menyifang.github.io/projects/En3D/index.html) | [code](https://github.com/menyifang/En3D)
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity - L1022) | [site]() | [code]()
- InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes - L1029) | [site](https://mohamad-shahbazi.github.io/inserf/) | [code]()
- AGG: Amortized Generative 3D Gaussians for Single Image to 3D - L1036) | [site]() | [code]()
- TEXTure: Text-Guided Texturing of 3D Shapes - L160) | [site](https://texturepaper.github.io/TEXTurePaper/) | [code](https://github.com/TEXTurePaper/TEXTurePaper)
- DATID-3D: Diversity-Preserved Domain Adaptation Using Text-to-Image Diffusion for 3D Generative Model - L181) | [site](https://gwang-kim.github.io/datid_3d/) | [code](https://github.com/gwang-kim/DATID-3D)
- Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation - L300) | [site]() | [code]()
- HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance - L314) | [site]() | [code]()
- 3DFuse: Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation - L335) | [site]() | [code]()
- Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models - L391) | [site]() | [code]()
- 3D VADER - AutoDecoding Latent 3D Diffusion Models - L433) | [site]() | [code]()
- DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation - L692) | [site]() | [code]()
- HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions - L1540) | [site](https://zhouhyocean.github.io/holodreamer/) | [code](https://github.com/zhouhyOcean/HoloDreamer)
- PlacidDreamer: Advancing Harmony in Text-to-3D Generation - L1548) | [site]() | [code]()
- NeRF-Art: Text-Driven Neural Radiance Fields Stylization - L76) | [site](https://cassiepython.github.io/nerfart/) | [code](https://github.com/cassiePython/NeRF-Art)
- Text2Tex: Text-driven Texture Synthesis via Diffusion Models - L426) | [site](https://daveredrum.github.io/Text2Tex/) | [code](https://github.com/daveredrum/Text2Tex)
- DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models - L1533) | [site]() | [code]()
- HumanLiff: Layer-wise 3D Human Generation with Diffusion Model - L587) | [site](https://skhu101.github.io/HumanLiff/) | [code](https://github.com/skhu101/HumanLiff)
- HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting - L1099) | [site](https://zhenglinzhou.github.io/HeadStudio-ProjectPage/) | [code](https://github.com/ZhenglinZhou/HeadStudio/)
- Disentangled 3D Scene Generation with Layout Learning - L1148) | [site](https://dave.ml/layoutlearning/) | [code]()
- SceneCraft: Layout-Guided 3D Scene Generation - L1702) | [site](https://orangesodahub.github.io/SceneCraft/) | [code](https://github.com/OrangeSodahub/SceneCraft/)
- MeshUp: Multi-Target Mesh Deformation via Blended Score Distillation - L1695) | [site](https://threedle.github.io/MeshUp/) | [code](https://github.com/threedle/MeshUp)
- EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion - L1555) | [site](https://huanngzh.github.io/EpiDiff/) | [code](https://github.com/huanngzh/EpiDiff)
- Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion - L1562) | [site](https://costwen.github.io/Ouroboros3D/) | [code](https://github.com/Costwen/Ouroboros3D)
- DreamReward: Text-to-3D Generation with Human Preference - L1569) | [site](https://jamesyjl.github.io/DreamReward/) | [code](https://github.com/liuff19/DreamReward)
- Cycle3D: High-quality and Consistent Image-to-3D Generation via Generation-Reconstruction Cycle - L1576) | [site](https://pku-yuangroup.github.io/Cycle3D/) | [code](https://github.com/PKU-YuanGroup/Cycle3D)
- SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement - L1583) | [site](https://stable-fast-3d.github.io/) | [code](https://github.com/Stability-AI/stable-fast-3d)
- RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion - L1309) | [site](https://realmdreamer.github.io/) | [code]()
- TC4D: Trajectory-Conditioned Text-to-4D Generation - L1316) | [site](https://sherwinbahmani.github.io/tc4d/) | [code]()
- InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models - L1330) | [site]() | [code](https://github.com/TencentARC/InstantMesh)
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion - L1667) | [site](https://rag-3d.github.io/) | [code](https://github.com/3DTopia/Phidias-Diffusion)
- 3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion - L1674) | [site](https://3dtopia.github.io/3DTopia-XL/) | [code](https://github.com/3DTopia/3DTopia-XL)
- NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views - L90) | [site](https://vita-group.github.io/NeuralLift-360/) | [code](https://github.com/VITA-Group/NeuralLift-360)
- Topology-Aware Latent Diffusion for 3D Shape Generation - L1050) | [site]() | [code]()
- ReplaceAnything3D: Text-Guided 3D Scene Editing with Compositional Neural Radiance Fields - L1057) | [site](https://replaceanything3d.github.io/) | [code]()
- DreamInit: A General Framework to Boost 3D GS Initialization for Text-to-3D Generation by Lexical Richness - L1590) | [site](https://vlislab22.github.io/DreamInit/) | [code]()
- TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling - L1604) | [site](https://dong-huo.github.io/TexGen/) | [code]()
- DreamCouple: Exploring High Quality Text-to-3D Generation Via Rectified Flow - L1611) | [site]() | [code]()
- ZeroAvatar: Zero-shot 3D Avatar Generation from a Single Image - L272)
- Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation - L1246) | [site]() | [code]()
- CAT3D: Create Anything in 3D with Multi-View Diffusion Models - L1372) | [site](https://cat3d.github.io) | [code]()
- AToM: Amortized Text-to-Mesh using 2D Diffusion - L1085)
- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation - L1092) | [site]() | [code]()
- IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation - Melas-Kyriazi et al., Arxiv 2024 | [citation](./references/citations.bib#L1108-L1113) | [site]() | [code]()
- L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects - L1120) | [site]() | [code]()
- RePaint-NeRF: NeRF Editing via Semantic Masks and Diffusion Models - L419) | [site](https://starstesla.github.io/repaintnerf/) | [code](https://github.com/StarsTesla/RePaint-NeRF)
- GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting - L1106) | [site](https://gala3d.github.io/) | [code](https://github.com/VDIGPKU/GALA3D)
- Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models - L1646) | [site]() | [code]()
- MVGaussian: High-Fidelity Text-to-3D Content Generation with Multi-View Guidance and Surface Densification - L1653) | [site](https://mvgaussian.github.io/) | [code](https://github.com/mvgaussian/mvgaussian)
- Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation - L1660) | [site](https://unity-research.github.io/Geometry-Image-Diffusion.github.io/) | [code]()
- DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors - L1632) | [site](https://dreamhoi.github.io/) | [code](https://github.com/hanwenzhu/dreamhoi)
- FlashTex: Fast Relightable Mesh Texturing with LightControlNet - L1681) | [site](https://flashtex.github.io) | [code](https://github.com/Roblox/FlashTex)
- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model - L1618) | [site](https://meshformer3d.github.io/) | [code]()
- Barbie: Text to Barbie-Style 3D Avatars - L1625) | [site]() | [code]()
- TripoSR: Fast 3D Object Reconstruction from a Single Image - L1155) | [site]() | [code]()
- MagicClay: Sculpting Meshes With Generative Neural Fields - L1162) | [site]() | [code]()
- X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation - L1386) | [site](https://xmu-xiaoma666.github.io/Projects/X-Oscar/) | [code](https://github.com/LinZhekai/X-Oscar)
- Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning - L1379) | [site]() | [code]()
- CLIP-GS: CLIP-Informed Gaussian Splatting for Real-time and View-consistent 3D Semantic Understanding - L1365) | [site]() | [code]()
- TELA: Text to Layer-wise 3D Clothed Human Generation - L1344) | [site](http://jtdong.com/tela_layer/) | [code]()
- Interactive3D: Create What You Want by Interactive 3D Generation - L1351) | [site](https://interactive-3d.github.io) | [code](https://github.com/interactive-3d/interactive3d)
- TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts - L1358) | [site](https://zjy526223908.github.io/TIP-Editor) | [code](https://github.com/zjy526223908/TIP-Editor)
- LDM: Large Tensorial SDF Model for Textured Mesh Generation - L1421) | [site]() | [code]()
- Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching - L1428) | [site]() | [code]()
- Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention - L1435) | [site](https://penghtyx.github.io/Era3D/) | [code](https://github.com/pengHTYX/Era3D)
- Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures - L104) | [site]() | [code](https://github.com/eladrich/latent-nerf)
- Magic3D: High-Resolution Text-to-3D Content Creation - Chen-Hsuan Lin et al., CVPR 2023 | [citation](./references/citations.bib#L106-L111) | [site](https://research.nvidia.com/labs/dir/magic3d/) | [code]()
- RealFusion: 360° Reconstruction of Any Object from a Single Image - Melas-Kyriazi et al., CVPR 2023 | [citation](./references/citations.bib#L113-L118) | [site](https://lukemelas.github.io/realfusion/) | [code](https://github.com/lukemelas/realfusion)
- Monocular Depth Estimation using Diffusion Models - L125) | [site](https://depth-gen.github.io/) | [code]()
- SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction - L132) | [site](https://sparsefusion.github.io/) | [code](https://github.com/zhizdev/sparsefusion)
- NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion - L139) | [site](https://jiataogu.me/nerfdiff/) | [code]()
- NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors - L167) | [site]() | [code]()
- DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models - L174) | [site]() | [code](https://github.com/nianticlabs/diffusionerf)
- 3DQD: Generalized Deep 3D Shape Prior via Part-Discretized Diffusion Process - L545) | [site]() | [code](https://github.com/colorful-liyu/3DQD)
- ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation - L195) | [site]() | [code]()
- 3D-aware Image Generation using 2D Diffusion Models - L209) | [site]() | [code]()
- Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior - L216) | [site]() | [code]()
- GECCO: Geometrically-Conditioned Point Diffusion Models - L699) | [site]() | [code]()
- Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond - L223) | [site]() | [code]()
- Generative Novel View Synthesis with 3D-Aware Diffusion Models - L237) | [site]() | [code]()
- Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields - L244) | [site]() | [code]()
- Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors - L251) | [site]() | [code]()
- Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models - L363) | [site]() | [code]()
- SceneScape: Text-Driven Consistent Scene Generation - L370) | [site]() | [code]()
- CLIP-Mesh: Generating textured meshes from text using pretrained image-text models - L384) | [site]() | [code]()
- Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction - L398) | [site]() | [code]()
- Shap-E: Generating Conditional 3D Implicit Functions - L405) | [site]() | [code]()
- Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation - L412) | [site]() | [code]()
- Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation - L510) | [site]() | [code]()
- RGB-D-Fusion: Image Conditioned Depth Diffusion of Humanoid Subjects - L517) | [site]() | [code]()
- IT3D: Improved Text-to-3D Generation with Explicit View Synthesis - L608) | [site]() | [code]()
- MVDream: Multi-view Diffusion for 3D Generation - L629) | [site]() | [code]()
- PointLLM: Empowering Large Language Models to Understand Point Clouds - L636) | [site]() | [code]()
- SyncDreamer: Generating Multiview-consistent Images from a Single-view Image - L650) | [site]() | [code]()
- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior - L762) | [site]() | [code]()
- HyperFields: Towards Zero-Shot Generation of NeRFs from Text - L769) | [site]() | [code]()
- Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping - L776) | [site]() | [code]()
- Text-to-3D with Classifier Score Distillation - L783) | [site]() | [code]()
- Noise-Free Score Distillation - L790) | [site]() | [code]()
- LRM: Large Reconstruction Model for Single Image to 3D - L811) | [site]() | [code]()
- One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion - L818) | [site]() | [code]()
- LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching - L825) | [site]() | [code]()
- MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture - L832) | [site]() | [code]()
- Adversarial Diffusion Distillation - L854) | [site]() | [code]()
- HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D - L994) | [site]() | [code]()
- MVD2: Efficient Multiview 3D Reconstruction for Multiview Diffusion - Xin-Yang Zheng et al., Arxiv 2024 | [citation](./references/citations.bib#L1122-L1127) | [site]() | [code]()
- Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability - L1134) | [site]() | [code]()
- SceneWiz3D: Towards Text-guided 3D Scene Composition - L1141) | [site]() | [code]()
- V3D: Video Diffusion Models are Effective 3D Generators - L1169) | [site]() | [code]()
- CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model - L1176) | [site]() | [code]()
- Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding - L1190) | [site]() | [code]()
- SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion - L1197) | [site]() | [code]()
- Generic 3D Diffusion Adapter Using Controlled Multi-View Editing - L1204) | [site]() | [code]()
- GVGEN: Text-to-3D Generation with Volumetric Representation - L1211) | [site]() | [code]()
- BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis - L1218) | [site]() | [code]()
- LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis - L1239) | [site]() | [code]()
- GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation - L1253) | [site]() | [code]()
- VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation - L1260) | [site]() | [code]()
- DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion - L1267) | [site]() | [code]()
- PointInfinity: Resolution-Invariant Point Diffusion Models - L1274) | [site](https://zixuanh.com/projects/pointinfinity.html) | [code]()
- The More You See in 2D, the More You Perceive in 3D - L1295) | [site](https://sap3d.github.io/) | [code]()
- Hash3D: Training-free Acceleration for 3D Generation - L1302) | [site](https://adamdad.github.io/hash3D/) | [code](https://github.com/Adamdad/hash3D)
- Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor - L440) | [site](https://control4darxiv.github.io/) | [code]()
- Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation - L1078) | [site]() | [code]()
- BoostDream: Efficient Refining for High-Quality Text-to-3D Generation from Multi-View Diffusion - L1064) | [site]() | [code]()
- 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D - L1071) | [site]() | [code]()
- TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation - L1225) | [site](https://ggxxii.github.io/texdreamer/) | [code]()
- InTeX: Interactive Text-to-texture Synthesis via Unified Depth-aware Inpainting - L1232) | [site](https://me.kiui.moe/intex/) | [code](https://github.com/ashawkey/InTeX)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer - L1288) | [site](https://sc4d.github.io/) | [code]()
- HeadSculpt: Crafting 3D Head Avatars with Text - L475) | [site](https://brandonhan.uk/HeadSculpt/) | [code]()
- One-shot Implicit Animatable Avatars with Model-based Priors - L622) | [site](https://huangyangyi.github.io/ELICIT/) | [code](https://github.com/huangyangyi/ELICIT)
- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model - L657) | [site]() | [code]()
- InstructHumans: Editing Animatable 3D Human Textures with Instructions - L1281) | [site](https://jyzhu.top/instruct-humans/) | [code](https://github.com/viridityzhu/InstructHumans)
- L4GM: Large 4D Gaussian Reconstruction Model - L1477) | [site](https://research.nvidia.com/labs/toronto-ai/l4gm/) | [code]()
- HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles - L1470) | [site](https://haar.is.tue.mpg.de/) | [code](https://github.com/Vanessik/HAAR)
- GaussianDreamerPro: Text to Manipulable 3D Gaussians with Highly Enhanced Quality - L1484) | [site](https://taoranyi.com/gaussiandreamerpro/) | [code](https://github.com/hustvl/GaussianDreamerPro)
- Novel View Synthesis with Diffusion Models - L188) | [site]() | [code]()
- LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness - L1688) | [site](https://zcmax.github.io/projects/LLaVA-3D/) | [code](https://github.com/ZCMax/LLaVA-3D)
- Zero-shot Point Cloud Completion Via 2D Priors - L1337) | [site]() | [code]()
- Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior - L1407) | [site]() | [code]()
- CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner - L1414) | [site](https://craftsman3d.github.io/) | [code](https://github.com/wyysf-98/CraftsMan)
- GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling - L1456) | [site](https://gaussiancube.github.io/) | [code](https://github.com/GaussianCube/)
- Tetrahedron Splatting for 3D Generation - L1463) | [site]() | [code](https://github.com/fudan-zvg/tet-splatting)
- Part123: Part-aware 3D Reconstruction from a Single-view Image - L1400) | [site]() | [code]()
- DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models - L1442) | [site](https://zzzyuqing.github.io/dreammat.github.io/) | [code](https://github.com/zzzyuqing/DreamMat)
- MagicPose4D: Crafting Articulated Models with Appearance and Motion Control - L1449) | [site](https://boese0601.github.io/magicpose4d/) | [code](https://github.com/haoz19/MagicPose4D)
- DreamCraft3D++: Efficient Hierarchical 3D Generation with Multi-Plane Reconstruction Model - L1709) | [site](https://dreamcraft3dplus.github.io) | [code](https://github.com/MrTornado24/DreamCraft3D_Plus)
- Tex4D: Zero-shot 4D Scene Texturing with Video Diffusion Models - L1716) | [site](https://tex4d.github.io) | [code](https://github.com/ZqlwMatt/Tex4D)
- MvDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors - L1723) | [site](https://chenhonghua.github.io/MyProjects/MvDrag3D) | [code](https://github.com/chenhonghua/MvDrag3D)
- Gamba: Marry Gaussian Splatting with Mamba for Single-View 3D Reconstruction - L1491) | [site](https://florinshen.github.io/gamba-project/) | [code](https://github.com/SkyworkAI/Gamba)
- HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model - L1493) | [site](https://neu-vi.github.io/houseCrafter/) | [code]()
- Meta 3D Gen - L1505) | [site](https://ai.meta.com/research/publications/meta-3d-gen/) | [code]()
- ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation - L1512) | [site](https://sites.google.com/view/scaledreamer-release/) | [code](https://github.com/theEricMa/ScaleDreamer)
- YouDream: Generating Anatomically Controllable Consistent Text-to-3D Animals - L1519) | [site](https://youdream3d.github.io/) | [code](https://github.com/YouDream3D/YouDream/)
- RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models - L1526) | [site]() | [code]()
- Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting - L952)
- Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior - L959)
- Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering - L966)
Datasets :floppy_disk:
- Objaverse: A Universe of Annotated 3D Objects - L524)
- Objaverse-XL: A Universe of 10M+ 3D Objects - L531)
- Scalable 3D Captioning with Pretrained Models - L1393)
- SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding - L1043)
- Describe3D: High-Fidelity 3D Face Generation from Natural Language Descriptions - L538)
Tutorial Videos :tv:
Frameworks :desktop_computer:
- Nerfstudio: A Modular Framework for Neural Radiance Field Development
- threestudio: A unified framework for 3D content generation - Yuan-Chen Guo et al., Github 2023
- Mirage3D: Open-Source Implementations of 3D Diffusion Models Optimized for GLB Output