{"id":13958423,"url":"https://github.com/frozenburning/text2light","last_synced_at":"2025-07-20T23:31:36.118Z","repository":{"id":59813547,"uuid":"525714614","full_name":"FrozenBurning/Text2Light","owner":"FrozenBurning","description":"[SIGGRAPH Asia 2022] Text2Light: Zero-Shot Text-Driven HDR Panorama Generation","archived":false,"fork":false,"pushed_at":"2023-05-05T03:22:22.000Z","size":17103,"stargazers_count":559,"open_issues_count":0,"forks_count":45,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-08-09T13:18:57.329Z","etag":null,"topics":["3d-generation","hdr","hdr-image","hdri","inverse-tonemapping","panorama","rendering","siggraph-asia","text2image"],"latest_commit_sha":null,"homepage":"https://frozenburning.github.io/projects/text2light/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FrozenBurning.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-08-17T09:04:07.000Z","updated_at":"2024-07-28T10:59:21.000Z","dependencies_parsed_at":"2022-09-25T17:43:14.636Z","dependency_job_id":null,"html_url":"https://github.com/FrozenBurning/Text2Light","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FrozenBurning%2FText2Light","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FrozenBurning%2FText2Light/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FrozenBurning%2FText2Light/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FrozenBurning%2FText2Light/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FrozenBurning","download_url":"https://codeload.github.com/FrozenBurning/Text2Light/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":226845028,"owners_count":17691144,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-generation","hdr","hdr-image","hdri","inverse-tonemapping","panorama","rendering","siggraph-asia","text2image"],"created_at":"2024-08-08T13:01:33.614Z","updated_at":"2024-11-28T01:32:04.574Z","avatar_url":"https://github.com/FrozenBurning.png","language":"Jupyter Notebook","funding_links":[],"categories":["图像风格"],"sub_categories":["网络服务_其他"],"readme":"\u003cdiv align=\"center\"\u003e\n\n\u003ch1\u003eText2Light: Zero-Shot Text-Driven HDR Panorama Generation\u003c/h1\u003e\n\n\u003cdiv\u003e\n    \u003ca href='https://frozenburning.github.io/' target='_blank'\u003eZhaoxi Chen\u003c/a\u003e\u0026emsp;\n    \u003ca href='https://wanggcong.github.io/' target='_blank'\u003eGuangcong Wang\u003c/a\u003e\u0026emsp;\n    \u003ca href='https://liuziwei7.github.io/' target='_blank'\u003eZiwei Liu\u003c/a\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n    S-Lab, Nanyang Technological University\n\u003c/div\u003e\n\n\u003cstrong\u003e\u003ca href='https://sa2022.siggraph.org/' target='_blank'\u003eTOG 2022 (Proc. SIGGRAPH Asia)\u003c/a\u003e\u003c/strong\u003e\n\n\u003ch3\u003eTL;DR\u003c/h3\u003e\n\u003ch4\u003eText2Light can generate HDR panoramas in 4K+ resolution using free-form texts solely. 
\u003cbr\u003e Our high-quality results can be directly applied to downstream tasks, e.g., lighting 3D scenes and immersive VR.\u003c/h4\u003e\n\n### [Project Page](https://frozenburning.github.io/projects/text2light) | [Video](https://youtu.be/XDx6tOHigPE) | [Paper](https://arxiv.org/abs/2209.09898) | [Colab](https://colab.research.google.com/github/FrozenBurning/Text2Light/blob/master/text2light.ipynb)\n\n\u003ctr\u003e\n    \u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/teaser.gif\" width=\"100%\"/\u003e\n\u003c/tr\u003e\n\n\u003c/div\u003e\n\n## Updates\n[05/2023] Released text descriptions used during inference. [![Google Drive](https://img.shields.io/badge/Google%20Drive-4285F4?style=for-the-badge\u0026logo=googledrive\u0026logoColor=yellow)](https://drive.google.com/drive/folders/1TtzS28GCbc-iTIAqtgCh_JTSJDoXUGAJ?usp=share_link)\n\n[09/2022] Our online demo in Colab is released! [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FrozenBurning/Text2Light/blob/master/text2light.ipynb)\n\n[09/2022] Paper uploaded to arXiv. [![arXiv](https://img.shields.io/badge/arXiv-2209.09898-b31b1b.svg)](https://arxiv.org/abs/2209.09898)\n\n[09/2022] Model weights released. 
[![Google Drive](https://img.shields.io/badge/Google%20Drive-4285F4?style=for-the-badge\u0026logo=googledrive\u0026logoColor=yellow)](https://drive.google.com/drive/folders/1HKBjC7oQOzrkGFKMQmSh6PySv6AycDS3?usp=sharing)\n\n[09/2022] Code released.\n\n\n## Citation\nIf you find our work useful for your research, please consider citing this paper:\n```\n@article{chen2022text2light,\n    title={Text2Light: Zero-Shot Text-Driven HDR Panorama Generation},\n    author={Chen, Zhaoxi and Wang, Guangcong and Liu, Ziwei},\n    journal={ACM Transactions on Graphics (TOG)},\n    volume={41},\n    number={6},\n    articleno={195},\n    pages={1--16},\n    year={2022},\n    publisher={ACM New York, NY, USA}\n}\n```\n\n## Installation\nWe highly recommend using [Anaconda](https://www.anaconda.com/) to manage your Python environment. You can set up the required environment with the following commands:\n```bash\nconda env create -f environment.yml\nconda activate text2light\n```\n\n## Text-driven HDRI Generation\n\nFollow the steps below to generate HDR panoramas from free-form texts with our models.\n### Download Pretrained Models\nPlease download our checkpoints from [Google Drive](https://drive.google.com/drive/folders/1HKBjC7oQOzrkGFKMQmSh6PySv6AycDS3?usp=sharing) to run the following inference scripts. We use the model trained on our full dataset by default (`local_sampler`). 
Note that we also release models trained on outdoor (`local_sampler_outdoor`) and indoor (`local_sampler_indoor`) scenes, respectively.\n\n### All-in-one Inference Script\nAll inference code is in [text2light.py](text2light.py). To see its usage, run:\n```bash\npython text2light.py -h\n```\n\nHere are some examples; the output will be saved in `./generated_panorama`:\n- Generate an HDR panorama from a single sentence:\n    ```bash\n    python text2light.py -rg logs/global_sampler_clip -rl logs/local_sampler_outdoor --outdir ./generated_panorama --text \"YOUR SCENE DESCRIPTION\" --clip clip_emb.npy --sritmo ./logs/sritmo.pth --sr_factor 4\n    ```\n\n- Generate HDR panoramas from a list of texts:\n    ```bash\n    # assume your texts are stored in alt.txt\n    python text2light.py -rg logs/global_sampler_clip -rl logs/local_sampler_outdoor --outdir ./generated_panorama --text ./alt.txt --clip clip_emb.npy --sritmo ./logs/sritmo.pth --sr_factor 4\n    ```\n\n- Generate low-resolution (512x1024) LDR panoramas only:\n    ```bash\n    # assume your texts are stored in alt.txt\n    python text2light.py -rg logs/global_sampler_clip -rl logs/local_sampler_outdoor --outdir ./generated_panorama --text ./alt.txt --clip clip_emb.npy\n    ```\nHere are some examples of HDRIs generated by Text2Light. 
The generated results can be directly used to render 3D scenes like [Barcelona Pavillion](https://download.blender.org/demo/test/pabellon_barcelona_v1.scene_.zip) from [Blender Demo Files](https://www.blender.org/download/demo-files/).\n\n\u003ctable\u003e\n\u003ctr\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/github_demo_house.jpg\" width=\"100%\"/\u003e\u003c/td\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/github_demo_ball.jpg\" width=\"100%\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n## Rendering\n\nOur generated HDR panoramas can be directly used in any modern graphics pipeline as the environment texture and light source. Here we take [Blender](https://www.blender.org/) as an example.\n\n### From GUI\nOpen Blender -\u003e Select `Shading` Panel -\u003e Select `Shader Type` as `World` -\u003e Add an `Environment Texture` node -\u003e Browse and select our generated panoramas -\u003e Render\n\nYou can also refer to this [tutorial](https://www.youtube.com/watch?v=gC4Uqr4E78U).\n\nHere is an example of rendering a landscape in San Francisco using the HDRI with input texts as `landscape photography of mountain ranges under purple and pink skies`.\n\n\u003ctr\u003e\n    \u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/github_blender.jpg\" width=\"100%\"/\u003e\n\u003c/tr\u003e\n\n### From Command line\nFor the ease of batch processing, e.g. rendering with multiple HDRIs, we offer scripts in command line for rendering your 3D assets.\n\n1. Download the Linux version of Blender from [Blender Download Page](https://www.blender.org/download/).\n2. 
Unpack it and check the usage of Blender:\n    ```bash\n    # assume your downloaded version is 3.1.2\n    # -J selects xz decompression, matching the .tar.xz archive\n    tar -xJvf blender-3.1.2-linux-x64.tar.xz\n    cd blender-3.1.2-linux-x64\n    ./blender --help\n    ```\n3. Add an alias to your .bashrc or .zshrc:\n    ```bash\n    # PATH_TO_DOWNLOADED_BLENDER indicates the parent directory where you save the downloaded blender\n    alias blender=\"/PATH_TO_DOWNLOADED_BLENDER/blender-3.1.2-linux-x64/blender\"\n    ```\n4. Go back to the Text2Light codebase and run the following commands for different rendering setups:\n    - Render four shader balls given all HDRIs stored at `PATH_TO_HDRI`\n    ```bash\n    blender --background --python rendering_shader_ball.py -- ./rendered_balls 100 1000 PATH_TO_HDRI\n    ```\n    The results will be saved in `./rendered_balls` and look like:\n\u003ctable\u003e\n\u003ctr\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/rendered_ball/full_[green grass field with trees and mountains in the distance]_balls.png\" width=\"100%\"/\u003e\u003c/td\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/rendered_ball/full_[landscape photography of mountain ranges under purple and pink skies]_balls.png\" width=\"100%\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/rendered_ball/full_[Audience, Auditorium, Conference]_balls.png\" width=\"100%\"/\u003e\u003c/td\u003e\n    \u003ctd align='center' width='50%'\u003e\u003cimg src=\"https://github.com/FrozenBurning/FrozenBurning.github.io/blob/master/projects/text2light/img/rendered_ball/full_[white bed linen with white pillow]_balls.png\" 
width=\"100%\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003c/table\u003e\n    \n\n\n## Training\nOur training is stage-wise, with multiple steps. The details are as follows.\n\n### Data Preparation\nAssuming all your HDRIs for training are stored at `PATH_TO_HDR_DATA`, run [process_hdri.py](./process_hdri.py) to process the data:\n```bash\npython process_hdri.py --src PATH_TO_HDR_DATA\n```\nThe processed data will be saved to `./data` by default and organized as follows: \n```\n├── ...\n└── Text2Light/\n    ├── data/\n        ├── train/\n            ├── calib_hdr\n            ├── ldr\n            └── raw_hdr\n        ├── val/\n            ├── calib_hdr\n            ├── ldr\n            └── raw_hdr\n        └── meta/\n```\n\n### Stage I - Text-driven LDR Panorama Generation\n\nStage 1 training is launched by [train_stage1.py](train_stage1.py). To check its usage, run:\n```bash\npython train_stage1.py -h\n```\n\n1) Train the global codebook\n    ```bash\n    python train_stage1.py --base configs/global_codebook.yaml -t True --gpu 0,1,2,3,4,5,6,7\n    ```\n2) Train the local codebook\n    ```bash\n    python train_stage1.py --base configs/local_codebook.yaml -t True --gpu 0,1,2,3,4,5,6,7\n    ```\n3) Train the text-conditioned global sampler. Please specify the path to the global codebook in the config YAML.\n    ```bash\n    python train_stage1.py --base configs/global_sampler_clip.yaml -t True --gpu 0,1,2,3,4,5,6,7 \n    ```\n4) Train the structure-aware local sampler. 
Please specify the paths to the global and local codebooks in the config YAML.\n    ```bash\n    python train_stage1.py --base configs/local_sampler_spe.yaml -t True --gpu 0,1,2,3,4,5,6,7 \n    ```\n\n### Stage II - Super-resolution Inverse Tonemapping\n\nStage 2 training is launched by [train_stage2.py](train_stage2.py). To check its usage, run:\n```bash\npython train_stage2.py -h\n```\n\nThe default setting can be trained on a single A100 GPU without DDP:\n```bash\n# assuming you use the default --dst_dir in process_hdri.py, the HDR dataset is stored in ./data\npython train_stage2.py --dir ./data --save_dir ./output/bs32_7e-5 --workers 16 --val_ep 5 --gpu 0\n```\n\nTo enable distributed training, for example, over 8 GPUs:\n```bash\npython train_stage2.py --dir ./data --save_dir ./output/bs32_7e-5 --workers 8 --val_ep 5 --ddp\n```\n\n\n## Acknowledgements\nThis work is supported by the National Research Foundation, Singapore under its AI Singapore Programme, NTU NAP, MOE AcRF Tier 2 (T2EP20221-0033), and under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).\n\nText2Light is implemented on top of the [VQGAN](https://github.com/CompVis/taming-transformers) codebase. We also thank [CLIP](https://github.com/openai/CLIP) and [LIIF](https://github.com/yinboc/liif) for their released models and code. Thanks to this [repo](https://github.com/yuki-koyama/blender-cli-rendering) for its amazing command line rendering toolbox.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrozenburning%2Ftext2light","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffrozenburning%2Ftext2light","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffrozenburning%2Ftext2light/lists"}