{"id":19883002,"url":"https://github.com/modelscope/normal-depth-diffusion","last_synced_at":"2025-05-02T14:32:56.572Z","repository":{"id":211818929,"uuid":"728081252","full_name":"modelscope/normal-depth-diffusion","owner":"modelscope","description":null,"archived":false,"fork":false,"pushed_at":"2024-02-07T14:39:51.000Z","size":6984,"stargazers_count":128,"open_issues_count":6,"forks_count":9,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-07T02:41:58.564Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/modelscope.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-06T07:29:34.000Z","updated_at":"2025-03-16T06:06:48.000Z","dependencies_parsed_at":"2024-02-07T15:52:39.736Z","dependency_job_id":null,"html_url":"https://github.com/modelscope/normal-depth-diffusion","commit_stats":null,"previous_names":["modelscope/normal-depth-diffusion"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modelscope%2Fnormal-depth-diffusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modelscope%2Fnormal-depth-diffusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modelscope%2Fnormal-depth-diffusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modelscope%2Fnormal-depth-diffusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/modelscope","download_url":"https://codeload.github.com/modelscope/normal-depth-diffusion/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252053936,"owners_count":21687196,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T17:19:08.281Z","updated_at":"2025-05-02T14:32:55.565Z","avatar_url":"https://github.com/modelscope.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n    \u003cbr\u003e\n    \u003cimg src=\"https://modelscope.oss-cn-beijing.aliyuncs.com/modelscope.gif\" width=\"400\"/\u003e\n    \u003cbr\u003e\n    \u003ch1\u003eNormal-Depth Diffusion Model\u003c/h1\u003e\n\u003cp\u003e\n\nNormal-Depth Diffusion Model: A Generalizable Normal-Depth Diffusion Model.\n\n如果您熟悉中文，可以阅读[中文版本的README](./README_ZH.md)。\n\n### Text-to-ND\n![teaser-nd](assets/text-to-nd-laion.png)\n### Text-to-ND-MV\n\u003cimg src=\"./assets/nd-mv.jpg\" alt=\"image\" width=\"atuo\" height=\"auto\"\u003e\n\n## [Project page](https://aigc3d.github.io/richdreamer/) | [Paper](https://arxiv.org/abs/2311.16918) | [YouTube](https://youtu.be/6gQ1VWiKoc0)\n\n- [x] Inference code.\n- [x] Training code.\n- [x] Pretrained model: ND, ND-MV, Albedo-MV.\n- [ ] Pretrained model: ND-MV-VAE.\n- [x] Rendered Multi-View Image of Objaverse-dataset.\n\n## News\n- 2023-12-25: We release the training dataset mvs_objaverse through Alibaba OSS Service. We also provide a convenient multi-threads script for fast downloading.\n- 2023-12-11: Inference codes and pretrained models are released. We are working to improve ND-Diffusion Model, stay tuned!.\n\n\n## 3D Generation\n\n\n- This repository only includes the diffusion model and 2D image generation code of RichDreamer paper.\n- For 3D Generation, please check [RichDreamer](https://github.com/modelscope/RichDreamer).\n\n\n## Preparation for inference\n1. Install requirements using following scripts. \n```bash\nconda create -n nd\nconda activate nd \npip install -r requirements.txt\npip install git+https://github.com/openai/CLIP.git\npip install git+https://github.com/CompVis/taming-transformers.git\npip install webdataset\npip install img2dataset\n```\nwe also provide a dockerfile to build docker image.\n```bash\nsudo docker build -t mv3dengine_22.04:cu118 -f docker/Dockerfile .\n```\n\n2. Download pretrained weights.\n- [ND](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd-laion_ema.ckpt): Normal-Depth Diffusion trained on Laion-2B\n- [ND-MV](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd_mv_ema.ckpt): MultiView Normal-Depth Diffusion Model\n- [Alebdo-MV](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/albedo_mv_ema.ckpt): MultiView Depth-conditioned Albedo Diffusion Model\n\nwe also provide a script for download.\n```bash\npython tools/download_models/download_nd_models.py\n```\n\n## Inference (Sampling)\nwe provide a script for sampling\n```bash\nsh demo_inference.sh\n```\nOr use the following detailed instructions:\n\n### Text2ND sampling\n```\n# dmp solver\npython ./scripts/t2i.py --ckpt $ckpt_path --prompt $prompt --dpm_solver --n_samples 2 --save_dir $save_dir\n# plms solver\npython ./scripts/t2i.py --ckpt $ckpt_path --prompt $prompt --plms --n_samples 2  --save_dir $save_dir\n# ddim solver\npython ./scripts/t2i.py --ckpt $ckpt_path --prompt $prompt --n_samples 2  --save_dir $save_dir\n```\n\n### Text2ND-MV sampling\n```\n# nd-mv\npython ./scripts/t2i_mv.py --ckpt_path $ckpt_path --prompt $prompt  --num_frames 4  --model_name nd-mv --save_dir $save_dir\n\n# nd-mv with VAE (coming soon)\npython ./scripts/t2i_mv.py --ckpt_path $ckpt_path --prompt $prompt  --num_frames 4  --model_name nd-mv-vae --save_dir $save_dir\n\n```\n\n### Text2Albedo-MV sampling\n```\npython ./scripts/td2i_mv.py --ckpt_path $ckpt_path --prompt $prompt --depth_file $ depth_file --num_frames 4  --model_name albedo-mv --save_dir $save_dir\n\n```\n\n\n## Preparation for training\n\n1. Download Laion-2B-en-5-AES (*Required to train ND model*)\n\nDownload laion-2b dataset from [parquet](https://huggingface.co/datasets/laion/laion2B-en)\nThen, put parquet files into  ```./laion2b-dataset-5-aes```\n```bash\ncd ./tools/download_dataset\nbash ./download_2b-5_aes.sh\ncd -\n```\n\n2. Download Monocular Prior Models' Weight (*Required to train ND model*)\n- NormalBae [scannet.pt](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/scannet.pt)\n- Midas3.1 [dpt_beit_large512.pt](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/dpt_beit_large_512.pt)\n\n```bash\n# move the scannet.pt to normalbae Prior Model\nmv scannet.pt ./libs/ControlNet-v1-1-nightly/annotator/normalbae/scannet.pt\n# move the dpt_beit_large512.pt to ./libs/omnidata_torch/pretrained_models/dpt_beit_large_512.pt\nmv dpt_beit_large512.pt ./libs/omnidata_torch/pretrained_models/dpt_beit_large_512.pt\n```\n\n3. Download rendered Multi-View image of Objaverse-dataset (*Required to train ND-MV and Albedo-MV model*)\n- Download our rendered dataset using the prepared script\n\n```bash\nwget https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/valid_paths_v4_cap_filter_thres_28.json\n# Example: python ./scripts/data/download_objaverse.py ./mvs_objaverse ./valid_paths_v4_cap_filter_thres_28.json 50\npython ./scripts/data/download_objaverse.py /path/to/savedata /path/to/valid_paths_v4_cap_filter_thres_28.json nthreads(eg. 10)\n# set up a link if you save data anywhere\nln -s /path/to/savedata mvs_objaverse\n# caption file\nwget https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/aigc3d/text_captions_cap3d.json\n```\n\n## Training\n### Training Normal-Depth-VAE Model\n1. Download [pretrained-VAE weights](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd-vae-imgnet.ckpt) pretrained on ImageNet.\n2. Modify the config file in `configs/autoencoder_normal_depth/autoencoder_normal_depth.yaml`, set `model.ckpt_path=/path/to/pretained-VAE weights`\n\n```bash\n# training  VAE datasets\nbash ./scripts/train_vae/train_nd_vae/train_rgbd_vae_webdatasets.sh \\ model.params.ckpt_path=${pretained-VAE weights} \\\ndata.params.train.params.curls='path_laion/{00000..${:5 end_id}}.tar' \\\n--gpus 0,1,2,3,4,5,6,7\n```\n\n### Training Normal-Depth-Diffusion Model\nAfter training and get `Normal-Depth-VAE` Model or you could download it from [ND-VAE](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd-vae-laion.ckpt)\n\n```bash\n# step 1\nexport SD-MODEL-PATH=/path/to/sd-1.5\nbash scripts/train_normald_sd/txt_cond/web_datasets/train_normald_webdatasets.sh --gpus 0,1,2,3,4,5,6,7 \\\n    model.params.first_stage_ckpts=${Normal-Depth-VAE} model.params.ckpt_path=${SD-MODEL-PATH} \\\n    data.params.train.params.curls='path_laion/{00000..${:5 end_id}}.tar'\n\n# step 2 modify your step_weights path in ./configs/stable-diffusion/normald/sd_1_5/txt_cond/web_datasets/laion_2b_step2.yaml\nbash scripts/train_normald_sd/txt_cond/web_datasets/train_normald_webdatasets_step2.sh --gpus 0,1,2,3,4,5,6,7 \\\n    model.params.first_stage_ckpts=${Normal-Depth-VAE} \\\n    model.params.ckpt_path=${pretrained-step-weights} \\\n    data.params.train.params.curls='path_laion/{00000..${:5 end_id}}.tar'\n```\n\n### Training MultiView-Normal-Depth-Diffusion Model\nAfter training and get `Normal-Depth-Diffusion` Model or you could download it from [ND](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd-laion.ckpt),\n\nWe provide two versions of MultiView-Normal-Depth Diffusion Model\n\na. without VAE Denoise\nb. with VAE Denoise\n\nIn current version, we provide w/o VAE denoise\n\n```bash\n# a. Training Without VAE version\nbash ./scripts/train_normald_sd/txt_cond/objaverse/objaverse_finetune_wovae_mvsd-4.sh --gpus 0,1,2,3,4,5,6,7,  \\\n    model.params.ckpt_path=${Normal-Depth-Diffusion}\n# b. Training with VAE version\nbash ./scripts/train_normald_sd/txt_cond/objaverse/objaverse_finetune_mvsd-4.sh --gpus 0,1,2,3,4,5,6,7, \\\n    model.params.ckpt_path=${Normal-Depth-Diffusion}\n\n```\n\n### Training MultiView-Depth-Conditioned-Albedo-Diffusion Model\nAfter training and get `Normal-Depth-Diffusion` Model or you could download it from [ND](https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/RichDreamer/nd-laion.ckpt),\n\n```bash\nbash scripts/train_abledo/objaverse/objaverse_finetune_mvsd-4.sh --gpus 0,1,2,3,4,5,6,7, model.params.ckpt_path=${Normal-Depth-Diffusion}\n```\n\n## Acknowledgement\nWe have intensively borrow codes from the following repositories. Many thanks to the authors for sharing their codes.\n- [stable diffusion](https://github.com/CompVis/stable-diffusion)\n- [mvdream](https://github.com/bytedance/MVDream)\n\n## Citation\t\n\n```\n@article{qiu2023richdreamer,\n    title={RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D}, \n    author={Lingteng Qiu and Guanying Chen and Xiaodong Gu and Qi zuo and Mutian Xu and Yushuang Wu and Weihao Yuan and Zilong Dong and Liefeng Bo and Xiaoguang Han},\n    year={2023},\n    journal = {arXiv preprint arXiv:2311.16918}\n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodelscope%2Fnormal-depth-diffusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmodelscope%2Fnormal-depth-diffusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodelscope%2Fnormal-depth-diffusion/lists"}