{"id":13493606,"url":"https://github.com/prs-eth/Marigold","last_synced_at":"2025-03-28T12:31:58.545Z","repository":{"id":210794602,"uuid":"724328776","full_name":"prs-eth/Marigold","owner":"prs-eth","description":"[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation","archived":false,"fork":false,"pushed_at":"2024-12-14T13:19:11.000Z","size":5798,"stargazers_count":2574,"open_issues_count":6,"forks_count":150,"subscribers_count":42,"default_branch":"main","last_synced_at":"2025-03-04T19:01:52.726Z","etag":null,"topics":["diffusion","in-the-wild","monocular-depth-estimation","zero-shot"],"latest_commit_sha":null,"homepage":"https://marigoldmonodepth.github.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/prs-eth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-27T21:25:00.000Z","updated_at":"2025-03-04T09:28:20.000Z","dependencies_parsed_at":"2024-01-03T16:30:11.445Z","dependency_job_id":"52ab9a63-ffd9-45c6-a60a-9070824ead2d","html_url":"https://github.com/prs-eth/Marigold","commit_stats":{"total_commits":91,"total_committers":5,"mean_commits":18.2,"dds":"0.25274725274725274","last_synced_commit":"62413d56099d36573b2de1eb8c429839734b7782"},"previous_names":["prs-eth/marigold"],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/prs-eth%2FMarigold","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p
rs-eth%2FMarigold/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/prs-eth%2FMarigold/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/prs-eth%2FMarigold/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/prs-eth","download_url":"https://codeload.github.com/prs-eth/Marigold/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245868294,"owners_count":20685607,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion","in-the-wild","monocular-depth-estimation","zero-shot"],"created_at":"2024-07-31T19:01:17.061Z","updated_at":"2025-03-28T12:31:58.508Z","avatar_url":"https://github.com/prs-eth.png","language":"Python","funding_links":[],"categories":["Python","Image Generation \u0026 Editing"],"sub_categories":[],"readme":"# Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation\n\n**CVPR 2024 (Oral, Best Paper Award Candidate)**\n\nThis repository represents the official implementation of the paper titled \"Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation\".\n\n[![Website](doc/badges/badge-website.svg)](https://marigoldmonodepth.github.io)\n[![Paper](https://img.shields.io/badge/arXiv-PDF-b31b1b)](https://arxiv.org/abs/2312.02145)\n[![Hugging Face (LCM) Space](https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Space-yellow)](https://huggingface.co/spaces/prs-eth/marigold-lcm)\n[![Hugging Face (LCM) 
Model](https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Model-green)](https://huggingface.co/prs-eth/marigold-lcm-v1-0)\n[![Open In Colab](doc/badges/badge-colab.svg)](https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing)\n[![License](https://img.shields.io/badge/License-Apache--2.0-929292)](https://www.apache.org/licenses/LICENSE-2.0)\n\u003c!-- [![Hugging Face Model](https://img.shields.io/badge/🤗%20Hugging%20Face-Model-green)](https://huggingface.co/prs-eth/marigold-v1-0) --\u003e\n\u003c!-- [![Website](https://img.shields.io/badge/Project-Website-1081c2)](https://arxiv.org/abs/2312.02145) --\u003e\n\u003c!-- [![GitHub](https://img.shields.io/github/stars/prs-eth/Marigold?style=default\u0026label=GitHub%20★\u0026logo=github)](https://github.com/prs-eth/Marigold) --\u003e\n\u003c!-- [![HF Space](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-blue)]() --\u003e\n\u003c!-- [![Docker](doc/badges/badge-docker.svg)]() --\u003e\n\n[Bingxin Ke](http://www.kebingxin.com/),\n[Anton Obukhov](https://www.obukhov.ai/),\n[Shengyu Huang](https://shengyuh.github.io/),\n[Nando Metzger](https://nandometzger.github.io/),\n[Rodrigo Caye Daudt](https://rcdaudt.github.io/),\n[Konrad Schindler](https://scholar.google.com/citations?user=FZuNgqIAAAAJ\u0026hl=en )\n\nWe present Marigold, a diffusion model, and associated fine-tuning protocol for monocular depth estimation. Its core principle is to leverage the rich visual knowledge stored in modern generative image models. 
Our model, derived from Stable Diffusion and fine-tuned with synthetic data, can zero-shot transfer to unseen data, offering state-of-the-art monocular depth estimation results.\n\n![teaser](doc/teaser_collage_transparant.png)\n\n## 📢 News\n2024-05-28: Training code is released.\u003cbr\u003e\n2024-03-23: Added [LCM v1.0](https://huggingface.co/prs-eth/marigold-lcm-v1-0) for faster inference - try it out at \u003ca href=\"https://huggingface.co/spaces/prs-eth/marigold-lcm\"\u003e\u003cimg src=\"https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Space-yellow\" height=\"16\"\u003e\u003c/a\u003e\u003cbr\u003e\n2024-03-04: Accepted to CVPR 2024. \u003cbr\u003e\n2023-12-22: Contributed to Diffusers [community pipeline](https://github.com/huggingface/diffusers/tree/main/examples/community#marigold-depth-estimation). \u003cbr\u003e\n2023-12-19: Updated [license](LICENSE.txt) to Apache License, Version 2.0.\u003cbr\u003e\n2023-12-08: Added\n\u003ca href=\"https://huggingface.co/spaces/toshas/marigold\"\u003e\u003cimg src=\"https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow\" height=\"16\"\u003e\u003c/a\u003e - try it out with your images for free!\u003cbr\u003e\n2023-12-05: Added \u003ca href=\"https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing\"\u003e\u003cimg src=\"doc/badges/badge-colab.svg\" height=\"16\"\u003e\u003c/a\u003e - dive deeper into our inference pipeline!\u003cbr\u003e\n2023-12-04: Added \u003ca href=\"https://arxiv.org/abs/2312.02145\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-PDF-b31b1b\" height=\"16\"\u003e\u003c/a\u003e\npaper and inference code (this repository).\n\n## 🚀 Usage\n\n**We offer several ways to interact with Marigold**:\n\n1. We integrated [Marigold Pipelines into diffusers 🧨](https://huggingface.co/docs/diffusers/api/pipelines/marigold). 
Check out many exciting usage scenarios in [this diffusers tutorial](https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage).\n\n1. A free online interactive demo is available here: \u003ca href=\"https://huggingface.co/spaces/prs-eth/marigold-lcm\"\u003e\u003cimg src=\"https://img.shields.io/badge/🤗%20Hugging%20Face%20(LCM)-Space-yellow\" height=\"16\"\u003e\u003c/a\u003e (kudos to the HF team for the GPU grant)\n\n1. Run the demo locally (requires a GPU and an `nvidia-docker2`, see [Installation Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)):\n    1. Paper version: `docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all registry.hf.space/toshas-marigold:latest python app.py`\n    1. LCM version: `docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all registry.hf.space/prs-eth-marigold-lcm:latest python app.py`\n\n1. Extended demo on a Google Colab: \u003ca href=\"https://colab.research.google.com/drive/12G8reD13DdpMie5ZQlaFNo2WCGeNUH-u?usp=sharing\"\u003e\u003cimg src=\"doc/badges/badge-colab.svg\" height=\"16\"\u003e\u003c/a\u003e\n\n1. If you just want to see the examples, visit our gallery: \u003ca href=\"https://marigoldmonodepth.github.io\"\u003e\u003cimg src=\"doc/badges/badge-website.svg\" height=\"16\"\u003e\u003c/a\u003e\n\n1. Finally, local development instructions with this codebase are given below.\n\n## 🛠️ Setup\n\nThe inference code was tested on:\n\n- Ubuntu 22.04 LTS, Python 3.10.12,  CUDA 11.7, GeForce RTX 3090 (pip)\n\n### 🪧 A Note for Windows users\n\nWe recommend running the code in WSL2:\n\n1. Install WSL following [installation guide](https://learn.microsoft.com/en-us/windows/wsl/install#install-wsl-command).\n1. Install CUDA support for WSL following [installation guide](https://docs.nvidia.com/cuda/wsl-user-guide/index.html#cuda-support-for-wsl-2).\n1. 
Find your drives in `/mnt/\u003cdrive letter\u003e/`; check [WSL FAQ](https://learn.microsoft.com/en-us/windows/wsl/faq#how-do-i-access-my-c--drive-) for more details. Navigate to the working directory of choice. \n\n### 📦 Repository\n\nClone the repository (requires git):\n\n```bash\ngit clone https://github.com/prs-eth/Marigold.git\ncd Marigold\n```\n\n### 💻 Dependencies\n\nWe provide several ways to install the dependencies.\n\n1. **Using [Mamba](https://github.com/mamba-org/mamba)**, which can be installed together with [Miniforge3](https://github.com/conda-forge/miniforge?tab=readme-ov-file#miniforge3). \n\n    Windows users: Install the Linux version into the WSL.\n\n    After the installation, Miniforge needs to be activated first: `source /home/$USER/miniforge3/bin/activate`.\n\n    Create the environment and install dependencies into it:\n\n    ```bash\n    mamba env create -n marigold --file environment.yaml\n    conda activate marigold\n    ```\n\n2. **Using pip:** \n    Alternatively, create a Python native virtual environment and install dependencies into it:\n\n    ```bash\n    python -m venv venv/marigold\n    source venv/marigold/bin/activate\n    pip install -r requirements.txt\n    ```\n\nKeep the environment activated before running the inference script. \nActivate the environment again after restarting the terminal session.\n\n## 🏃 Testing on your images\n\n### 📷 Prepare images\n\n1. Use selected images from our paper:\n\n    ```bash\n    bash script/download_sample_data.sh\n    ```\n\n1. Or place your images in a directory, for example, under `input/in-the-wild_example`, and run the following inference command.\n\n### 🚀 Run inference with LCM (faster)\n\nThe [LCM checkpoint](https://huggingface.co/prs-eth/marigold-lcm-v1-0) is distilled from our original checkpoint towards faster inference speed (by reducing inference steps). The inference steps can be as few as 1 (default) to 4. 
Run with default LCM setting:\n\n```bash\n python run.py \\\n     --input_rgb_dir input/in-the-wild_example \\\n     --output_dir output/in-the-wild_example_lcm\n ```\n\n### 🎮 Run inference with DDIM (paper setting)\n\nThis setting corresponds to our paper. For academic comparison, please run with this setting.\n\n```bash\npython run.py \\\n    --checkpoint prs-eth/marigold-v1-0 \\\n    --denoise_steps 50 \\\n    --ensemble_size 10 \\\n    --input_rgb_dir input/in-the-wild_example \\\n    --output_dir output/in-the-wild_example\n```\n\nYou can find all results in `output/in-the-wild_example`. Enjoy!\n\n### ⚙️ Inference settings\n\nThe default settings are optimized for the best result. However, the behavior of the code can be customized:\n\n- Trade-offs between the **accuracy** and **speed** (for both options, larger values result in better accuracy at the cost of slower inference.)\n  - `--ensemble_size`: Number of inference passes in the ensemble. For LCM `ensemble_size` is more important than `denoise_steps`. Default: ~~10~~ 5 (for LCM).\n  - `--denoise_steps`: Number of denoising steps of each inference pass. For the original (DDIM) version, it's recommended to use 10-50 steps, while for LCM 1-4 steps. When unassigned (`None`), will read default setting from model config. Default: ~~10 4 (for LCM)~~ `None`.\n\n- By default, the inference script resizes input images to the *processing resolution*, and then resizes the prediction back to the original resolution. This gives the best quality, as Stable Diffusion, from which Marigold is derived, performs best at 768x768 resolution.  \n  \n  - `--processing_res`: the processing resolution; set as 0 to process the input resolution directly. When unassigned (`None`), will read default setting from model config. Default: ~~768~~ `None`.\n  - `--output_processing_res`: produce output at the processing resolution instead of upsampling it to the input resolution. 
Default: False.\n  - `--resample_method`: the resampling method used to resize images and depth predictions. This can be one of `bilinear`, `bicubic`, or `nearest`. Default: `bilinear`.\n\n- `--half_precision` or `--fp16`: Run with half-precision (16-bit float) to have faster speed and reduced VRAM usage, but might lead to suboptimal results.\n- `--seed`: Random seed can be set to ensure additional reproducibility. Default: None (unseeded). Note: forcing `--batch_size 1` helps to increase reproducibility. To ensure full reproducibility, [deterministic mode](https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms) needs to be used.\n- `--batch_size`: Batch size of repeated inference. Default: 0 (best value determined automatically).\n- `--color_map`: [Colormap](https://matplotlib.org/stable/users/explain/colors/colormaps.html) used to colorize the depth prediction. Default: Spectral. Set to `None` to skip colored depth map generation.\n- `--apple_silicon`: Use Apple Silicon MPS acceleration.\n\n### ⬇ Checkpoint cache\n\nBy default, the [checkpoint](https://huggingface.co/prs-eth/marigold-v1-0) is stored in the Hugging Face cache.\nThe `HF_HOME` environment variable defines its location and can be overridden, e.g.:\n\n```bash\nexport HF_HOME=$(pwd)/cache\n```\n\nAlternatively, use the following script to download the checkpoint weights locally:\n\n```bash\nbash script/download_weights.sh marigold-v1-0\n# or LCM checkpoint\nbash script/download_weights.sh marigold-lcm-v1-0\n```\n\nAt inference, specify the checkpoint path:\n\n```bash\npython run.py \\\n    --checkpoint checkpoint/marigold-v1-0 \\\n    --denoise_steps 50 \\\n    --ensemble_size 10 \\\n    --input_rgb_dir input/in-the-wild_example\\\n    --output_dir output/in-the-wild_example\n```\n\n## 🦿 Evaluation on test datasets \u003ca name=\"evaluation\"\u003e\u003c/a\u003e\n\nInstall additional dependencies:\n\n```bash\npip install -r requirements+.txt -r 
requirements.txt\n```\n\nSet data directory variable (also needed in evaluation scripts) and download [evaluation datasets](https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset) into corresponding subfolders:\n\n```bash\nexport BASE_DATA_DIR=\u003cYOUR_DATA_DIR\u003e  # Set target data directory\n\nwget -r -np -nH --cut-dirs=4 -R \"index.html*\" -P ${BASE_DATA_DIR} https://share.phys.ethz.ch/~pf/bingkedata/marigold/evaluation_dataset/\n```\n\nRun inference and evaluation scripts, for example:\n\n```bash\n# Run inference\nbash script/eval/11_infer_nyu.sh\n\n# Evaluate predictions\nbash script/eval/12_eval_nyu.sh\n```\n\nNote: although the seed has been set, the results might still be slightly different on different hardware.\n\n## 🏋️ Training\n\nBased on the previously created environment, install extended requirements:\n\n```bash\npip install -r requirements++.txt -r requirements+.txt -r requirements.txt\n```\n\nSet environment parameters for the data directory:\n\n```bash\nexport BASE_DATA_DIR=YOUR_DATA_DIR  # directory of training data\nexport BASE_CKPT_DIR=YOUR_CHECKPOINT_DIR  # directory of pretrained checkpoint\n```\n\nDownload Stable Diffusion v2 [checkpoint](https://huggingface.co/stabilityai/stable-diffusion-2) into `${BASE_CKPT_DIR}`\n\nPrepare for [Hypersim](https://github.com/apple/ml-hypersim) and [Virtual KITTI 2](https://europe.naverlabs.com/research/computer-vision/proxy-virtual-worlds-vkitti-2/) datasets and save into `${BASE_DATA_DIR}`. Please refer to [this README](script/dataset_preprocess/hypersim/README.md) for Hypersim preprocessing.\n\nRun training script\n\n```bash\npython train.py --config config/train_marigold.yaml\n```\n\nResume from a checkpoint, e.g.\n\n```bash\npython train.py --resume_run output/marigold_base/checkpoint/latest\n```\n\nEvaluating results\n\nOnly the U-Net is updated and saved during training. 
To use the inference pipeline with your training result, replace the `unet` folder in the Marigold checkpoint with the one from the `checkpoint` output folder. Then refer to [this section](#evaluation) for evaluation.\n\n**Note**: Although random seeds have been set, the training result might be slightly different on different hardware. It's recommended to train without interruption.\n\n## ✏️ Contributing\n\nPlease refer to [these instructions](CONTRIBUTING.md).\n\n## 🤔 Troubleshooting\n\n| Problem                                                                                                                                      | Solution                                                       |\n|----------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------|\n| (Windows) Invalid DOS bash script on WSL                                                                                                     | Run `dos2unix \u003cscript_name\u003e` to convert script format          |\n| (Windows) error on WSL: `Could not load library libcudnn_cnn_infer.so.8. 
Error: libcuda.so: cannot open shared object file: No such file or directory` | Run `export LD_LIBRARY_PATH=/usr/lib/wsl/lib:$LD_LIBRARY_PATH` |\n\n\n## 🎓 Citation\n\nPlease cite our paper:\n\n```bibtex\n@InProceedings{ke2023repurposing,\n      title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},\n      author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},\n      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},\n      year={2024}\n}\n```\n\n## 🎫 License\n\nThis work is licensed under the Apache License, Version 2.0 (as defined in the [LICENSE](LICENSE.txt)).\n\nBy downloading and using the code and model you agree to the terms in the  [LICENSE](LICENSE.txt).\n\n[![License](https://img.shields.io/badge/License-Apache--2.0-929292)](https://www.apache.org/licenses/LICENSE-2.0)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprs-eth%2FMarigold","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprs-eth%2FMarigold","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprs-eth%2FMarigold/lists"}