{"id":15034631,"url":"https://github.com/isl-org/zoedepth","last_synced_at":"2025-04-14T11:20:50.297Z","repository":{"id":91349206,"uuid":"565837677","full_name":"isl-org/ZoeDepth","owner":"isl-org","description":"Metric depth estimation from a single image","archived":false,"fork":false,"pushed_at":"2024-05-03T17:03:49.000Z","size":4194,"stargazers_count":2545,"open_issues_count":98,"forks_count":231,"subscribers_count":36,"default_branch":"main","last_synced_at":"2025-04-07T02:18:24.436Z","etag":null,"topics":["adaptive-bins","deep-learning","depth-estimation","metric-depth-estimation","monocular-depth-estimation","neural-networks","pretrained-models","transformers","zero-shot-transfer"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/isl-org.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-14T12:40:17.000Z","updated_at":"2025-04-05T14:35:16.000Z","dependencies_parsed_at":"2024-05-12T22:35:49.111Z","dependency_job_id":"f069ef60-8ca6-47cc-9cfd-30936b246b29","html_url":"https://github.com/isl-org/ZoeDepth","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FZoeDepth","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FZoeDepth/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FZoeDepth/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/isl-org%2FZoeDepth/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/isl-org","download_url":"https://codeload.github.com/isl-org/ZoeDepth/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248868767,"owners_count":21174758,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adaptive-bins","deep-learning","depth-estimation","metric-depth-estimation","monocular-depth-estimation","neural-networks","pretrained-models","transformers","zero-shot-transfer"],"created_at":"2024-09-24T20:25:47.727Z","updated_at":"2025-04-14T11:20:50.274Z","avatar_url":"https://github.com/isl-org.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **ZoeDepth: Combining relative and metric depth** (Official implementation)  \u003c!-- omit in toc --\u003e\n[![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/isl-org/ZoeDepth)\n[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/raw/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/shariqfarooq/ZoeDepth)\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT) ![PyTorch](https://img.shields.io/badge/PyTorch_v1.10.1-EE4C2C?\u0026logo=pytorch\u0026logoColor=white) 
\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/zoedepth-zero-shot-transfer-by-combining/monocular-depth-estimation-on-nyu-depth-v2)](https://paperswithcode.com/sota/monocular-depth-estimation-on-nyu-depth-v2?p=zoedepth-zero-shot-transfer-by-combining)\n\n\u003e#### [ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth](https://arxiv.org/abs/2302.12288)\n\u003e ##### [Shariq Farooq Bhat](https://shariqfarooq123.github.io), [Reiner Birkl](https://www.researchgate.net/profile/Reiner-Birkl), [Diana Wofk](https://dwofk.github.io/), [Peter Wonka](http://peterwonka.net/), [Matthias Müller](https://matthias.pw/)\n\n[[Paper]](https://arxiv.org/abs/2302.12288)\n\n![teaser](assets/zoedepth-teaser.png)\n\n## **Table of Contents** \u003c!-- omit in toc --\u003e\n- [**Usage**](#usage)\n  - [Using torch hub](#using-torch-hub)\n  - [Using local copy](#using-local-copy)\n    - [Using local torch hub](#using-local-torch-hub)\n    - [or load the models manually](#or-load-the-models-manually)\n  - [Using ZoeD models to predict depth](#using-zoed-models-to-predict-depth)\n- [**Environment setup**](#environment-setup)\n- [**Sanity checks** (Recommended)](#sanity-checks-recommended)\n- [Model files](#model-files)\n- [**Evaluation**](#evaluation)\n  - [Evaluating offical models](#evaluating-offical-models)\n  - [Evaluating local checkpoint](#evaluating-local-checkpoint)\n- [**Training**](#training)\n- [**Gradio demo**](#gradio-demo)\n- [**Citation**](#citation)\n\n\n## **Usage**\nIt is recommended to fetch the latest [MiDaS repo](https://github.com/isl-org/MiDaS) via torch hub before proceeding:\n```python\nimport torch\n\ntorch.hub.help(\"intel-isl/MiDaS\", \"DPT_BEiT_L_384\", force_reload=True)  # Triggers fresh download of MiDaS repo\n```\n### **ZoeDepth models** \u003c!-- omit in toc --\u003e\n### Using torch hub\n```python\nimport torch\n\nrepo = \"isl-org/ZoeDepth\"\n# Zoe_N\nmodel_zoe_n = torch.hub.load(repo, \"ZoeD_N\", 
pretrained=True)\n\n# Zoe_K\nmodel_zoe_k = torch.hub.load(repo, \"ZoeD_K\", pretrained=True)\n\n# Zoe_NK\nmodel_zoe_nk = torch.hub.load(repo, \"ZoeD_NK\", pretrained=True)\n```\n### Using local copy\nClone this repo:\n```bash\ngit clone https://github.com/isl-org/ZoeDepth.git \u0026\u0026 cd ZoeDepth\n```\n#### Using local torch hub\nYou can use local source for torch hub to load the ZoeDepth models, for example: \n```python\nimport torch\n\n# Zoe_N\nmodel_zoe_n = torch.hub.load(\".\", \"ZoeD_N\", source=\"local\", pretrained=True)\n```\n\n#### or load the models manually\n```python\nfrom zoedepth.models.builder import build_model\nfrom zoedepth.utils.config import get_config\n\n# ZoeD_N\nconf = get_config(\"zoedepth\", \"infer\")\nmodel_zoe_n = build_model(conf)\n\n# ZoeD_K\nconf = get_config(\"zoedepth\", \"infer\", config_version=\"kitti\")\nmodel_zoe_k = build_model(conf)\n\n# ZoeD_NK\nconf = get_config(\"zoedepth_nk\", \"infer\")\nmodel_zoe_nk = build_model(conf)\n```\n\n### Using ZoeD models to predict depth \n```python\n##### sample prediction\nDEVICE = \"cuda\" if torch.cuda.is_available() else \"cpu\"\nzoe = model_zoe_n.to(DEVICE)\n\n\n# Local file\nfrom PIL import Image\nimage = Image.open(\"/path/to/image.jpg\").convert(\"RGB\")  # load\ndepth_numpy = zoe.infer_pil(image)  # as numpy\n\ndepth_pil = zoe.infer_pil(image, output_type=\"pil\")  # as 16-bit PIL Image\n\ndepth_tensor = zoe.infer_pil(image, output_type=\"tensor\")  # as torch tensor\n\n\n\n# Tensor \nfrom zoedepth.utils.misc import pil_to_batched_tensor\nX = pil_to_batched_tensor(image).to(DEVICE)\ndepth_tensor = zoe.infer(X)\n\n\n\n# From URL\nfrom zoedepth.utils.misc import get_image_from_url\n\n# Example URL\nURL = \"https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcS4W8H_Nxk_rs3Vje_zj6mglPOH7bnPhQitBH8WkqjlqQVotdtDEG37BsnGofME3_u6lDk\u0026usqp=CAU\"\n\n\nimage = get_image_from_url(URL)  # fetch\ndepth = zoe.infer_pil(image)\n\n# Save raw\nfrom zoedepth.utils.misc import 
save_raw_16bit\nfpath = \"/path/to/output.png\"\nsave_raw_16bit(depth, fpath)\n\n# Colorize output\nfrom zoedepth.utils.misc import colorize\n\ncolored = colorize(depth)\n\n# save colored output\nfpath_colored = \"/path/to/output_colored.png\"\nImage.fromarray(colored).save(fpath_colored)\n```\n\n## **Environment setup**\nThe project depends on:\n- [pytorch](https://pytorch.org/) (Main framework)\n- [timm](https://timm.fast.ai/) (Backbone helper for MiDaS)\n- pillow, matplotlib, scipy, h5py, opencv (utilities)\n\nInstall the environment using `environment.yml`:\n\nUsing [mamba](https://github.com/mamba-org/mamba) (fastest):\n```bash\nmamba env create -n zoe --file environment.yml\nmamba activate zoe\n```\nUsing conda:\n\n```bash\nconda env create -n zoe --file environment.yml\nconda activate zoe\n```\n\n## **Sanity checks** (Recommended)\nCheck that the models can be loaded:\n```bash\npython sanity_hub.py\n```\nTry a demo prediction pipeline:\n```bash\npython sanity.py\n```\nThis will save a file `pred.png` in the root folder, showing the RGB image and the corresponding predicted depth side by side.\n## Model files\nModels are defined under the `models/` folder, with `models/\u003cmodel_name\u003e_\u003cversion\u003e.py` containing the model definitions and `models/config_\u003cmodel_name\u003e.json` containing the configuration.\n\nThe single metric head models (Zoe_N and Zoe_K from the paper) share a common definition under `models/zoedepth`, whereas the multi-headed model (Zoe_NK) is defined under `models/zoedepth_nk`.\n## **Evaluation**\nDownload the required dataset and change the `DATASETS_CONFIG` dictionary in `utils/config.py` accordingly. 
\n### Evaluating offical models\nFor example, on NYU-Depth-v2:\n\nFor ZoeD_N:\n```bash\npython evaluate.py -m zoedepth -d nyu\n```\n\nFor ZoeD_NK:\n```bash\npython evaluate.py -m zoedepth_nk -d nyu\n```\n\n### Evaluating local checkpoint\n```bash\npython evaluate.py -m zoedepth --pretrained_resource=\"local::/path/to/local/ckpt.pt\" -d nyu\n```\nPretrained resources are prefixed with `url::` to indicate that the weights should be fetched from a URL, or with `local::` to indicate that the path is a local file. Refer to `models/model_io.py` for details.\n\nThe dataset name should match the corresponding key in `utils.config.DATASETS_CONFIG`.\n\n## **Training**\nDownload the training datasets as per the instructions given [here](https://github.com/cleinc/bts/tree/master/pytorch#nyu-depvh-v2). Then, to train a single-head model on NYU-Depth-v2:\n```bash\npython train_mono.py -m zoedepth --pretrained_resource=\"\"\n```\n\nTo train the ZoeD_NK model:\n```bash\npython train_mix.py -m zoedepth_nk --pretrained_resource=\"\"\n```\n## **Gradio demo**\nWe provide a UI demo built using [gradio](https://gradio.app/). 
To get started, install the UI requirements:\n```bash\npip install -r ui/ui_requirements.txt\n```\nThen launch the gradio UI:\n```bash\npython -m ui.app\n```\n\nThe UI is also hosted on HuggingFace🤗 [here](https://huggingface.co/spaces/shariqfarooq/ZoeDepth).\n## **Citation**\n```\n@misc{https://doi.org/10.48550/arxiv.2302.12288,\n  doi = {10.48550/ARXIV.2302.12288},\n  url = {https://arxiv.org/abs/2302.12288},\n  author = {Bhat, Shariq Farooq and Birkl, Reiner and Wofk, Diana and Wonka, Peter and Müller, Matthias},\n  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},\n  title = {ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth},\n  publisher = {arXiv},\n  year = {2023},\n  copyright = {arXiv.org perpetual, non-exclusive license}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisl-org%2Fzoedepth","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fisl-org%2Fzoedepth","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisl-org%2Fzoedepth/lists"}