{"id":13577971,"url":"https://github.com/isl-org/DPT","last_synced_at":"2025-04-05T15:31:57.863Z","repository":{"id":40334171,"uuid":"350409920","full_name":"isl-org/DPT","owner":"isl-org","description":"Dense Prediction Transformers","archived":true,"fork":false,"pushed_at":"2024-12-18T16:24:13.000Z","size":437,"stargazers_count":2081,"open_issues_count":38,"forks_count":263,"subscribers_count":41,"default_branch":"main","last_synced_at":"2025-03-16T15:03:21.572Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/isl-org.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-22T16:17:15.000Z","updated_at":"2025-03-13T19:18:13.000Z","dependencies_parsed_at":"2023-01-28T16:16:07.139Z","dependency_job_id":"1bc3270c-4ac9-4ada-8808-3f7dd9b9fb05","html_url":"https://github.com/isl-org/DPT","commit_stats":{"total_commits":104,"total_committers":10,"mean_commits":10.4,"dds":0.2692307692307693,"last_synced_commit":"f43ef9e08d70a752195028a51be5e1aff227b913"},"previous_names":["intel-isl/dpt"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FDPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FDPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FDPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/isl-org%2FDPT/manifests","owner_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/owners/isl-org","download_url":"https://codeload.github.com/isl-org/DPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247358992,"owners_count":20926337,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T15:01:25.855Z","updated_at":"2025-04-05T15:31:52.853Z","avatar_url":"https://github.com/isl-org.png","language":"Python","funding_links":[],"categories":["Python","其他_机器视觉"],"sub_categories":["网络服务_其他"],"readme":"## Vision Transformers for Dense Prediction\n\nThis repository contains code and models for our [paper](https://arxiv.org/abs/2103.13413):\n\n\u003e Vision Transformers for Dense Prediction  \n\u003e René Ranftl, Alexey Bochkovskiy, Vladlen Koltun\n\n\n### Changelog \n* [March 2021] Initial release of inference code and models\n\n### Setup \n\n1) Download the model weights and place them in the `weights` folder:\n\n\nMonodepth:\n- [dpt_hybrid-midas-501f0c75.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid-midas-501f0c75.pt), [Mirror](https://drive.google.com/file/d/1dgcJEYYw1F8qirXhZxgNK8dWWz_8gZBD/view?usp=sharing)\n- [dpt_large-midas-2f21e586.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-midas-2f21e586.pt), [Mirror](https://drive.google.com/file/d/1vnuhoMc6caF-buQQ4hK0CeiMk9SjwB-G/view?usp=sharing)\n\nSegmentation:\n - [dpt_hybrid-ade20k-53898607.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid-ade20k-53898607.pt), 
[Mirror](https://drive.google.com/file/d/1zKIAMbltJ3kpGLMh6wjsq65_k5XQ7_9m/view?usp=sharing)\n - [dpt_large-ade20k-b12dca68.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_large-ade20k-b12dca68.pt), [Mirror](https://drive.google.com/file/d/1foDpUM7CdS8Zl6GPdkrJaAOjskb7hHe-/view?usp=sharing)\n  \n2) Set up dependencies: \n\n    ```shell\n    pip install -r requirements.txt\n    ```\n\n   The code was tested with Python 3.7, PyTorch 1.8.0, OpenCV 4.5.1, and timm 0.4.5.\n\n### Usage \n\n1) Place one or more input images in the folder `input`.\n\n2) Run a monocular depth estimation model:\n\n    ```shell\n    python run_monodepth.py\n    ```\n\n    Or run a semantic segmentation model:\n\n    ```shell\n    python run_segmentation.py\n    ```\n\n3) The results are written to the folders `output_monodepth` and `output_semseg`, respectively.\n\nUse the flag `-t` to switch between different models. Possible options are `dpt_hybrid` (default) and `dpt_large`.\n\n\n**Additional models:**\n\n- Monodepth finetuned on KITTI: [dpt_hybrid_kitti-cb926ef4.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid_kitti-cb926ef4.pt) [Mirror](https://drive.google.com/file/d/1-oJpORoJEdxj4LTV-Pc17iB-smp-khcX/view?usp=sharing)\n- Monodepth finetuned on NYUv2: [dpt_hybrid_nyu-2ce69ec7.pt](https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid_nyu-2ce69ec7.pt) [Mirror](https://drive.google.com/file/d/1NjiFw1Z9lUAfTPZu4uQ9gourVwvmd58O/view?usp=sharing)\n\nRun with \n\n```shell\npython run_monodepth.py -t [dpt_hybrid_kitti|dpt_hybrid_nyu] \n```\n\n### Evaluation\n\nHints on how to evaluate monodepth models can be found here: https://github.com/intel-isl/DPT/blob/main/EVALUATION.md\n\n\n### Citation\n\nPlease cite our papers if you use this code or any of the models. 
\n```\n@article{Ranftl2021,\n\tauthor    = {Ren\\'{e} Ranftl and Alexey Bochkovskiy and Vladlen Koltun},\n\ttitle     = {Vision Transformers for Dense Prediction},\n\tjournal   = {ArXiv preprint},\n\tyear      = {2021},\n}\n```\n\n```\n@article{Ranftl2020,\n\tauthor    = {Ren\\'{e} Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},\n\ttitle     = {Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer},\n\tjournal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)},\n\tyear      = {2020},\n}\n```\n\n### Acknowledgements\n\nOur work builds on and uses code from [timm](https://github.com/rwightman/pytorch-image-models) and [PyTorch-Encoding](https://github.com/zhanghang1989/PyTorch-Encoding). We'd like to thank the authors for making these libraries available.\n\n### License \n\nMIT License \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisl-org%2FDPT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fisl-org%2FDPT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fisl-org%2FDPT/lists"}