{"id":22310581,"url":"https://github.com/mihaidusmanu/d2-net","last_synced_at":"2025-05-16T18:10:28.711Z","repository":{"id":41176234,"uuid":"179720496","full_name":"mihaidusmanu/d2-net","owner":"mihaidusmanu","description":"D2-Net: A Trainable CNN for Joint Description and Detection of Local Features","archived":false,"fork":false,"pushed_at":"2024-04-08T08:16:22.000Z","size":2398,"stargazers_count":817,"open_issues_count":11,"forks_count":168,"subscribers_count":24,"default_branch":"master","last_synced_at":"2025-04-12T17:46:15.731Z","etag":null,"topics":["cnn","cvpr2019","local-features","pytorch","visual-localization"],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mihaidusmanu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-05T16:48:58.000Z","updated_at":"2025-04-08T07:51:39.000Z","dependencies_parsed_at":"2024-04-08T09:34:19.924Z","dependency_job_id":"4857ac46-b7d9-43e2-96da-64179a05f50a","html_url":"https://github.com/mihaidusmanu/d2-net","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mihaidusmanu%2Fd2-net","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mihaidusmanu%2Fd2-net/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mihaidusmanu%2Fd2-net/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mihaidusmanu%2Fd2-net/manifests","o
wner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mihaidusmanu","download_url":"https://codeload.github.com/mihaidusmanu/d2-net/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254582907,"owners_count":22095518,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cnn","cvpr2019","local-features","pytorch","visual-localization"],"created_at":"2024-12-03T21:05:30.851Z","updated_at":"2025-05-16T18:10:28.694Z","avatar_url":"https://github.com/mihaidusmanu.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# D2-Net: A Trainable CNN for Joint Detection and Description of Local Features\n\nThis repository contains the implementation of the following paper:\n\n```text\n\"D2-Net: A Trainable CNN for Joint Detection and Description of Local Features\".\nM. Dusmanu, I. Rocco, T. Pajdla, M. Pollefeys, J. Sivic, A. Torii, and T. Sattler. CVPR 2019.\n```\n\n[Paper on arXiv](https://arxiv.org/abs/1905.03561), [Project page](https://dusmanu.com/publications/d2-net.html)\n    \n## Getting started\n\nPython 3.6+ is recommended for running our code. 
[Conda](https://docs.conda.io/en/latest/) can be used to install the required packages:\n\n```bash\nconda install pytorch torchvision cudatoolkit=10.0 -c pytorch\nconda install h5py imageio imagesize matplotlib numpy scipy tqdm\n```\n\n## Downloading the models\n\nThe off-the-shelf **Caffe VGG16** weights and their tuned counterpart can be downloaded by running:\n\n```bash\nmkdir models\nwget https://dusmanu.com/files/d2-net/d2_ots.pth -O models/d2_ots.pth\nwget https://dusmanu.com/files/d2-net/d2_tf.pth -O models/d2_tf.pth\nwget https://dusmanu.com/files/d2-net/d2_tf_no_phototourism.pth -O models/d2_tf_no_phototourism.pth\n```\n\n**Update - 23 May 2019** We have added a new set of weights trained on MegaDepth without the PhotoTourism scenes (sagrada_familia - 0019, lincoln_memorial_statue - 0021, british_museum - 0024, london_bridge - 0025, us_capitol - 0078, mount_rushmore - 1589). Our initial results show similar performance. In order to use these weights at test time, you should add `--model_file models/d2_tf_no_phototourism.pth`.\n\n## Feature extraction\n\n`extract_features.py` can be used to extract D2 features for a given list of images. The singlescale features require less than 6GB of VRAM for 1200x1600 images. The `--multiscale` flag can be used to extract multiscale features - for this, we recommend at least 12GB of VRAM. \n\nThe output format can be either [`npz`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html) or `mat`. In either case, the feature files encapsulate three arrays: \n\n- `keypoints` [`N x 3`] array containing the positions of keypoints `x, y` and the scales `s`. 
The positions follow the COLMAP format, with the `X` axis pointing to the right and the `Y` axis to the bottom.\n- `scores` [`N`] array containing the activations of keypoints (higher is better).\n- `descriptors` [`N x 512`] array containing the L2 normalized descriptors.\n\n```bash\npython extract_features.py --image_list_file images.txt (--multiscale)\n```\n\n## Feature extraction with kapture datasets\n\nKapture is a pivot file format, based on text and binary files, used to describe SfM (Structure from Motion) and more generally sensor-acquired data.\n\nIt is available at https://github.com/naver/kapture.\nIt contains conversion tools for popular formats and several popular datasets are directly available in kapture.\n\nIt can be installed with:\n```bash\npip install kapture\n```\n\nDatasets can be downloaded with:\n```bash\nkapture_download_dataset.py update\nkapture_download_dataset.py list\n# e.g.: install mapping and query of Extended-CMU-Seasons_slice22\nkapture_download_dataset.py install \"Extended-CMU-Seasons_slice22_*\"\n```\nIf you want to convert your own dataset into kapture, please find some examples [here](https://github.com/naver/kapture/blob/master/doc/datasets.adoc).\n\nOnce installed, you can extract keypoints for your kapture dataset with:\n```bash\npython extract_kapture.py --kapture-root pathto/yourkapturedataset (--multiscale)\n```\n\nRun `python extract_kapture.py --help` for more information on the extraction parameters. 
\n\n## Tuning on MegaDepth\n\nThe training pipeline provided here is a PyTorch implementation of the TensorFlow code that was used to train the model available to download above.\n\n**Update - 05 June 2019** We have fixed a bug in the dataset preprocessing - retraining now yields similar results to the original TensorFlow implementation.\n\n**Update - 07 August 2019** We have released an updated, more accurate version of the training dataset - training is more stable and significantly faster for equal performance.\n\n### Downloading and preprocessing the MegaDepth dataset\n\nFor this part, [COLMAP](https://colmap.github.io/) should be installed. Please refer to the official website for installation instructions.\n\nAfter downloading the entire [MegaDepth](http://www.cs.cornell.edu/projects/megadepth/) dataset (including SfM models), the first step is generating the undistorted reconstructions. This can be done by calling `undistort_reconstructions.py` as follows:\n\n```bash\npython undistort_reconstructions.py --colmap_path /path/to/colmap/executable --base_path /path/to/megadepth\n```\n\nNext, `preprocess_undistorted_megadepth.sh` can be used to retrieve the camera parameters and compute the overlap between images for all scenes. 
\n\n```bash\nbash preprocess_undistorted_megadepth.sh /path/to/megadepth /path/to/output/folder\n```\n\n### Training\n\nAfter downloading and preprocessing MegaDepth, the training can be started right away:\n\n```bash\npython train.py --use_validation --dataset_path /path/to/megadepth --scene_info_path /path/to/preprocessing/output\n```\n\n## BibTeX\n\nIf you use this code in your project, please cite the following paper:\n\n```bibtex\n@InProceedings{Dusmanu2019CVPR,\n    author = {Dusmanu, Mihai and Rocco, Ignacio and Pajdla, Tomas and Pollefeys, Marc and Sivic, Josef and Torii, Akihiko and Sattler, Torsten},\n    title = {{D2-Net: A Trainable CNN for Joint Detection and Description of Local Features}},\n    booktitle = {Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition},\n    year = {2019},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmihaidusmanu%2Fd2-net","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmihaidusmanu%2Fd2-net","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmihaidusmanu%2Fd2-net/lists"}