# GLACE: Global Local Accelerated Coordinate Encoding

----------------------------------------------------------------------------------------

This repository contains the code associated with the GLACE paper:
> **GLACE: Global Local Accelerated Coordinate Encoding**
>
> Fangjinhua Wang, Xudong Jiang, Silvano Galliani, Christoph Vogel, Marc Pollefeys
>
> CVPR 2024

For further information please visit:

- [Project page](https://xjiangan.github.io/glace)
- [arXiv](https://arxiv.org/abs/2406.04340)

## Installation

In your Python environment, install the required dependencies:
```shell
pip install -r requirements.txt
```
The code was tested on Linux with Python 3.10, PyTorch 2.2.2, and CUDA 11.8.

The GLACE network predicts dense 3D scene coordinates associated with the pixels of the input images.
To estimate the 6DoF camera poses, it relies on the RANSAC implementation of the DSAC* paper (Brachmann and Rother, TPAMI 2021), which is written in C++.
As such, you need to build and install the C++/Python bindings of those functions.
You can do this with:

```shell
cd dsacstar
python setup.py install
```

Having done the steps above, you are ready to experiment with GLACE!

## Datasets

The GLACE method has been evaluated using multiple published datasets:

- [Microsoft 7-Scenes](https://www.microsoft.com/en-us/research/project/rgb-d-dataset-7-scenes/)
- [Stanford 12-Scenes](https://graphics.stanford.edu/projects/reloc/)
- [Cambridge Landmarks](https://www.repository.cam.ac.uk/handle/1810/251342/)
- [Aachen Day-Night](https://www.visuallocalization.net/)

We provide scripts in the `datasets` folder to automatically download and extract the data in a format that can be readily used by the GLACE scripts.
The format is the same as the one used by the DSAC* codebase; see [here](https://github.com/vislearn/dsacstar#data-structure) for details.

> **Important: make sure you have checked the license terms of each dataset before using it.**

### {7, 12}-Scenes:

You can use the `datasets/setup_{7,12}scenes.py` scripts to download the data.
As mentioned in the paper, we experimented with two variants of each of these datasets: one using the original D-SLAM ground truth camera poses, and one using _Pseudo Ground Truth (PGT)_ camera poses obtained after running SfM on the scenes (see the [ICCV 2021 paper](https://openaccess.thecvf.com/content/ICCV2021/html/Brachmann_On_the_Limits_of_Pseudo_Ground_Truth_in_Visual_Camera_ICCV_2021_paper.html) and the [associated code](https://github.com/tsattler/visloc_pseudo_gt_limitations/) for details).

To download and prepare the datasets using the D-SLAM poses:

```shell
cd datasets
# Downloads the data to datasets/7scenes_{chess, fire, ...}
./setup_7scenes.py
# Downloads the data to datasets/12scenes_{apt1_kitchen, ...}
./setup_12scenes.py
```

To download and prepare the datasets using the PGT poses:

```shell
cd datasets
# Downloads the data to datasets/pgt_7scenes_{chess, fire, ...}
./setup_7scenes.py --poses pgt
# Downloads the data to datasets/pgt_12scenes_{apt1_kitchen, ...}
./setup_12scenes.py --poses pgt
```

### Cambridge Landmarks / Aachen Day-Night:

We used a single variant of these datasets. Simply run:

```shell
cd datasets
# Downloads the data to datasets/Cambridge_{GreatCourt, KingsCollege, ...}
./setup_cambridge.py
# Downloads the data to datasets/aachen
./setup_aachen.py
```

Note: The Aachen Day-Night dataset has no public test ground truth. The dataset script will create dummy ground truth in the form of identity camera poses. The actual pose evaluation has to be performed via the dataset website, the [Visual Localization Benchmark](https://www.visuallocalization.net/).

## Usage

### Global feature extraction

We use [R2Former](https://github.com/bytedance/R2Former) for global feature extraction. Please download the pre-trained checkpoint [CVPR23_DeitS_Rerank.pth](https://drive.google.com/file/d/1RU4wnupKXpmM0FiPeglqeNizBw4w6j38).
Run the following to extract the global features for all images in the dataset:

```shell
cd datasets
python extract_features.py <scene path> --checkpoint <path to the R2Former checkpoint>
```

### GLACE Training

The GLACE scene-specific coordinate regression head for a scene can be trained using the `train_ace.py` script.
Basic usage:

```shell
torchrun --standalone --nnodes <num nodes> --nproc-per-node <num gpus per node> \
  ./train_ace.py <scene path> <output map name>
# Example:
torchrun --standalone --nnodes 1 --nproc-per-node 1 \
  ./train_ace.py datasets/7scenes_chess output/7scenes_chess.pt
```

The output map file contains just the weights of the scene-specific head network -- encoded as half-precision floating point -- for a size of ~9MB when using default options, as mentioned in the paper. The testing script will use these weights, together with the scene-agnostic pretrained encoder (`ace_encoder_pretrained.pt`), to estimate 6DoF poses for the query images.

**Additional parameters** that can be passed to the training script to alter its behavior:

- `--training_buffer_size`: Changes the size of the training buffer containing decorrelated image features (see paper), which is created at the beginning of the training process. The default size is 16M.
- `--samples_per_image`: How many features to sample from each image during the buffer generation phase. This affects the amount of time necessary to fill the training buffer, but also the amount of decorrelation among the features present in the buffer. The default is 1024 samples per image.
- `--max_iterations`: How many training iterations are performed. This directly affects the training time. Default is 30000.
- `--num_head_blocks`: The depth of the head network; specifically, the number of extra 3-layer residual blocks to add to the default head depth. The default value is 1, which results in a head network composed of 9 layers, for a total of 9MB of weights.
- `--mlp_ratio`: The ratio of the hidden size of the residual block to the hidden size of the head. Default is 1.
- `--num_decoder_clusters`: The number of clusters to use in the position decoder. Default is 1.

There are other options available; they can be discovered by running the script with the `--help` flag.

### GLACE Evaluation

The pose estimation for a testing scene can be performed using the `test_ace.py` script.
Basic usage:

```shell
./test_ace.py <scene path> <output map name>
# Example:
./test_ace.py datasets/7scenes_chess output/7scenes_chess.pt
```

The script loads (a) the scene-specific GLACE head network and (b) the pre-trained scene-agnostic encoder and, for each testing frame:

- computes its per-pixel 3D scene coordinates, resulting in a set of 2D-3D correspondences;
- passes the correspondences to a RANSAC algorithm that estimates a 6DoF camera pose;
- compares the estimated camera pose with the ground truth.

Various cumulative metrics are computed and printed at the end of the script. They include the percentage of frames within certain translation/rotation thresholds of the ground truth, the median translation error, and the median rotation error.

The script also creates a file containing per-frame results so that they can be parsed by other tools or analyzed separately.
The output file is located alongside the head network and is named `poses_<map name>_<session>.txt`.

Each line in the output file contains the results for an individual query frame, in this format:

```
file_name rot_quaternion_w rot_quaternion_x rot_quaternion_y rot_quaternion_z translation_x translation_y translation_z rot_err_deg tr_err_m inlier_count
```

There are some parameters that can be passed to the script to customize the RANSAC behavior:

- `--session`: Custom suffix to append to the name of the file containing the estimated camera poses.
- `--hypotheses`: How many pose hypotheses to generate and evaluate (i.e. the number of RANSAC iterations). Default is 64.
- `--threshold`: Inlier threshold (in pixels) to consider a 2D-3D correspondence as valid.
- `--render_visualization`: Set to `True` to enable generating frames showing the evaluation process. Will slow down the testing significantly if enabled. Default is `False`.
- `--render_target_path`: Base folder where the frames will be saved. The script automatically appends the current map name to the folder. Default is `renderings`.

There are other options available; they can be discovered by running the script with the `--help` flag.

### Complete training and evaluation scripts

We provide several scripts to run training and evaluation on the various datasets we tested our method with.
These allow replicating the results we showcased in the paper.
They are located under the `scripts` folder: `scripts/train_*.sh`.

### Pretrained GLACE Networks

We also make available the set of pretrained GLACE heads we used for the experiments in the paper.

Each network can be passed directly to the `test_ace.py` script, together with the path to its dataset scene, to run camera relocalization on the images of the testing split and compute the accuracy metrics, like this:

```shell
./test_ace.py datasets/7scenes_chess <Downloads>/7Scenes/7scenes_chess.pt
```

**The weights are available at [this location](https://hkustconnect-my.sharepoint.com/:u:/g/personal/xjiangan_connect_ust_hk/ESdgFNFTuBtAqFkohVsu-wUBwMXCgEukJH0H1CCSLkxGPg?e=BL0Xx0).**

## Publications

If you use GLACE or parts of its code in your own work, please cite:

```
@inproceedings{GLACE2024CVPR,
      title     = {GLACE: Global Local Accelerated Coordinate Encoding},
      author    = {Fangjinhua Wang and Xudong Jiang and Silvano Galliani and Christoph Vogel and Marc Pollefeys},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      month     = {June},
      year      = {2024}
}
```

This code uses R2Former for global feature extraction. Please consider citing:

```
@inproceedings{zhu2023r2former,
  title={{R2Former}: Unified Retrieval and Reranking Transformer for Place Recognition},
  author={Zhu, Sijie and Yang, Linjie and Chen, Chen and Shah, Mubarak and Shen, Xiaohui and Wang, Heng},
  booktitle={CVPR},
  year={2023}
}
```

This code builds on previous camera relocalization pipelines, namely DSAC, DSAC++, DSAC*, and ACE. Please consider citing:

```
@inproceedings{brachmann2023ace,
    title={Accelerated Coordinate Encoding: Learning to Relocalize in Minutes using RGB and Poses},
    author={Brachmann, Eric and Cavallari, Tommaso and Prisacariu, Victor Adrian},
    booktitle={CVPR},
    year={2023}
}

@inproceedings{brachmann2017dsac,
  title={{DSAC}-{Differentiable RANSAC} for Camera Localization},
  author={Brachmann, Eric and Krull, Alexander and Nowozin, Sebastian and Shotton, Jamie and Michel, Frank and Gumhold, Stefan and Rother, Carsten},
  booktitle={CVPR},
  year={2017}
}

@inproceedings{brachmann2018lessmore,
  title={Learning less is more - {6D} camera localization via {3D} surface regression},
  author={Brachmann, Eric and Rother, Carsten},
  booktitle={CVPR},
  year={2018}
}

@article{brachmann2021dsacstar,
  title={Visual Camera Re-Localization from {RGB} and {RGB-D} Images Using {DSAC}},
  author={Brachmann, Eric and Rother, Carsten},
  journal={TPAMI},
  year={2021}
}
```

## License

Copyright © Niantic, Inc. 2023. Patent Pending.
All rights reserved.
Please see the [license file](LICENSE) for terms.
Modified files: `ace_network.py`, `ace_trainer.py`, `ace_vis_utils.py`, `ace_visualizer.py`, `dataset.py`, `test_ace.py`, `train_ace.py`, and the scripts in the `scripts` folder.

Datasets in the `datasets` folder are provided with their own licenses. Please check their license terms before using them.
The global feature extraction script `datasets/extract_features.py` is based on R2Former, which is licensed under the [Apache License 2.0](https://github.com/bytedance/R2Former/blob/91d314f25de64098cdc8a479d9f022fdc2287f49/LICENSE).
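The per-frame results file written by `test_ace.py` uses the whitespace-separated format documented in the GLACE Evaluation section above (`file_name`, quaternion, translation, errors, inlier count). As a minimal sketch of how such a file could be parsed and summarized with other tools -- the helper names and the 5 cm / 5° thresholds below are illustrative assumptions, not part of the GLACE codebase:

```python
import statistics

def parse_pose_results(path):
    """Parse a per-frame results file. Assumed line format:
    file_name qw qx qy qz tx ty tz rot_err_deg tr_err_m inlier_count
    """
    results = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) != 11:
                continue  # skip blank or malformed lines
            results.append({
                "file_name": fields[0],
                "quaternion_wxyz": tuple(map(float, fields[1:5])),
                "translation_xyz": tuple(map(float, fields[5:8])),
                "rot_err_deg": float(fields[8]),
                "tr_err_m": float(fields[9]),
                "inlier_count": int(fields[10]),
            })
    return results

def summarize(results, tr_thresh_m=0.05, rot_thresh_deg=5.0):
    """Median errors plus the percentage of frames within the given
    translation/rotation thresholds (assumes a non-empty result list)."""
    within = [r for r in results
              if r["tr_err_m"] <= tr_thresh_m and r["rot_err_deg"] <= rot_thresh_deg]
    return {
        "median_tr_err_m": statistics.median(r["tr_err_m"] for r in results),
        "median_rot_err_deg": statistics.median(r["rot_err_deg"] for r in results),
        "pct_within": 100.0 * len(within) / len(results),
    }
```

For example, `summarize(parse_pose_results("poses_7scenes_chess.txt"))` (a hypothetical file name) would report the same kind of cumulative accuracy figures that the evaluation script prints.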