https://github.com/cvg/scrstudio
[CVPR 2025] A unified framework for Scene Coordinate Regression-based visual localization
3d-reconstruction 3d-vision computer-vision deep-learning pose-estimation scene-coordinate-regression visual-localization
- Host: GitHub
- URL: https://github.com/cvg/scrstudio
- Owner: cvg
- License: apache-2.0
- Created: 2025-01-02T09:08:02.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-04-23T13:32:58.000Z (6 months ago)
- Last Synced: 2025-04-23T14:33:35.706Z (6 months ago)
- Topics: 3d-reconstruction, 3d-vision, computer-vision, deep-learning, pose-estimation, scene-coordinate-regression, visual-localization
- Language: Python
- Homepage:
- Size: 350 KB
- Stars: 88
- Watchers: 12
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
SCRStudio
A Unified Framework for Scene Coordinate Regression
# About
SCRStudio is a unified and modular framework for Scene Coordinate Regression (SCR)-based visual localization, built on top of the [nerfstudio](https://github.com/nerfstudio-project/nerfstudio) project. The library provides an interpretable and modular implementation of SCR methods, breaking them down into components such as input encoding, network architecture, and supervision strategy. It offers a unified implementation of three major SCR methods: **ACE, GLACE, and R-SCoRe**. SCRStudio supports various pretrained local encodings (both sparse and dense) and incorporates state-of-the-art techniques for integrating global encodings.
# Quickstart
This guide will help you get started with the default R-SCoRe SCR model trained on the classic **Aachen** dataset.
## 1. Installation: Set Up the Environment
### Create Environment
We recommend using **conda** to manage dependencies. Make sure to install [Conda](https://docs.conda.io/miniconda.html) before proceeding.
### Install Dependencies
Install PyTorch with CUDA (tested with CUDA 12.1 and 12.4). PyTorch Geometric and cuML are also required for encoding preprocessing.
For **CUDA 12.4**:
```bash
conda create -n scrstudio python=3.10 pytorch=2.5.1 torchvision=0.20.1 pytorch-cuda=12.4 cuml=25.02 -c pytorch -c rapidsai -c conda-forge -c nvidia
conda activate scrstudio
pip install --upgrade pip
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.5.1+cu124.html
```
### Install SCRStudio
```bash
git clone --recursive https://github.com/cvg/scrstudio.git
cd scrstudio
pip install --upgrade pip setuptools
pip install -e .
```
## 2. Train Your First Model
The following steps will train a **scrfacto** model, our recommended model for large scenes.
### Download the Data
```bash
# Download Aachen dataset:
scr-download-data aachen
# Download specific capture for NAVER Lab dataset:
scr-download-data naver --capture-name dept_1F
```
### Preprocessing for Training
**scrfacto** follows the methodology from **R-SCoRe** ([paper](https://arxiv.org/abs/2501.01421)), utilizing **PCA** for local encoding dimensionality reduction and **[Node2Vec](https://arxiv.org/abs/1607.00653)** for learning global encodings.
#### Local Encoding: PCA Compression
To reduce the GPU memory used by the local encoding buffer, apply PCA to the local encodings:
```bash
# Compute PCA for Dedode local encodings on Aachen dataset
scr-encoding-pca dedode --encoder.detector L --encoder.descriptor B --n_components 128 --data data/aachen
```
This saves the PCA components as a PyTorch state dictionary at: `data/aachen/proc/pcad3LB_128.pth`
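As background on the compression step, the following is a minimal NumPy sketch of PCA-based dimensionality reduction on row-wise descriptors. It is not the `scr-encoding-pca` implementation: the function names `fit_pca` and `apply_pca`, the random stand-in descriptors, and the toy dimensions (32 reduced to 8) are assumptions made for illustration only.

```python
import numpy as np


def fit_pca(descriptors: np.ndarray, n_components: int):
    """Fit PCA on (num_descriptors, dim) data; return (mean, components)."""
    mean = descriptors.mean(axis=0)
    centered = descriptors - mean
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def apply_pca(descriptors: np.ndarray, mean: np.ndarray, components: np.ndarray):
    """Project descriptors onto the retained principal directions."""
    return (descriptors - mean) @ components.T


# Hypothetical stand-in for local descriptors (not real DeDoDe output).
rng = np.random.default_rng(0)
descs = rng.normal(size=(1000, 32)).astype(np.float32)

mean, comps = fit_pca(descs, n_components=8)
compressed = apply_pca(descs, mean, comps)
print(compressed.shape)
```

Only the mean and the projection matrix need to be stored to compress new descriptors, which is why a small saved state is enough for this step.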
#### Global Encoding: Covisibility Graph & Node2Vec Training
Compute the pose overlap score for training images:
```bash
scr-overlap-score --data data/aachen/train --max_depth 50
```
This saves a sparse COO format overlap matrix at: `data/aachen/train/pose_overlap.npz`
Train a **Node2Vec** model on this graph:
```bash
scr-train node2vec --data data/aachen --pipeline.model.graph pose_overlap.npz --pipeline.model.edge_threshold 0.2
```
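To make the roles of the COO overlap matrix and `edge_threshold` more concrete, here is a small sketch of how thresholded overlap scores can define graph edges. The toy scores and the strict `>` comparison are assumptions; the exact rule SCRStudio applies internally is not stated here.

```python
import numpy as np

# Toy COO triplets: vals[k] is the overlap score between image rows[k]
# and image cols[k]. These numbers are made up for illustration.
rows = np.array([0, 0, 1, 2, 3])
cols = np.array([1, 2, 2, 3, 0])
vals = np.array([0.85, 0.10, 0.40, 0.15, 0.25])

edge_threshold = 0.2  # same value as passed on the command line above

# Assumption: an edge is kept when its score is strictly above the threshold.
keep = vals > edge_threshold
edges = np.stack([rows[keep], cols[keep]])  # shape (2, num_edges)
print(edges)
```

The `(2, num_edges)` layout matches the `edge_index` convention used by PyTorch Geometric, whose `Node2Vec` module consumes a graph in that form. A higher threshold yields a sparser graph in which only strongly overlapping image pairs are connected.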
Use the trained global encoding:
```bash
cp outputs/aachen/node2vec//scrstudio_models/head.pt data/aachen/train/pose_n2c.pt
```
### Model Training
Now, train the **scrfacto** model:
```bash
scr-train scrfacto --data data/aachen --pipeline.datamanager.train_dataset.feat_name pose_n2c.pt
```
Results are saved in `outputs/aachen/scrfacto/`.
## 3. Evaluation
### Preprocessing for Evaluation
Compute NetVLAD retrieval features and compress them with Product Quantization (PQ):
```bash
scr-retrieval-feat --data data/aachen/train --pq
```
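As background on the `--pq` flag, the sketch below illustrates the general idea of Product Quantization: each vector is split into sub-vectors, and each sub-vector is replaced by the index of its nearest centroid in a small per-subspace codebook. This is a toy NumPy version under stated assumptions, not the code behind `scr-retrieval-feat`; the helper names, the plain k-means, and the sizes (16-D vectors, 4 sub-spaces, 8 centroids) are all hypothetical.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (k, dim) centroids."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for c in range(k):
            members = x[assign == c]
            if len(members):  # keep the old centroid if a cluster is empty
                centers[c] = members.mean(0)
    return centers


def pq_train(x, n_sub, k):
    """Learn one codebook per sub-space."""
    return [kmeans(sub, k) for sub in np.split(x, n_sub, axis=1)]


def pq_encode(x, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = np.split(x, len(codebooks), axis=1)
    codes = [((s[:, None] - cb[None]) ** 2).sum(-1).argmin(1)
             for s, cb in zip(subs, codebooks)]
    return np.stack(codes, axis=1).astype(np.uint8)


def pq_decode(codes, codebooks):
    """Approximate the original vectors from their codes."""
    return np.concatenate(
        [cb[codes[:, i]] for i, cb in enumerate(codebooks)], axis=1)


rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 16))  # stand-in for global retrieval features
codebooks = pq_train(feats, n_sub=4, k=8)
codes = pq_encode(feats, codebooks)  # 4 one-byte codes per vector
recon = pq_decode(codes, codebooks)
print(codes.shape, recon.shape)
```

In this toy setup each vector is stored as 4 one-byte codes instead of 16 floats, at the cost of an approximate reconstruction. The command above is what produces the compressed features used later for evaluation.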
Results are saved in `data/aachen/train/netvlad_feats_pq.pkl`
### Running Evaluation
Compute retrieval features for test images and run evaluation:
```bash
scr-retrieval-feat --data data/aachen/test
scr-eval --load-config outputs/aachen/scrfacto//config.yml --split test
```
# Release Plan
We are actively preparing SCRStudio for public release. Below is the tentative schedule:
- [x] March 2025: Initial release of SCRStudio.
- [ ] April 2025: SCRStudio Viewer.
# Publications
This code builds on previous camera relocalization pipelines, namely DSAC, DSAC++, DSAC*, ACE, GLACE, and R-SCoRe.
Please consider citing:
```
@inproceedings{brachmann2017dsac,
title={{DSAC}-{Differentiable RANSAC} for Camera Localization},
author={Brachmann, Eric and Krull, Alexander and Nowozin, Sebastian and Shotton, Jamie and Michel, Frank and Gumhold, Stefan and Rother, Carsten},
booktitle={CVPR},
year={2017}
}
@inproceedings{brachmann2018lessmore,
title={Learning less is more - {6D} camera localization via {3D} surface regression},
author={Brachmann, Eric and Rother, Carsten},
booktitle={CVPR},
year={2018}
}
@article{brachmann2021dsacstar,
title={Visual Camera Re-Localization from {RGB} and {RGB-D} Images Using {DSAC}},
author={Brachmann, Eric and Rother, Carsten},
journal={TPAMI},
year={2021}
}
@inproceedings{brachmann2023ace,
title={Accelerated Coordinate Encoding: Learning to Relocalize in Minutes using RGB and Poses},
author={Brachmann, Eric and Cavallari, Tommaso and Prisacariu, Victor Adrian},
booktitle={CVPR},
year={2023},
}
@inproceedings{wang2024glace,
title={Glace: Global local accelerated coordinate encoding},
author={Wang, Fangjinhua and Jiang, Xudong and Galliani, Silvano and Vogel, Christoph and Pollefeys, Marc},
booktitle={CVPR},
year={2024}
}
@inproceedings{jiang2025rscore,
title={R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization},
author={Jiang, Xudong and Wang, Fangjinhua and Galliani, Silvano and Vogel, Christoph and Pollefeys, Marc},
booktitle={CVPR},
year={2025}
}
```