https://github.com/google-research/snap
SNAP: Self-supervised Neural Maps for Visual Positioning and Semantic Understanding (NeurIPS 2023)
- Host: GitHub
- URL: https://github.com/google-research/snap
- Owner: google-research
- License: apache-2.0
- Created: 2023-10-05T09:14:06.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-12-14T10:44:32.000Z (over 2 years ago)
- Last Synced: 2025-04-03T01:01:56.846Z (about 1 year ago)
- Topics: 3d-mapping, deep-learning, pose-estimation, self-supervised-learning
- Language: Python
- Homepage:
- Size: 569 KB
- Stars: 187
- Watchers: 7
- Forks: 18
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
SNAP!
Self-Supervised Neural Maps
for Visual Positioning and Semantic Understanding
Paul-Edouard Sarlin · Eduard Trulls · Marc Pollefeys · Jan Hosang · Simon Lynen
NeurIPS 2023 Paper | Video | Poster
SNAP estimates 2D neural maps from multi-modal data such as StreetView and aerial imagery.
Neural maps learn easily interpretable, high-level semantics through self-supervision alone
and can be used for geometric and semantic tasks.
##
This repository hosts the training and inference code for SNAP, a deep neural network that turns multi-modal imagery into rich 2D neural maps.
SNAP was trained on a large dataset of 50M StreetView images with associated camera poses and aerial views.
**We do not release this dataset or the trained models, so this code is provided solely as a reference and cannot be used as is to reproduce any result of the paper.**
## Usage
The project requires Python >= 3.10 and is based on [Jax](https://github.com/google/jax) and [Scenic](https://github.com/google-research/scenic). All dependencies are listed in [`requirements.txt`](./requirements.txt).
- The data is stored as a TensorFlow dataset and loaded in `snap/data/loader.py`.
- Train SNAP with self-supervision:
```bash
python -m snap.train --config=snap/configs/train_localization.py \
--config.batch_size=32 \
--workdir=train_snap_sv+aerial
```
- Evaluate SNAP for visual positioning:
```bash
python -m snap.evaluate --config=snap/configs/eval_localization.py \
--config.workdir=train_snap_sv+aerial \
--workdir=. # unused
```
- Fine-tune SNAP for semantic mapping:
```bash
python -m snap.train --config=snap/configs/train_semantics.py \
--config.batch_size=32 \
--config.model.bev_mapper.pretrained_path=train_snap_sv+aerial \
--workdir=train_snap_sv+aerial_semantics
```
- Evaluate the semantic mapping:
```bash
python -m snap.evaluate --config=snap/configs/eval_semantics.py \
--config.workdir=train_snap_sv+aerial_semantics \
--workdir=. # unused
```
## BibTeX citation
If you use any ideas from the paper or code from this repo, please consider citing:
```bibtex
@inproceedings{sarlin2023snap,
author = {Paul-Edouard Sarlin and
Eduard Trulls and
Marc Pollefeys and
Jan Hosang and
Simon Lynen},
title = {{SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding}},
booktitle = {NeurIPS},
year = {2023}
}
```
*This is not an officially supported Google product.*