{"id":13435848,"url":"https://github.com/ScanNet/ScanNet","last_synced_at":"2025-03-18T12:30:29.988Z","repository":{"id":37677279,"uuid":"81673803","full_name":"ScanNet/ScanNet","owner":"ScanNet","description":null,"archived":false,"fork":false,"pushed_at":"2024-05-05T15:57:14.000Z","size":8388,"stargazers_count":1924,"open_issues_count":116,"forks_count":349,"subscribers_count":40,"default_branch":"master","last_synced_at":"2025-03-17T05:05:13.191Z","etag":null,"topics":["3d-reconstruction","computer-graphics","computer-vision","deep-learning","rgbd"],"latest_commit_sha":null,"homepage":"http://www.scan-net.org/","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ScanNet.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-11T18:41:30.000Z","updated_at":"2025-03-15T06:13:00.000Z","dependencies_parsed_at":"2022-07-14T07:20:35.607Z","dependency_job_id":"18be43ad-37e5-4d3b-99cd-64877fb20344","html_url":"https://github.com/ScanNet/ScanNet","commit_stats":{"total_commits":129,"total_committers":7,"mean_commits":"18.428571428571427","dds":0.5193798449612403,"last_synced_commit":"affdbfa9ead373c39e36f40a0fa3494a8b7911e9"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ScanNet%2FScanNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ScanNet%2FScanNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ScanNet%2FScanNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ScanNet%2FScanNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ScanNet","download_url":"https://codeload.github.com/ScanNet/ScanNet/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244222050,"owners_count":20418433,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["3d-reconstruction","computer-graphics","computer-vision","deep-learning","rgbd"],"created_at":"2024-07-31T03:00:39.869Z","updated_at":"2025-03-18T12:30:29.968Z","avatar_url":"https://github.com/ScanNet.png","language":"C","readme":"# ScanNet\n\nScanNet is an RGB-D video dataset containing 2.5 million views in more than 1500 scans, annotated with 3D camera poses, surface reconstructions, and instance-level semantic segmentations.\n\n## ScanNet Data\n\nIf you would like to download the ScanNet data, please fill out an agreement to the [ScanNet Terms of Use](http://kaldir.vc.in.tum.de/scannet/ScanNet_TOS.pdf), using your institutional email addresses, and send it to us at scannet@googlegroups.com. 
### Data Formats
The following are overviews of the data formats used in ScanNet:

**Reconstructed surface mesh file (`*.ply`)**:
Binary PLY format mesh with +Z axis in upright orientation.

**RGB-D sensor stream (`*.sens`)**:
Compressed binary format with per-frame color, depth, camera pose and other data. See the [ScanNet C++ Toolkit](#scannet-c-toolkit) for more information and parsing code. See [SensReader/python](SensReader/python) for a basic Python data exporter.

**Surface mesh segmentation file (`*.segs.json`)**:
```javascript
{
  "params": {  // segmentation parameters
    "kThresh": "0.0001",
    "segMinVerts": "20",
    "minPoints": "750",
    "maxPoints": "30000",
    "thinThresh": "0.05",
    "flatThresh": "0.001",
    "minLength": "0.02",
    "maxLength": "1"
  },
  "sceneId": "...",  // id of segmented scene
  "segIndices": [1,1,1,1,3,3,15,15,15,15]  // per-vertex index of mesh segment
}
```
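As a minimal sketch of working with this format, the following Python snippet loads the per-vertex `segIndices` array and inverts it into a segment-to-vertices map (the file name is illustrative):

```python
import json
from collections import defaultdict

# Load a *.segs.json file; the path here is only an example.
with open("scene0000_00_vh_clean_2.0.010000.segs.json") as f:
    segs = json.load(f)

seg_indices = segs["segIndices"]  # one segment id per vertex of the mesh

# Invert the per-vertex array into segment id -> list of vertex indices.
seg_to_verts = defaultdict(list)
for vertex, seg_id in enumerate(seg_indices):
    seg_to_verts[seg_id].append(vertex)

print("%d vertices in %d segments" % (len(seg_indices), len(seg_to_verts)))
```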
**Aggregated semantic annotation file (`*.aggregation.json`)**:
```javascript
{
  "sceneId": "...",  // id of annotated scene
  "appId": "...",  // id + version of the tool used to create the annotation
  "segGroups": [
    {
      "id": 0,
      "objectId": 0,
      "segments": [1,4,3],
      "label": "couch"
    },
  ],
  "segmentsFile": "..."  // id of the *.segs.json segmentation file referenced
}
```
[BenchmarkScripts/util_3d.py](BenchmarkScripts/util_3d.py) gives examples of parsing the semantic instance information from the `*.segs.json`, `*.aggregation.json`, and `*_vh_clean_2.ply` mesh files, with an example semantic segmentation visualization in [BenchmarkScripts/3d_helpers/visualize_labels_on_mesh.py](BenchmarkScripts/3d_helpers/visualize_labels_on_mesh.py).
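As a simplified, illustrative sketch of how the two files combine (see `util_3d.py` for the reference implementation), the following Python snippet assigns each mesh vertex an instance id and label; the file names and the 1-indexing of instances are choices made here for the sketch:

```python
import json
from collections import defaultdict

# Load the over-segmentation and the aggregated annotations (example paths).
with open("scene0000_00_vh_clean_2.0.010000.segs.json") as f:
    seg_indices = json.load(f)["segIndices"]
with open("scene0000_00.aggregation.json") as f:
    seg_groups = json.load(f)["segGroups"]

# Map segment id -> vertex indices, so groups can be expanded to vertices.
seg_to_verts = defaultdict(list)
for vertex, seg_id in enumerate(seg_indices):
    seg_to_verts[seg_id].append(vertex)

num_verts = len(seg_indices)
instance_ids = [0] * num_verts   # 0 = unannotated
labels = [None] * num_verts
for group in seg_groups:
    for seg_id in group["segments"]:
        for vertex in seg_to_verts[seg_id]:
            instance_ids[vertex] = group["objectId"] + 1  # 1-indexed instance
            labels[vertex] = group["label"]

annotated = sum(1 for i in instance_ids if i > 0)
print("%d of %d vertices annotated" % (annotated, num_verts))
```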
**2d annotation projections (`*_2d-label.zip`, `*_2d-instance.zip`, `*_2d-label-filt.zip`, `*_2d-instance-filt.zip`)**:
Projections of the aggregated 3d annotations of a scan into its RGB-D frames, according to the computed camera trajectory.

### ScanNet C++ Toolkit
Tools for working with ScanNet data. [SensReader](SensReader) loads the ScanNet `.sens` data of compressed RGB-D frames, camera intrinsics and extrinsics, and IMU data.

### Camera Parameter Estimation Code
Code for estimating camera parameters and depth undistortion. Required to compute the sensor calibration files which are used by the pipeline server to undistort depth. See [CameraParameterEstimation](CameraParameterEstimation) for details.

### Mesh Segmentation Code
Mesh supersegment computation code which we use to preprocess meshes and prepare them for semantic annotation. Refer to the [Segmentator](Segmentator) directory for building and using the code.

## BundleFusion Reconstruction Code

ScanNet uses the [BundleFusion](https://github.com/niessner/BundleFusion) code for reconstruction. If you use BundleFusion, please cite the original paper:
```
@article{dai2017bundlefusion,
  title={BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration},
  author={Dai, Angela and Nie{\ss}ner, Matthias and Zollh{\"o}fer, Michael and Izadi, Shahram and Theobalt, Christian},
  journal={ACM Transactions on Graphics (TOG)},
  year={2017}
}
```

## ScanNet Scanner iPad App
[ScannerApp](ScannerApp) is designed for easy capture of RGB-D sequences using an iPad with an attached Structure.io sensor.

## ScanNet Scanner Data Server
[Server](Server) contains the server code that receives RGB-D sequences from iPads running the Scanner app.

## ScanNet Data Management UI
[WebUI](WebUI) contains the web-based data management UI used for providing an overview of available scan data and controlling the processing and annotation pipeline.

## ScanNet Semantic Annotation Tools
Code and documentation for the ScanNet semantic annotation web-based interfaces are provided as part of the [SSTK](https://github.com/smartscenes/sstk) library. Please refer to https://github.com/smartscenes/sstk/wiki/Scan-Annotation-Pipeline for an overview.

## Benchmark Tasks
We provide code for several scene understanding benchmarks on ScanNet:
* 3D object classification
* 3D object retrieval
* Semantic voxel labeling

Train/test splits are given at [Tasks/Benchmark](Tasks/Benchmark).
Label mappings and trained models can be downloaded with the ScanNet data release.

See [Tasks](Tasks).

### Labels
The label mapping file (`scannet-labels.combined.tsv`) in the ScanNet task data release contains mappings from the labels provided in the ScanNet annotations (`id`) to the object category sets of [NYUv2](http://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html), [ModelNet](http://modelnet.cs.princeton.edu/), [ShapeNet](https://www.shapenet.org/), and [WordNet](https://wordnet.princeton.edu/) synsets. Download it along with the task data (`--task_data`) or by itself (`--label_map`).

## Citation
If you use the ScanNet data or code, please cite:
```
@inproceedings{dai2017scannet,
    title={ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes},
    author={Dai, Angela and Chang, Angel X. and Savva, Manolis and Halber, Maciej and Funkhouser, Thomas and Nie{\ss}ner, Matthias},
    booktitle = {Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
    year = {2017}
}
```

## Help
If you have any questions, please contact us at scannet@googlegroups.com.

## Changelog
See the [changelog](http://www.scan-net.org/changelog) for updates to the data release.

## License
The data is released under the [ScanNet Terms of Use](http://kaldir.vc.in.tum.de/scannet/ScanNet_TOS.pdf), and the code is released under the MIT license.

Copyright (c) 2017