https://github.com/ldbc/data-sets-surf-repository
- Host: GitHub
- URL: https://github.com/ldbc/data-sets-surf-repository
- Owner: ldbc
- Created: 2022-03-25T05:55:03.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2025-03-12T23:29:17.000Z (about 1 year ago)
- Last Synced: 2025-03-27T17:11:58.372Z (about 1 year ago)
- Language: Shell
- Homepage: https://ldbcouncil.org/data-sets-surf-repository/
- Size: 45.9 KB
- Stars: 15
- Watchers: 2
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# LDBC benchmark data sets
The LDBC benchmark data sets are stored under [SURF's CWI repositories](https://repository.surfsara.nl/community/cwi).
## Usage
The data sets are [stored on tape](https://servicedesk.surf.nl/wiki/display/WIKI/Data+Archive#DataArchive-What?-Thetapeback-endandtheDataMigrationFacility(DMF)), so you may have to stage them before they can be downloaded.
To do so, visit the repository page of the data set and click "Request" for the offline files. Staging a 20 GB file takes approximately 3-5 minutes, while staging a 200 GB file takes approximately 10-15 minutes.
To decompress, use [zstd](https://github.com/facebook/zstd).
```bash
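# -f names the archive file; without it, tar reads the archive from its default device (often stdin)
tar -xv --use-compress-program=unzstd -f file.tar.zst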
```
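If you extract several archives, a small wrapper can check for common problems first. The following is a minimal sketch, not part of this repository: `extract_data_set` and its `target_dir` argument are hypothetical names, and it assumes GNU `tar` with `unzstd` on the `PATH`.

```bash
#!/usr/bin/env bash
# extract_data_set is a hypothetical helper, not part of the LDBC repository.
# Usage: extract_data_set <archive.tar.zst> [target_dir]
extract_data_set() {
  local archive="$1"
  local target_dir="${2:-.}"

  # Fail early if the archive is missing (e.g. the download did not finish).
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  # unzstd ships with the zstd package.
  if ! command -v unzstd >/dev/null 2>&1; then
    echo "unzstd not found; install zstd first" >&2
    return 2
  fi

  mkdir -p "$target_dir"
  # -f names the archive file; -C extracts into the target directory.
  tar -xv --use-compress-program=unzstd -f "$archive" -C "$target_dir"
}
```

For example, `extract_data_set file.tar.zst data/` would unpack the archive into `data/`, creating the directory if needed.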
We provide the [`download-data-set.sh`](https://raw.githubusercontent.com/ldbc/data-sets-surf-repository/refs/heads/main/download-data-set.sh) script, which attempts to download the data set and, if necessary, first stages it from tape to disk. Replace `data_set_url` with one of the URLs linked below in this README (right-click the link and select "Copy Link Address").
```bash
./download-data-set.sh data_set_url
```
Example:
```bash
./download-data-set.sh https://repository.surfsara.nl/datasets/cwi/snb/files/social_network-csv_basic-longdateformatter/social_network-csv_basic-longdateformatter-sf0.1.tar.zst
```
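To fetch several scale factors of the same data set, the script can be called in a loop. The sketch below is an illustration only: `build_data_set_url` and `download_scale_factors` are hypothetical helpers, and the URL pattern is inferred from the single example above. It assumes other scale factors follow the same naming scheme, so check the copied links on the data set pages before relying on it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: builds a URL following the pattern of the example above.
# ASSUMPTION: other scale factors use the same naming scheme; verify before use.
build_data_set_url() {
  local name="$1"          # e.g. social_network-csv_basic-longdateformatter
  local scale_factor="$2"  # e.g. 0.1
  local base="https://repository.surfsara.nl/datasets/cwi/snb/files"
  echo "${base}/${name}/${name}-sf${scale_factor}.tar.zst"
}

# Hypothetical helper: downloads each requested scale factor in turn,
# stopping at the first failure. Expects download-data-set.sh in the
# current directory.
download_scale_factors() {
  local name="$1"
  shift
  local sf
  for sf in "$@"; do
    ./download-data-set.sh "$(build_data_set_url "$name" "$sf")" || return 1
  done
}
```

Calling `download_scale_factors social_network-csv_basic-longdateformatter 0.1` would then be equivalent to the example above; further scale factors can be appended as extra arguments once their URLs are confirmed.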
## Data sets
* [Financial Benchmark (FinBench)](finbench.md)
* [Graphalytics](graphalytics.md)
* [Labeled Subgraph Query Benchmark (LSQB)](lsqb.md)
* [SIGMOD 2014 Programming Contest](sigmod-2014-programming-contest.md)
* [SNB Business Intelligence (BI)](snb-business-intelligence.md)
* [SNB Interactive v1 (Datagen v0.3.5)](snb-interactive-v1-datagen-v035.md)
* [SNB Interactive v1 (Datagen v1.0.0)](snb-interactive-v1-datagen-v100.md)
* [SNB Interactive v2 updates](snb-interactive-v2-updates.md)
* [SNB factor tables](snb-factor-tables.md)