{"id":17381721,"url":"https://github.com/qsbase/qs2","last_synced_at":"2026-03-09T04:32:52.149Z","repository":{"id":242165964,"uuid":"808863964","full_name":"qsbase/qs2","owner":"qsbase","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-13T22:49:15.000Z","size":36675,"stargazers_count":72,"open_issues_count":4,"forks_count":3,"subscribers_count":4,"default_branch":"main","last_synced_at":"2026-02-27T21:30:30.996Z","etag":null,"topics":["compression","data-storage","r","serialization"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qsbase.png","metadata":{"files":{"readme":"README.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-06-01T02:30:42.000Z","updated_at":"2026-02-23T15:50:34.000Z","dependencies_parsed_at":"2024-06-01T03:31:57.393Z","dependency_job_id":"4343c173-6508-4452-b835-2a3d06e55e31","html_url":"https://github.com/qsbase/qs2","commit_stats":null,"previous_names":["traversc/qs2","qsbase/qs2"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/qsbase/qs2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qsbase%2Fqs2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qsbase%2Fqs2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qsbase%2Fqs2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qsbase%2Fqs2/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/qsbase","download_url":"https://codeload.github.com/qsbase/qs2/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qsbase%2Fqs2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30283425,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-09T02:57:19.223Z","status":"ssl_error","status_checked_at":"2026-03-09T02:56:26.373Z","response_time":61,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compression","data-storage","r","serialization"],"created_at":"2024-10-16T07:01:31.970Z","updated_at":"2026-03-09T04:32:52.114Z","avatar_url":"https://github.com/qsbase.png","language":"C","funding_links":[],"categories":["C"],"sub_categories":[],"readme":"qs2\n================\n\n[![R-CMD-check](https://github.com/qsbase/qs2/workflows/R-CMD-check/badge.svg)](https://github.com/qsbase/qs2/actions)\n[![CRAN-Status-Badge](https://www.r-pkg.org/badges/version/qs2)](https://cran.r-project.org/package=qs2)\n[![CRAN-Downloads-Badge](https://cranlogs.r-pkg.org/badges/qs2)](https://cran.r-project.org/package=qs2)\n[![CRAN-Downloads-Total-Badge](https://cranlogs.r-pkg.org/badges/grand-total/qs2)](https://cran.r-project.org/package=qs2)\n\n*qs2: a framework for efficient serialization*\n\n`qs2` is the successor to the `qs` package. 
The goal is to have reliable\nand fast performance for saving and loading objects in R.\n\nThe `qs2` format directly uses R serialization (via the\n`R_Serialize`/`R_Unserialize` C API) while improving underlying\ncompression and disk IO patterns. If you are familiar with the `qs`\npackage, the benefits and usage are the same.\n\n``` r\nqs_save(data, \"myfile.qs2\")\ndata \u003c- qs_read(\"myfile.qs2\")\n```\n\nUse the file extension `qs2` to distinguish it from the original `qs`\npackage. It is not compatible with the original `qs` format.\n\n## Installation\n\n``` r\ninstall.packages(\"qs2\")\n```\n\nOn x64 Mac or Linux, you can enable multi-threading by compiling from\nsource. It is enabled by default on Windows.\n\n``` r\nremotes::install_cran(\"qs2\", type = \"source\", configure.args = \"--with-TBB --with-simd=AVX2\")\n```\n\nOn non-x64 systems (e.g. Mac ARM), remove the AVX2 flag.\n\n``` r\nremotes::install_cran(\"qs2\", type = \"source\", configure.args = \"--with-TBB\")\n```\n\nMulti-threading in `qs2` uses the `Intel Threading Building Blocks`\nframework via the `RcppParallel` package.\n\n## Converting qs2 to RDS\n\nBecause the `qs2` format directly uses R serialization, you can convert\nit to RDS and vice versa.\n\n``` r\nfile_qs2 \u003c- tempfile(fileext = \".qs2\")\nfile_rds \u003c- tempfile(fileext = \".RDS\")\nx \u003c- runif(1e6)\n\n# save `x` with qs_save\nqs_save(x, file_qs2)\n\n# convert the file to RDS\nqs_to_rds(input_file = file_qs2, output_file = file_rds)\n\n# read `x` back in with `readRDS`\nxrds \u003c- readRDS(file_rds)\nstopifnot(identical(x, xrds))\n```\n\n## Validating file integrity\n\nThe `qs2` format saves an internal checksum. 
This can be used to test\nfor file corruption before deserialization via the `validate_checksum`\nparameter, but has a minor performance penalty.\n\n``` r\nqs_save(data, \"myfile.qs2\")\ndata \u003c- qs_read(\"myfile.qs2\", validate_checksum = TRUE)\n```\n\n# The qdata format\n\nThe package also introduces the `qdata` format which has its own\nserialization layout and works only with data types (vectors, lists,\ndata frames, matrices).\n\nIt will replace internal types (functions, promises, external pointers,\nenvironments, objects) with NULL. The `qdata` format differs from the\n`qs2` format in that it is NOT a general-purpose serialization format.\n\nThe eventual goal of `qdata` is to also have interoperability with other\nlanguages, particularly `Python`.\n\n``` r\nqd_save(data, \"myfile.qs2\")\ndata \u003c- qd_read(\"myfile.qs2\")\n```\n\n## Benchmarks\n\nA summary across 4 datasets is presented below.\n\n#### Single-threaded\n\n| Algorithm       | Compression | Save Time (s) | Read Time (s) |\n| --------------- | ----------- | ------------- | ------------- |\n| qs2             | 7.96        | 13.4          | 50.4          |\n| qdata           | 8.45        | 10.5          | 34.8          |\n| base::serialize | 1.1         | 8.87          | 51.4          |\n| saveRDS         | 8.68        | 107           | 63.7          |\n| fst             | 2.59        | 5.09          | 46.3          |\n| parquet         | 8.29        | 20.3          | 38.4          |\n| qs (legacy)     | 7.97        | 9.13          | 48.1          |\n\n#### Multi-threaded (8 threads)\n\n| Algorithm   | Compression | Save Time (s) | Read Time (s) |\n| ----------- | ----------- | ------------- | ------------- |\n| qs2         | 7.96        | 3.79          | 48.1          |\n| qdata       | 8.45        | 1.98          | 33.1          |\n| fst         | 2.59        | 5.05          | 46.6          |\n| parquet     | 8.29        | 20.2          | 37.0          |\n| qs (legacy) | 7.97        | 3.21          | 52.0          
|\n\n  - `qs2`, `qdata` and `qs` with `compress_level = 3`\n  - `parquet` via the `arrow` package using zstd `compression_level = 3`\n  - `base::serialize` with `ascii = FALSE` and `xdr = FALSE`\n\n**Datasets used**\n\n  - `1000 genomes non-coding VCF` 1000 genomes non-coding variants (2743\n    MB)\n  - `B-cell data` B-cell mouse data, Greiff 2017 (1057 MB)\n  - `IP location` IPV4 range data with location information (198 MB)\n  - `Netflix movie ratings` Netflix ML prediction dataset (571 MB)\n\nThese datasets are openly licensed and represent a combination of\nnumeric and text data across multiple domains. See\n`inst/analysis/datasets.R` on Github.\n\n# Usage in C/C++\n\nSerialization functions can be accessed in compiled code. Below is an\nexample using Rcpp.\n\n``` cpp\n// [[Rcpp::depends(qs2)]]\n#include \u003cRcpp.h\u003e\n#include \"qs2_external.h\"\nusing namespace Rcpp;\n\n// [[Rcpp::export]]\nSEXP test_qs_serialize(SEXP x) {\n  size_t len = 0;\n  unsigned char * buffer = c_qs_serialize(x, \u0026len, 10, true, 4); // object, buffer length, compress_level, shuffle, nthreads\n  SEXP y = c_qs_deserialize(buffer, len, false, 4);              // buffer, buffer length, validate_checksum, nthreads\n  c_qs_free(buffer);                                             // must manually free buffer\n  return y;\n}\n\n// [[Rcpp::export]]\nSEXP test_qd_serialize(SEXP x) {\n  size_t len = 0;\n  unsigned char * buffer = c_qd_serialize(x, \u0026len, 10, true, 4); // object, buffer length, compress_level, shuffle, nthreads\n  SEXP y = c_qd_deserialize(buffer, len, false, false, 4);       // buffer, buffer length, use_alt_rep, validate_checksum, nthreads\n  c_qd_free(buffer);                                             // must manually free buffer\n  return y;\n}\n\n\n/*** R\nx \u003c- runif(1e7)\nstopifnot(test_qs_serialize(x) == x)\nstopifnot(test_qd_serialize(x) == x)\n*/\n```\n\n# Global Options for qs2\n\nThe following global options control the behavior of the 
`qs2`\nfunctions. These global options can be queried or modified using the `qopt`\nfunction.\n\n  - **compress\\_level**  \n    The default compression level used when compressing data.  \n    **Default:** `3L`\n\n  - **shuffle**  \n    A logical flag indicating whether to allow byte shuffling during\n    compression.  \n    **Default:** `TRUE`\n\n  - **nthreads**  \n    The number of threads used for compression and decompression.  \n    **Default:** `1L`\n\n  - **validate\\_checksum**  \n    A logical flag indicating whether to validate the stored checksum\n    when reading data.  \n    **Default:** `FALSE`\n\n  - **warn\\_unsupported\\_types**  \n    For `qd_save`, a logical flag indicating whether to warn when saving\n    an object with unsupported types.  \n    **Default:** `TRUE`\n\n  - **use\\_alt\\_rep**  \n    For `qd_read`, a logical flag indicating whether to use ALTREP when\n    reading in string data.  \n    **Default:** `FALSE`\n\n-----\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqsbase%2Fqs2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqsbase%2Fqs2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqsbase%2Fqs2/lists"}