{"id":18046663,"url":"https://github.com/ltla/rds2cpp","last_synced_at":"2025-04-10T04:44:15.044Z","repository":{"id":62805444,"uuid":"533191811","full_name":"LTLA/rds2cpp","owner":"LTLA","description":"Read and write RDS files in C++","archived":false,"fork":false,"pushed_at":"2023-09-04T04:18:20.000Z","size":4012,"stargazers_count":7,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-24T06:02:03.220Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://ltla.github.io/rds2cpp/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LTLA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-09-06T06:32:51.000Z","updated_at":"2025-03-22T10:32:39.000Z","dependencies_parsed_at":"2023-02-05T16:31:36.378Z","dependency_job_id":null,"html_url":"https://github.com/LTLA/rds2cpp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Frds2cpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Frds2cpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Frds2cpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LTLA%2Frds2cpp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LTLA","download_url":"https://codeload.github.com/LTLA/rds2cpp/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161232,"owners_count":21057552,"icon_url":"https://
github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-30T19:08:26.223Z","updated_at":"2025-04-10T04:44:14.942Z","avatar_url":"https://github.com/LTLA.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Read RDS files in C++\n\n![Unit tests](https://github.com/LTLA/rds2cpp/actions/workflows/run-tests.yaml/badge.svg)\n![Documentation](https://github.com/LTLA/rds2cpp/actions/workflows/doxygenate.yaml/badge.svg)\n\n## Overview\n\nThis repository contains a header-only C++ library for reading and writing RDS files (created with `saveRDS()`) without the need to link to R's libraries.\nIn this manner, we can use RDS as a flexible data exchange format across different frameworks that have C++ bindings, \ne.g., [Python](https://github.com/biocpy/rds2py), [Javascript (via Wasm)](https://github.com/jkanche/scran.js).\nWe currently support most user-visible data structures such as atomic vectors, lists, environments and S4 classes.\n\n## Quick start\n\nGiven a path to an RDS file, the `parse_rds()` function will return a pointer to an `RObject` interface:\n\n```cpp\n#include \"rds2cpp/rds2cpp.hpp\"\n\n// Returns an object containing the file information,\n// e.g., R version used to read/write the file.\nauto file_info = rds2cpp::parse_rds(fpath);\n\n// Get the pointer to the actual R object.\nconst auto\u0026 ptr = file_info-\u003eobject;\n```\n\nThe type of the underlying object can then be queried for further examination.\nFor example, if we wanted to process integer vectors:\n\n```cpp\nif (ptr-\u003etype() == rds2cpp::SEXPType::INT) {\n    auto 
iptr = static_cast\u003cconst rds2cpp::IntegerVector*\u003e(ptr.get());\n    const auto\u0026 values = iptr-\u003edata; // vector of int32_t's.\n    const auto\u0026 attr_names = iptr-\u003eattributes.names; // vector of attribute names.\n}\n```\n\nSee the [reference documentation](https://ltla.github.io/rds2cpp) for a list of known representations.\n\n## More reading examples\n\n**rds2cpp** can extract ordinary lists from an RDS file.\nUsers can inspect the attributes to determine if the list is named.\n\n```cpp\nif (ptr-\u003etype() == rds2cpp::SEXPType::VEC) {\n    auto lptr = static_cast\u003cconst rds2cpp::GenericVector*\u003e(ptr.get());\n    const auto\u0026 elements = lptr-\u003edata; // vector of pointers to list elements.\n\n    const auto\u0026 attr = lptr-\u003eattributes;\n    const auto\u0026 attr_names = lptr-\u003eattributes.names;\n    const auto\u0026 attr_values = lptr-\u003eattributes.values;\n\n    // Scanning for the list names.\n    auto nIt = std::find(attr_names.begin(), attr_names.end(), std::string(\"names\"));\n    if (nIt != attr_names.end()) {\n        size_t nindex = nIt - attr_names.begin();\n        if (attr_values[nindex]-\u003etype() == rds2cpp::SEXPType::STR) {\n            auto nptr = static_cast\u003cconst rds2cpp::StringVector*\u003e(attr_values[nindex].get());\n        }\n    }\n}\n```\n\nSlots of S4 instances are similarly encoded in the attributes -\nexcept for the class name, which is extracted into its own member.\n\n```cpp\nif (ptr-\u003etype() == rds2cpp::SEXPType::S4) {\n    auto sptr = static_cast\u003cconst rds2cpp::S4Object*\u003e(ptr.get());\n    sptr-\u003eclass_name;\n    sptr-\u003epackage_name;\n    const auto\u0026 slot_names = sptr-\u003eattributes.names;\n    const auto\u0026 slot_values = sptr-\u003eattributes.values;\n}\n```\n\nAdvanced users can also pull out serialized environments.\nThese should be treated as file-specific globals that may be referenced one or more times inside the R 
object.\n\n```cpp\nif (ptr-\u003etype() == rds2cpp::SEXPType::ENV) {\n    auto eptr = static_cast\u003cconst rds2cpp::EnvironmentIndex*\u003e(ptr.get());\n    const auto\u0026 env = file_info-\u003eenvironments[eptr-\u003eindex];\n    const auto\u0026 vnames = env.variable_names;\n    const auto\u0026 vvalues = env.variable_values;\n}\n```\n\n`NULL`s are supported but not particularly interesting:\n\n```cpp\nif (ptr-\u003etype() == rds2cpp::SEXPType::NIL) {\n    // Do something.\n}\n```\n\n## Writing RDS files\n\nThe `write_rds()` function will write RDS files from an `rds2cpp::RObject` representation:\n\n```cpp\nrds2cpp::RdsFile file_info;\n\n// Setting up an integer vector.\nauto vec = new rds2cpp::IntegerVector;\nfile_info.object.reset(vec);\n\n// Storing data in the integer vector.\nvec-\u003edata = std::vector\u003cint32_t\u003e{ 0, 1, 2, 3, 4, 5 };\n\nrds2cpp::write_rds(file_info, \"some_file_path.rds\");\n```\n\nHere's a more complicated example that saves a sparse matrix (as a `dgCMatrix` from the **Matrix** package) to file.\n\n```cpp\nrds2cpp::RdsFile file_info;\nauto ptr = new rds2cpp::S4Object;\nfile_info.object.reset(ptr);\nauto\u0026 obj = *ptr;\n\nobj.class_name = \"dgCMatrix\";\nobj.package_name = \"Matrix\";\n\nauto ivec = new rds2cpp::IntegerVector;\nobj.attributes.add(\"i\", ivec);\nivec-\u003edata = std::vector\u003cint32_t\u003e{ 6, 8, 0, 3, 5, 6, 0, 1, 3, 7 };\n\nauto pvec = new rds2cpp::IntegerVector;\nobj.attributes.add(\"p\", pvec);\npvec-\u003edata = std::vector\u003cint32_t\u003e{ 0, 0, 2, 3, 4, 5, 6, 8, 8, 8, 10 };\n\nauto xvec = new rds2cpp::DoubleVector;\nobj.attributes.add(\"x\", xvec);\nxvec-\u003edata = std::vector\u003cdouble\u003e{ 0.96, -0.34, 0.82, -2, -0.72, 0.39, 0.16, 0.36, -1.5, -0.47 };\n\nauto dims = new rds2cpp::IntegerVector;\nobj.attributes.add(\"Dim\", dims);\ndims-\u003edata = std::vector\u003cint32_t\u003e{ 10, 10 };\n\nauto dimnames = new rds2cpp::GenericVector;\nobj.attributes.add(\"Dimnames\", dimnames);\ndimnames-\u003edata.emplace_back(new rds2cpp::Null);\ndimnames-\u003edata.emplace_back(new 
rds2cpp::Null);\n\nauto factors = new rds2cpp::GenericVector;\nobj.attributes.add(\"factors\", factors);\n\nrds2cpp::write_rds(file_info, \"my_matrix.rds\");\n```\n\nWe can also create environments by registering the environment before creating indices to it.\n\n```cpp\nrds2cpp::RdsFile file_info;\n\n// Creating an environment with a 'foo' variable containing c('bar', NA, 'whee')\nfile_info.environments.resize(1);\nauto\u0026 current_env = file_info.environments[0];\n\nauto sptr = new rds2cpp::StringVector;\ncurrent_env.add(\"foo\", sptr);\nsptr-\u003eadd(\"bar\");\nsptr-\u003eadd(); // NA string\nsptr-\u003eadd(\"whee\");\n\n// Referencing the environment:\nauto eptr = new rds2cpp::EnvironmentIndex(0);\nfile_info.object.reset(eptr);\n\nrds2cpp::write_rds(file_info, \"my_env.rds\");\n```\n\n## Building projects\n\n### CMake with `FetchContent`\n\nIf you're using CMake, you just need to add something like this to your `CMakeLists.txt`:\n\n```cmake\ninclude(FetchContent)\n\nFetchContent_Declare(\n  rds2cpp\n  GIT_REPOSITORY https://github.com/LTLA/rds2cpp\n  GIT_TAG master # or any version of interest\n)\n\nFetchContent_MakeAvailable(rds2cpp)\n```\n\nThen you can link to **rds2cpp** to make the headers available during compilation:\n\n```cmake\n# For executables:\ntarget_link_libraries(myexe rds2cpp)\n\n# For libraries:\ntarget_link_libraries(mylib INTERFACE rds2cpp)\n```\n\n### CMake using `find_package()`\n\nYou can install the library by cloning a suitable version of this repository and running the following commands:\n\n```sh\nmkdir build \u0026\u0026 cd build\ncmake ..\ncmake --build . 
--target install\n```\n\nThen you can use `find_package()` as usual:\n\n```cmake\nfind_package(ltla_rds2cpp CONFIG REQUIRED)\ntarget_link_libraries(mylib INTERFACE ltla::rds2cpp)\n```\n\n### Manual\n\nIf you're not using CMake, the simple approach is to just copy the files in the [`include/`](include) subdirectory -\neither directly or with Git submodules - and include their path during compilation with, e.g., GCC's `-I`.\n\nYou'll also need to add the [**byteme**](https://github.com/LTLA/byteme) header-only library to the compiler's search path.\nNormally, when using CMake, this is automatically linked to Zlib; this will now need to be done manually.\n\n## Known limitations\n\nThis library may not support RDS files created using `saveRDS()` with non-default parameters.\n\nEnvironments are written without a hash table, so as to avoid the need to replicate R's string hashing logic.\nThis may result in slower retrieval of variables when those environments are loaded into an R session.\n\nCurrently, no support is provided for unserializing built-in functions or user-defined closures.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Frds2cpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fltla%2Frds2cpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fltla%2Frds2cpp/lists"}