{"id":13424502,"url":"https://github.com/markisus/vapid-soa","last_synced_at":"2025-03-15T18:35:18.917Z","repository":{"id":43653636,"uuid":"125680643","full_name":"markisus/vapid-soa","owner":"markisus","description":"A header only structure of arrays container for C++","archived":false,"fork":false,"pushed_at":"2023-05-29T14:34:21.000Z","size":56,"stargazers_count":38,"open_issues_count":0,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-10-26T23:55:20.657Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/markisus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-03-18T00:37:00.000Z","updated_at":"2024-07-18T15:07:01.000Z","dependencies_parsed_at":"2024-01-28T09:52:49.526Z","dependency_job_id":null,"html_url":"https://github.com/markisus/vapid-soa","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markisus%2Fvapid-soa","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markisus%2Fvapid-soa/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markisus%2Fvapid-soa/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markisus%2Fvapid-soa/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/markisus","download_url":"https://codeload.github.com/markisus/vapid-soa/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.
com","kind":"github","repositories_count":243775869,"owners_count":20346279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T00:00:55.300Z","updated_at":"2025-03-15T18:35:13.907Z","avatar_url":"https://github.com/markisus.png","language":"C++","readme":"# vapid soa\n![soa_logo](https://user-images.githubusercontent.com/469689/155224679-98bda7fb-8a8c-4a18-90f1-bb96a2a3d23c.png)\n\nA simple c++17 header only library that implements structure of arrays data structure backed by std::vector.  \nThese are the most useful operations.  \n- `.insert(field1, field2, ...)` field_n inserts into the nth array\n- `.sort_by_field\u003ccol_idx\u003e()` sort all columns in tandem based on particular column \n- `.operator[](row_idx)` read data out as tuple of references\n- `.get_column\u003ccol_idx\u003e()` direct access to underlying std::vector column\n- `.view\u003ccol_idx1, col_idx2, ...\u003e(row_idx)` read subset of the fields out as a tuple of references\n- `.sort_by_view\u003ccol_idx1, col_idx2, ...\u003e()` sort all columns in tandem based on a subset of columns\n\nCode Example (scratch.cpp)\n------------------------\n\n```c++\n#include \u003ciostream\u003e\n#include \"vapid/soa.h\"\n\nint main(int argc, char *argv[])\n{\n    // presidents will be a soa representing\n    // order, first name, last name\n\n    constexpr int ORDER = 0;\n    constexpr int FIRST_NAME = 1;\n    constexpr int LAST_NAME = 2;\n    vapid::soa\u003cint, std::string, std::string\u003e presidents;\n\n    presidents.insert(0, \"Abraham\", \"Lincoln\");\n    presidents.insert(3, 
\"Barack\", \"Obama\");\n    presidents.insert(2, \"George\", \"Bush\");\n    presidents.insert(1, \"Bill\", \"Clinton\");\n    presidents.insert(4, \"Donald\", \"Trump\");\n    presidents.insert(5, \"Joe\", \"Biden\");\n\n    std::cout \u003c\u003c \"Presidents in order of insertion\" \u003c\u003c \"\\n\";\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // sort by time (first column)\n    presidents.sort_by_field\u003cORDER\u003e();\n    std::cout \u003c\u003c \"Presidents sorted by temporal order\" \u003c\u003c \"\\n\";\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // sort by first name (second column)\n    presidents.sort_by_field\u003cFIRST_NAME\u003e();\n    std::cout \u003c\u003c \"Presidents sorted by first name\" \u003c\u003c \"\\n\";\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // sort by last name (third column)\n    presidents.sort_by_field\u003cLAST_NAME\u003e();\n    std::cout \u003c\u003c \"Presidents sorted by last name\" \u003c\u003c \"\\n\";\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // operator[] returns a tuple of references\n    // Let's update Joe Biden to Joseph Biden\n    {\n        std::cout \u003c\u003c \"Editing the first row to update Joe =\u003e Joseph\" \u003c\u003c \"\\n\";\n        auto [order, fname, lname] = presidents[0];\n        fname = \"Joseph\"; // Joe =\u003e Joseph\n        std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n    }\n\n    // view is templated to return a subset of fields \n    // Let's update Abraham Lincoln to George Washington\n    {\n        std::cout \u003c\u003c \"Editing the third row to update Abraham Lincoln =\u003e George Washington\" \u003c\u003c \"\\n\";\n        auto [fname, lname] = presidents.view\u003cFIRST_NAME,LAST_NAME\u003e(3);\n        fname = \"George\";\n        lname = \"Washington\";\n        std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n    }\n\n    // get_column\u003cidx\u003e 
returns direct access\n    // to the underlying std::vector.\n    // Let's sum the characters of the first names.\n    std::cout \u003c\u003c \"Summing first name lengths\\n\";\n    int length_sum = 0;\n    for (const auto\u0026 fname : presidents.get_column\u003cFIRST_NAME\u003e()) {\n        length_sum += fname.length();\n    }\n    std::cout \u003c\u003c \"Total characters used in first names = \" \u003c\u003c length_sum \u003c\u003c \"\\n\\n\";\n\n    // We can pass a custom comparator when sorting\n    // Let's sort based on length of last name\n    std::cout \u003c\u003c \"Sorting by number of characters in the last name.\" \u003c\u003c \"\\n\";\n    presidents.sort_by_field\u003cLAST_NAME\u003e([](auto\u0026 lname_a, auto\u0026 lname_b){ \n        return lname_a.size() \u003c lname_b.size();\n    });\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // We can also sort by a view.\n    std::cout \u003c\u003c \"Sorting by first name, last name.\" \u003c\u003c \"\\n\";\n    presidents.sort_by_view\u003cFIRST_NAME,LAST_NAME\u003e();\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    // We can also use a comparator when sorting by view.\n    // In that case, the comparator should take a tuple of references.\n    // For example, we can sort by difference in length of last and first name.\n    std::cout \u003c\u003c \"Sorting by length of first name - length of last name.\" \u003c\u003c \"\\n\";    \n    presidents.sort_by_view\u003cLAST_NAME, FIRST_NAME\u003e([](auto view1, auto view2) {\n        const auto\u0026 [lname_a, fname_a] = view1;\n        const auto\u0026 [lname_b, fname_b] = view2;\n        int view1_order = -int(lname_a.size()) + int(fname_a.size());\n        int view2_order = -int(lname_b.size()) + int(fname_b.size());\n        return view1_order \u003c view2_order;\n    });\n    std::cout \u003c\u003c presidents \u003c\u003c \"\\n\";\n\n    return 0;\n}\n```\nThis program outputs the following.\n```\nPresidents in 
order of insertion\nsoa {\n\t0, Abraham, Lincoln\n\t3, Barack, Obama\n\t2, George, Bush\n\t1, Bill, Clinton\n\t4, Donald, Trump\n\t5, Joe, Biden\n}\n\nPresidents sorted by temporal order\nsoa {\n\t0, Abraham, Lincoln\n\t1, Bill, Clinton\n\t2, George, Bush\n\t3, Barack, Obama\n\t4, Donald, Trump\n\t5, Joe, Biden\n}\n\nPresidents sorted by first name\nsoa {\n\t0, Abraham, Lincoln\n\t3, Barack, Obama\n\t1, Bill, Clinton\n\t4, Donald, Trump\n\t2, George, Bush\n\t5, Joe, Biden\n}\n\nPresidents sorted by last name\nsoa {\n\t5, Joe, Biden\n\t2, George, Bush\n\t1, Bill, Clinton\n\t0, Abraham, Lincoln\n\t3, Barack, Obama\n\t4, Donald, Trump\n}\n\nEditing the first row to update Joe =\u003e Joseph\nsoa {\n\t5, Joseph, Biden\n\t2, George, Bush\n\t1, Bill, Clinton\n\t0, Abraham, Lincoln\n\t3, Barack, Obama\n\t4, Donald, Trump\n}\n\nEditing the third row to update Abraham Lincoln =\u003e George Washington\nsoa {\n\t5, Joseph, Biden\n\t2, George, Bush\n\t1, Bill, Clinton\n\t0, George, Washington\n\t3, Barack, Obama\n\t4, Donald, Trump\n}\n\nSumming first name lengths\nTotal characters used in first names = 34\n\nSorting by number of characters in the last name.\nsoa {\n\t2, George, Bush\n\t5, Joseph, Biden\n\t3, Barack, Obama\n\t4, Donald, Trump\n\t1, Bill, Clinton\n\t0, George, Washington\n}\n\nSorting by first name, last name.\nsoa {\n        3, Barack, Obama\n        1, Bill, Clinton\n        4, Donald, Trump\n        2, George, Bush\n        0, George, Washington\n        5, Joseph, Biden\n}\n\nSorting by length of first name - length of last name.\nsoa {\n        0, George, Washington\n        1, Bill, Clinton\n        3, Barack, Obama\n        4, Donald, Trump\n        5, Joseph, Biden\n        2, George, Bush\n}\n\n```\n\nBenchmark\n-------\nThe benchmarks in benchmarks.cc show a speedup for structure of arrays (soa = vapid::soa) over array of structs (vec = std::vector).\nHere are the results from a Release build with Visual Studio 2022 on my laptop.\n\n```\n# bazel run -c opt 
//:benchmarks\nRun on (8 X 2995 MHz CPU s)\nCPU Caches:\n  L1 Data 48 KiB (x4)\n  L1 Instruction 32 KiB (x4)\n  L2 Unified 1280 KiB (x4)\n  L3 Unified 12288 KiB (x1)\n---------------------------------------------------------------\nBenchmark                     Time             CPU   Iterations\n---------------------------------------------------------------\nBM_SoaSortBySensorId   10599404 ns      9151786 ns           70\nBM_VecSortBySensorId   24412036 ns     25111607 ns           28\nBM_SoaSumTimestamps      106078 ns       106027 ns         5600\nBM_VecSumTimestamps      264606 ns       266841 ns         2635\n```\n\n*Note:* sort_by_field (the SoaSort benchmarks) is slow in GCC for unknown reasons.\n\nThe benchmark contains a small program concerning simulated sensor measurements. With a straightforward array of structs, we store the metadata together with the actual sensor data in one struct and push_back() each struct onto a std::vector.\n```c++\nstruct SensorData {\n    std::array\u003cdouble, 18\u003e xyz;\n};\n\nstruct Measurement {\n    Id sensor_id;\n    Id object_id;\n    double timestamp;\n    SensorData data;\n};\n\n...\n\nstd::vector\u003cMeasurement\u003e measurements_vec;\nmeasurements_vec.push_back(m);\n```\nAlternatively, we could split the metadata and sensor data apart into a structure of arrays.\n```c++\nvapid::soa\u003cId, Id, double, SensorData\u003e measurements_soa;\nmeasurements_soa.insert(m.sensor_id, m.object_id, m.timestamp, m.data);\n```\n\nThe benchmark times the cost of sorting by sensor_id, and then the cost of finding the average measurement timestamp using a std::vector vs a vapid::soa.\n\nManual Installation\n-----------\nCopy the vapid folder into your project and #include \"vapid/soa.h\"  \n  \nBazel Installation\n------\n```starlark\n# WORKSPACE\nload(\"@bazel_tools//tools/build_defs/repo:http.bzl\", \"http_archive\")\n\nvapid_soa_version = \"d327bd00e3a52d8c04550215df5711d0545e396e\"\nhttp_archive(\n    name = 
\"com_github_markisus_vapid-soa\",\n    url = \"https://github.com/markisus/vapid-soa/archive/{}.zip\".format(vapid_soa_version),\n    sha256 = \"c643d20af1ce95566ff4b2b6cdca2bd2f6aa0254a4f603c74ac8e62a84a527b4\",\n    strip_prefix = \"vapid-soa-{}\".format(vapid_soa_version))\n\n```\n```starlark\n# BUILD\ncc_binary(\n    ...\n    deps = [\n        \"@com_github_markisus_vapid-soa//:soa\",\n        ...\n    ]\n)\n```\n\n","funding_links":[],"categories":["Containers and Algorithms"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkisus%2Fvapid-soa","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarkisus%2Fvapid-soa","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkisus%2Fvapid-soa/lists"}