{"id":22635836,"url":"https://github.com/xaliphostes/dataframe","last_synced_at":"2026-04-02T02:53:18.068Z","repository":{"id":163873624,"uuid":"606358186","full_name":"xaliphostes/dataframe","owner":"xaliphostes","description":"A minimalist Python Panda like library in pure C++","archived":false,"fork":false,"pushed_at":"2025-09-22T12:05:19.000Z","size":9403,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-25T06:46:16.159Z","etag":null,"topics":["algebra","cplusplus","cpp","cpp23","functional-programming","geometry","mathematics","pandas-dataframe","pandas-python","statistics"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xaliphostes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-02-25T08:41:09.000Z","updated_at":"2025-09-22T12:05:24.000Z","dependencies_parsed_at":"2024-12-05T14:23:50.880Z","dependency_job_id":"ba363849-5d44-4df4-83c6-1851d97db198","html_url":"https://github.com/xaliphostes/dataframe","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/xaliphostes/dataframe","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xaliphostes%2Fdataframe","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xaliphostes%2Fdataframe/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/xaliphostes%2Fdataframe/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xaliphostes%2Fdataframe/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xaliphostes","download_url":"https://codeload.github.com/xaliphostes/dataframe/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xaliphostes%2Fdataframe/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280917407,"owners_count":26413206,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-25T02:00:06.499Z","response_time":81,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algebra","cplusplus","cpp","cpp23","functional-programming","geometry","mathematics","pandas-dataframe","pandas-python","statistics"],"created_at":"2024-12-09T03:17:08.803Z","updated_at":"2025-10-25T06:46:19.939Z","avatar_url":"https://github.com/xaliphostes.png","language":"C++","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"media/icon_512x512.png\" alt=\"Logo dataframe\" width=\"200\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/static/v1?label=Linux\u0026logo=linux\u0026logoColor=white\u0026message=support\u0026color=success\" alt=\"Linux support\"\u003e\n  \u003cimg 
src=\"https://img.shields.io/static/v1?label=macOS\u0026logo=apple\u0026logoColor=white\u0026message=support\u0026color=success\" alt=\"macOS support\"\u003e\n  \u003cimg src=\"https://img.shields.io/static/v1?label=Windows\u0026logo=windows\u0026logoColor=white\u0026message=soon\u0026color=red\" alt=\"Windows support\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/C%2B%2B-20-blue.svg\" alt=\"Language\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/license-MIT-blue.svg\" alt=\"License\"\u003e\n\u003c/p\u003e\n\n# Simple and efficient C++ Dataframe Library (header only)\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"media/field_comparison.jpg\" alt=\"drawing\" width=\"800\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nExample of an \u003cb\u003einterpolated\u003c/b\u003e scalar field (left: real, right: interpolation with respect to the black points)\n\n```c++\nauto scattered = random_uniform\u003cVector2\u003e(50, Vector2{-1.0, -1.0}, Vector2{1.0, 1.0});\nauto values    = map([](const Vector2 \u0026p) {return sin(p[0]*2) * cos(p[1]*2);}, scattered);\nauto grid      = from_dims\u003c2\u003e({100, 100}, {0, 0}, {2.0, 2.0});\nauto interp    = interpolate_field\u003cdouble, 2\u003e(grid, scattered, values);\n```\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"media/distance_field_2d.png\" alt=\"drawing\" width=\"300\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nExample of \u003cb\u003edistance field\u003c/b\u003e computation\n\n```c++\nauto ref_pts   = random_uniform(10, Vector2{-1.0, -1.0}, Vector2{1.0, 1.0});\nauto grid      = from_dims\u003c2\u003e({100, 100}, {0.0, 0.0}, {2.0, 2.0});\nauto distances = distance_field\u003c2\u003e(grid, ref_pts);\n```\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"media/diffusion_evolution.gif\" alt=\"drawing\" width=\"400\"/\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nExample of heat diffusion 
using \u003cb\u003eHarmonicDiffusion\u003c/b\u003e\n\u003c/p\u003e\n\n\n# \n### ***\u003ccenter\u003e...Work in progress for linear algebra, stats and geo(metry, logy, physics...) operations...\u003c/center\u003e***\n#\n\n\u003cbr\u003e\n\nA modern C++ library for data manipulation with a focus on functional programming patterns and type safety.\n\nOnly headers. No linking!\n\n# [Read the doc (in progress...)](https://xaliphostes.github.io/dataframe/)\n\n## Features\n\n- Generic series container (`Serie\u003cT\u003e`) for any data type (similar to a column in an Excel sheet)\n- DataFrame for managing multiple named series\n- Rich functional operations (map, reduce, filter, etc.)\n- Parallel processing capabilities\n- Type-safe operations with compile-time checks\n- Modern C++ design (C++20)\n- Uses [Eigen](https://eigen.tuxfamily.org/index.php?title=Main_Page) for linear algebra (installed automatically)\n- Uses [CGAL](https://www.cgal.org/) when needed (see [how to install CGAL](CGAL_INSTALL.md))\n- Implements functions such as `chain`, `chunk`, `compose`, `concat`, `filter`, `find`, `flatten`, `format`, `forEach`, `groupBy`, `map_if`, `map`, `memoise`, `merge`, `ones`, `orderBy`, `parallel_map`, `partition`, `pipe` with operator `|`, `print`, `range`, `reduce`, `reject`, `skip`, `slice`, `sort`, `split`, `switch`, `take`, `unique`, `unzip`, `whenAll`, `where`, `zeros`, `zip`.\n\n## Core Concepts\n\n### `Serie\u003cT\u003e`\n\nA type-safe container for sequences of data with functional operations:\n- Supports any data type\n- Provides functional operations (map, reduce, filter)\n- Enables chaining operations using pipe syntax\n\nUnlike an Excel column, which can contain mixed types and empty cells, a Serie is strongly typed: all elements must be of the same type, which makes it more suitable for type-safe data processing.\n\n### `Dataframe`\n\nA container for managing multiple named series:\n- Type-safe storage of different 
series types\n- Named access to series\n- Dynamic addition and removal of series\n\n## Examples\n\n### Basic Series Operations\n\n```cpp\n#include \u003cdataframe/Serie.h\u003e\n#include \u003cdataframe/map.h\u003e\n#include \u003cdataframe/filter.h\u003e\n\n// Create series with default values\ndf::Serie\u003cint\u003e ints(5);        // Creates [0,0,0,0,0]\ndf::Serie\u003cdouble\u003e doubles(3);  // Creates [0.0,0.0,0.0]\n\n// Create series with specific values\ndf::Serie\u003cint\u003e ones(4, 1);     // Creates [1,1,1,1]\ndf::Serie\u003cdouble\u003e pi(3, 3.14); // Creates [3.14,3.14,3.14]\n\n// -------------------------------------------\n\n// Create a serie of numbers\ndf::Serie\u003cint\u003e numbers{1, 2, 3, 4, 5};\n\n// Map operation: double each number\n// Note: \"size_t index\" is optional\nauto doubled = numbers.map([](int n, size_t index) { return n * 2; });\n\n// Filter operation: keep only even numbers\nauto evens = numbers | df::bind_filter([](int n) { return n % 2 == 0; });\n\n// Create a reusable pipeline using chaining operations\nauto pipeline = df::bind_map([](int n) { return n * 2; }) |\n                df::bind_filter([](int n) { return n \u003e 5; });\n\n// Apply the pipeline to the numbers serie\nauto result = pipeline(numbers);\n```\n\n### Operator overloading\n\n```cpp\n#include \u003cdataframe/Serie.h\u003e\n\n// Create a serie of numbers\ndf::Serie\u003cint\u003e s1{1, 2, 3, 4, 5};\ndf::Serie\u003cint\u003e s2{1, 2, 3, 4, 5};\ndf::Serie\u003cint\u003e s3{1, 2, 3, 4, 5};\ndf::Serie\u003cint\u003e s4{1, 2, 3, 4, 5};\n\nauto s = (s1 + s2) * s3 / s4;\n```\n\n### Linear algebra\n\n```cpp\n#include \u003cdataframe/algebra/eigen.h\u003e\n#include \u003cdataframe/Serie.h\u003e\n#include \u003cdataframe/types.h\u003e\n\n// Three sym tensor in 3D\n// Storage format is {xx, xy, xz, yy, yz, zz}\n//\ndf::Serie\u003cSMatrix3D\u003e serie({\n    {2, 4, 6, 3, 6, 9}, \n    {1, 2, 3, 4, 5, 6},\n    {9, 8, 7, 6, 5, 4}\n});\n\nauto [values, vectors] = 
df::eigenSystem(serie);\n\ndf::forEach([](const EigenVectorType\u003c3\u003e\u0026 v) {\n    std::cout \u003c\u003c \"1st eigenvector: \" \u003c\u003c v[0] \u003c\u003c std::endl;\n    std::cout \u003c\u003c \"2nd eigenvector: \" \u003c\u003c v[1] \u003c\u003c std::endl;\n    std::cout \u003c\u003c \"3rd eigenvector: \" \u003c\u003c v[2] \u003c\u003c std::endl;\n}, vectors);\n```\n\n### Parallel Processing (whenAll)\n\nThe library provides several ways to perform parallel computations on Series.\n\nThe parallel processing functions are particularly useful for:\n- Large datasets where computation can be distributed\n- CPU-intensive operations on each element\n- Processing multiple series simultaneously\n- Operations that can be executed independently\n\nNote that for small datasets, the overhead of parallel execution might outweigh the benefits. Consider using parallel operations when:\n- The dataset size is large (typically \u003e 10,000 elements)\n- The operation per element is computationally expensive\n- The operation doesn't require maintaining order-dependent state\n\n```cpp\n#include \u003cdataframe/utils/whenAll.h\u003e\n\n// Process multiple series in parallel with transformation\ndf::Serie\u003cdouble\u003e s1{1.0, 2.0, 3.0, ...};\ndf::Serie\u003cdouble\u003e s2{4.0, 5.0, 6.0, ...};\n\nauto result = df::whenAll([](const df::Serie\u003cdouble\u003e\u0026 s) {\n    return s.map([](double x) { return x * 2; });\n}, {s1, s2});\n\n// Parallel processing with tuple results\nauto [r1, r2] = df::whenAll\u003cdouble\u003e(s1, s2);\n```\n\n### Working with Custom Types\n\n```cpp\nstruct Point3D {\n    double x, y, z;\n};\n\n// Create a serie of 3D points\ndf::Serie\u003cPoint3D\u003e points{{0,0,0}, {1,1,1}, {2,2,2}};\n\n// Transform points\nauto translated = df::map([](const Point3D\u0026 p) {\n    return Point3D{p.x + 1, p.y + 1, p.z + 1};\n}, points);\n\n// Get the norms relative to the origin (0,0,0)\nauto norms = df::map([](const Point3D\u0026 p) {\n    return 
std::sqrt(std::pow(p.x, 2) + std::pow(p.y, 2) + std::pow(p.z, 2));\n}, points);\n```\n\n### Dataframe Usage\n\n```cpp\n#include \u003cdataframe/Dataframe.h\u003e\n\n// Create a Dataframe\ndf::Dataframe dataframe;\n\n// Add different types of series\ndataframe.add(\"integers\", df::Serie\u003cint\u003e{1, 2, 3, 4, 5});\ndataframe.add(\"doubles\", df::Serie\u003cdouble\u003e{1.1, 2.2, 3.3, 4.4, 5.5});\n\n// Access series with type safety\nconst auto\u0026 ints = dataframe.get\u003cint\u003e(\"integers\");\nconst auto\u0026 dbls = dataframe.get\u003cdouble\u003e(\"doubles\");\n\nfor (const auto\u0026 [name, serie] : dataframe) {\n    // Work with name and serie\n}\n\n// Remove a series\ndataframe.remove(\"integers\");\n```\n\n### 3D Mesh Example\n\n```cpp\n#include \u003cdataframe/Serie.h\u003e\n#include \u003cdataframe/DataFrame.h\u003e\n#include \u003cdataframe/map.h\u003e\n#include \u003cdataframe/math/norm.h\u003e\n#include \u003cdataframe/geo/normal.h\u003e\n\n// Define types for clarity\nusing Point    = std::array\u003cdouble, 3\u003e;\nusing Triangle = std::array\u003cuint32_t, 3\u003e;\n\n// Create a simple mesh\ndf::Dataframe mesh;\n\n// Create vertices\ndf::Serie\u003cPoint\u003e vertices{\n    {0.0, 0.0, 0.0},\n    {1.0, 0.0, 0.0},\n    {0.0, 1.0, 0.0},\n    {0.0, 0.0, 1.0}\n};\n\n// Create triangles\ndf::Serie\u003cTriangle\u003e triangles{\n    {0, 1, 2},\n    {0, 2, 3},\n    {0, 3, 1},\n    {1, 3, 2}\n};\n\n// Add to the Dataframe\nmesh.add(\"vertices\", vertices);\nmesh.add(\"triangles\", triangles);\n\n// Transform vertices\nauto transformed_vertices = df::map([](const Point\u0026 p) {\n    return Point{p[0] * 2.0, p[1] * 2.0, p[2] * 2.0};\n}, vertices);\nmesh.add(\"transformed_vertices\", transformed_vertices);\n\n// Add attributes at vertices\nmesh.add(\"norm\", df::norm(vertices));\nmesh.add(\"normal\", df::normals(vertices));\n```\n\n## Installation\n\nHeader-only library. 
Simply include the headers that you need in your project.\n\n## Requirements\n\n- C++20 or later\n- Modern C++ compiler (GCC, Clang, MSVC)\n\n## License\n\nMIT License - See LICENSE file for details.\n\n## Contact\nfmaerten@gmail.com\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxaliphostes%2Fdataframe","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxaliphostes%2Fdataframe","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxaliphostes%2Fdataframe/lists"}