https://github.com/man-group/sparrow
C++20 idiomatic APIs for the Apache Arrow Columnar Format
https://github.com/man-group/sparrow
apache apache-arrow cpp20
Last synced: 7 months ago
JSON representation
C++20 idiomatic APIs for the Apache Arrow Columnar Format
- Host: GitHub
- URL: https://github.com/man-group/sparrow
- Owner: man-group
- License: apache-2.0
- Created: 2024-02-07T13:06:48.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-06-23T09:50:20.000Z (7 months ago)
- Last Synced: 2025-06-24T05:46:47.889Z (7 months ago)
- Topics: apache, apache-arrow, cpp20
- Language: C++
- Homepage:
- Size: 1.71 MB
- Stars: 88
- Watchers: 9
- Forks: 20
- Open Issues: 31
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# sparrow
[](https://github.com/man-group/sparrow/actions/workflows/linux.yml)
[](https://github.com/man-group/sparrow/actions/workflows/osx.yml)
[](https://github.com/man-group/sparrow/actions/workflows/windows.yml)
[](https://github.com/man-group/sparrow/actions/workflows/docs.yaml)
[](https://app.codecov.io/gh/man-group/sparrow)
C++20 idiomatic APIs for the Apache Arrow Columnar Format
## Introduction
`sparrow` is an implementation of the Apache Arrow Columnar format in C++. It provides array structures
with idiomatic APIs and convenient conversions from and to the [C interface](https://arrow.apache.org/docs/dev/format/CDataInterface.html#structure-definitions).
`sparrow` requires a modern C++ compiler supporting C++20.
## Installation
### Package managers
We provide a package for the mamba (or conda) package manager:
```bash
mamba install -c conda-forge sparrow
```
### Install from sources
`sparrow` has a few dependencies that you can install in a mamba environment:
```bash
mamba env create -f environment-dev.yml
mamba activate sparrow
```
You can then create a build directory, and build the project and install it with cmake:
```bash
mkdir build
cd build
cmake .. \
-DCMAKE_BUILD_TYPE=Debug \
-DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX \
-DBUILD_EXAMPLES=ON \
-DBUILD_TESTS=ON \
-BUILD_DOCS=ON \
..
make install
```
## Usage
### Requirements
Compilers:
- Clang 18 or higher
- GCC 11.2 or higher
- Apple Clang 16 or higher
- MSVC 19.41 or higher
### Initialize data with sparrow and extract C data structures
```cpp
#include "sparrow/sparrow.hpp"
namespace sp = sparrow;
sp::primitive_array ar = { 1, 3, 5, 7, 9 };
auto [arrow_array, arrow_schema] = sp::extract_arrow_structures(std::move(ar));
// Use arrow_array and arrow_schema as you need (serialization, passing it to
// a third party library)
// ...
// You are responsible for releasing the structure in the end
arrow_array.release(&arrow_array);
arrow_schema.release(&arrow_schema);
```
### Initialize data with sparrow and use C data structures
```cpp
#include "sparrow/sparrow.hpp"
namespace sp = sparrow;
sp::primitive_array ar = { 1, 3, 5, 7, 9 };
// Caution: get_arrow_structures returns pointers, not values
auto [arrow_array, arrow_schema] = sp::get_arrow_structures(ar);
// Use arrow_array and arrow_schema as you need (serialization, passing it to
// a third party library)
// ...
// do NOT release the C structures in the end, the "ar" variable will do it for you
```
### Read data from somewhere and pass it to sparrow
```cpp
#include "sparrow/sparrow.hpp"
#include "thrid-party-lib.hpp"
namespace sp = sparrow;
namespace tpl = third_party_library;
ArrowArray array;
ArrowSchema schema;
tpl::read_arrow_structures(&array, &schema);
sp::array ar(&array, &schema);
// Use ar as you need
// ...
// You are responsible for releasing the structure in the end
array.release(&array);
schema.release(&schema);
```
### Read data from somewhere and move it into sparrow
```cpp
#include "sparrow/sparrow.hpp"
#include "thrid-party-lib.hpp"
namespace sp = sparrow;
namespace tpl = third_party_library;
ArrowArray array;
ArrowSchema schema;
tpl::read_arrow_structures(&array, &schema);
sp::array ar(std::move(array), std::move(schema));
// Use ar as you need
// ...
// do NOT release the C structures in the end, the "ar" variable will do it for you
```
## Documentation
The documentation (currently being written) can be found at https://man-group.github.io/sparrow/index.html
## Acknowledgements
This development has been funded as part of a collaboration between [ArcticDB](https://github.com/man-group/ArcticDB), Bloomberg, and [QuantStack](https://quantstack.net/).
## License
This software is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.