Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/apache/arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
- Host: GitHub
- URL: https://github.com/apache/arrow
- Owner: apache
- License: apache-2.0
- Created: 2016-02-17T08:00:23.000Z (over 8 years ago)
- Default Branch: main
- Last Pushed: 2024-04-27T08:46:39.000Z (about 2 months ago)
- Last Synced: 2024-04-27T09:45:49.546Z (about 2 months ago)
- Topics: arrow
- Language: C++
- Homepage: https://arrow.apache.org/
- Size: 179 MB
- Stars: 13,523
- Watchers: 355
- Forks: 3,302
- Open Issues: 4,509
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
Lists
- Fuchsia-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools and Frameworks)
- Virtualization-Emulation-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools, Libraries, and Frameworks / Interfaces)
- VSCode-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools / JavaScript Libraries for Machine Learning)
- Android-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools and Frameworks / VS Code Extensions for Developer Productivity)
- Firmware-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools)
- ARM-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools)
- Blockchain-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools / E-Books)
- Developer-Handbook - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Tools / Mesh networks)
- awesome-stars - apache/arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- IoT-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools / In-memory data grids)
- Self-Hosting-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools / In-memory data grids)
- awesome-stars - apache/arrow - language toolbox for accelerated data interchange and in-memory processing (C++)
- jimsghstars - apache/arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- awesome - arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- AWS-Guide - Apache Arrow - memory analytics. It contains a set of technologies that enable big data systems to process and move data fast. Arrow's libraries are available for C, C++, C#, Go, Java, JavaScript, MATLAB, Python, R, Ruby, and Rust. (Rust Tools / Interfaces)
- awesome-stars-copy - apache/arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- awesome-dataframes - Arrow - A cross-language development platform for in-memory data. (Other)
- awesome-python-machine-learning-resources - GitHub - 6% open · ⏱️ 25.08.2022): (Data Containers and Structures)
- awesome-starred - apache/arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (others)
- my_awesome
- awesome-android-cpp - apache/arrow - Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby. (TODO scan for Android support in followings)
- awesome-stars - arrow - language toolbox for accelerated data interchange and in-memory processing | apache | 13748 | (C++)
- my-awesome - apache/arrow - 06 star:13.7k fork:3.4k Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- stars - apache/arrow - language toolbox for accelerated data interchange and in-memory processing (HarmonyOS / Windows Manager)
- my-awesome-stars - apache/arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing (C++)
- AwesomeCppGameDev - arrow - language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for effic… (C++)
README
# Apache Arrow
[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/arrow.svg)](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:arrow)
[![License](http://img.shields.io/:license-Apache%202-blue.svg)](https://github.com/apache/arrow/blob/main/LICENSE.txt)
[![Twitter Follow](https://img.shields.io/twitter/follow/apachearrow.svg?style=social&label=Follow)](https://twitter.com/apachearrow)

## Powering In-Memory Analytics
Apache Arrow is a development platform for in-memory analytics. It contains a
set of technologies that enable big data systems to process and move data fast.

Major components of the project include:
- [The Arrow Columnar In-Memory Format](https://arrow.apache.org/docs/dev/format/Columnar.html):
a standard and efficient in-memory representation of various datatypes, plain or nested
- [The Arrow IPC Format](https://arrow.apache.org/docs/dev/format/Columnar.html#serialization-and-interprocess-communication-ipc):
an efficient serialization of the Arrow format and associated metadata,
for communication between processes and heterogeneous environments
- [The Arrow Flight RPC protocol](https://github.com/apache/arrow/tree/main/format/Flight.proto):
based on the Arrow IPC format, a building block for remote services exchanging
Arrow data with application-defined semantics (for example a storage server or a database)
- [C++ libraries](https://github.com/apache/arrow/tree/main/cpp)
- [C bindings using GLib](https://github.com/apache/arrow/tree/main/c_glib)
- [C# .NET libraries](https://github.com/apache/arrow/tree/main/csharp)
- [Gandiva](https://github.com/apache/arrow/tree/main/cpp/src/gandiva):
an [LLVM](https://llvm.org)-based Arrow expression compiler, part of the C++ codebase
- [Go libraries](https://github.com/apache/arrow/tree/main/go)
- [Java libraries](https://github.com/apache/arrow/tree/main/java)
- [JavaScript libraries](https://github.com/apache/arrow/tree/main/js)
- [Python libraries](https://github.com/apache/arrow/tree/main/python)
- [R libraries](https://github.com/apache/arrow/tree/main/r)
- [Ruby libraries](https://github.com/apache/arrow/tree/main/ruby)
- [Rust libraries](https://github.com/apache/arrow-rs)

Arrow is an [Apache Software Foundation](https://www.apache.org) project. Learn more at
[arrow.apache.org](https://arrow.apache.org).

## What's in the Arrow libraries?
The reference Arrow libraries contain many distinct software components:
- Columnar vector and table-like containers (similar to data frames) supporting
flat or nested types
- Fast, language-agnostic metadata messaging layer (using Google's FlatBuffers
library)
- Reference-counted off-heap buffer memory management, for zero-copy memory
sharing and handling memory-mapped files
- IO interfaces to local and remote filesystems
- Self-describing binary wire formats (streaming and batch/file-like) for
remote procedure calls (RPC) and interprocess communication (IPC)
- Integration tests for verifying binary compatibility between the
implementations (e.g. sending data from Java to C++)
- Conversions to and from other in-memory data structures
- Readers and writers for various widely-used file formats (such as Parquet, CSV)

## Implementation status
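As a rough illustration of the columnar containers listed among the library components, the sketch below packs a nullable 32-bit integer column into two flat buffers: a validity bitmap (one bit per slot, least-significant bit first, 1 meaning "not null") and a contiguous little-endian values buffer. It uses only the Python standard library and is not the Arrow API; the function names `encode_int32_column` and `decode_int32_column` are hypothetical, and the buffer padding and alignment that real Arrow buffers carry are deliberately left out.

```python
import struct
from typing import List, Optional, Tuple


def encode_int32_column(values: List[Optional[int]]) -> Tuple[bytes, bytes]:
    """Pack a nullable int32 column into (validity_bitmap, values_buffer).

    Hypothetical helper for illustration only; not part of any Arrow library.
    """
    bitmap = bytearray((len(values) + 7) // 8)  # one bit per slot, rounded up to bytes
    data = bytearray()
    for i, v in enumerate(values):
        if v is not None:
            # LSB-first numbering: bit (i % 8) of byte (i // 8) marks slot i as valid.
            bitmap[i // 8] |= 1 << (i % 8)
        # Null slots still occupy 4 bytes; this sketch writes 0 as a placeholder.
        data += struct.pack("<i", 0 if v is None else v)
    return bytes(bitmap), bytes(data)


def decode_int32_column(bitmap: bytes, data: bytes, length: int) -> List[Optional[int]]:
    """Rebuild the Python list from the two buffers (hypothetical helper)."""
    out: List[Optional[int]] = []
    for i in range(length):
        valid = (bitmap[i // 8] >> (i % 8)) & 1
        (v,) = struct.unpack_from("<i", data, i * 4)  # fixed-width slot at offset i * 4
        out.append(v if valid else None)
    return out


column = [1, None, 3, 4]
bitmap, data = encode_int32_column(column)
roundtrip = decode_int32_column(bitmap, data, len(column))
```

Because every slot has a fixed width, slot `i` can be read directly at byte offset `i * 4` without scanning earlier values. That fixed-offset access is what makes this kind of layout convenient for analytic scans and for sharing buffers between processes without re-encoding.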
The official Arrow libraries in this repository are in different stages of
implementing the Arrow format and related features. See our current
[feature matrix](https://arrow.apache.org/docs/dev/status.html)
on git main.

## How to Contribute
Please read our latest [project contribution guide][5].
## Getting involved
Even if you do not plan to contribute to Apache Arrow itself or Arrow
integrations in other projects, we'd be happy to have you involved:

- Join the mailing list: send an email to
[[email protected]][1]. Share your ideas and use cases for the
project.
- Follow our activity on [GitHub issues][3]
- [Learn the format][2]
- Contribute code to one of the reference implementations

[1]: mailto:[email protected]
[2]: https://github.com/apache/arrow/tree/main/format
[3]: https://github.com/apache/arrow/issues
[4]: https://github.com/apache/arrow
[5]: https://arrow.apache.org/docs/dev/developers/index.html