Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ArroyoSystems/arroyo
Distributed stream processing engine in Rust
data data-stream-processing dev-tools infrastructure kafka rust sql stream-processing stream-processing-engine
- Host: GitHub
- URL: https://github.com/ArroyoSystems/arroyo
- Owner: ArroyoSystems
- License: apache-2.0
- Created: 2023-03-31T17:41:56.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2024-10-29T22:39:29.000Z (2 months ago)
- Last Synced: 2024-10-30T00:46:07.603Z (2 months ago)
- Topics: data, data-stream-processing, dev-tools, infrastructure, kafka, rust, sql, stream-processing, stream-processing-engine
- Language: Rust
- Homepage: https://arroyo.dev
- Size: 13.6 MB
- Stars: 3,732
- Watchers: 42
- Forks: 215
- Open Issues: 64
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE-APACHE
Awesome Lists containing this project
- awesome-rust - ArroyoSystems/arroyo - High-performance real-time analytics in Rust and SQL [![CI](https://github.com/ArroyoSystems/arroyo/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/ArroyoSystems/arroyo/actions) (Libraries / Data streaming)
- awesome-repositories - ArroyoSystems/arroyo - Distributed stream processing engine in Rust (Rust)
- my-awesome - ArroyoSystems/arroyo - stream-processing,dev-tools,infrastructure,kafka,rust,sql,stream-processing,stream-processing-engine pushed_at:2024-12 star:3.9k fork:0.2k Distributed stream processing engine in Rust (Rust)
- fucking-awesome-rust - ArroyoSystems/arroyo - High-performance real-time analytics in Rust and SQL [![CI](https://github.com/ArroyoSystems/arroyo/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/ArroyoSystems/arroyo/actions) (Libraries / Data streaming)
README
Arroyo Cloud | Getting started | Docs | Discord | Website
[Arroyo](https://arroyo.dev) is a distributed stream processing engine written in Rust, designed to efficiently
perform stateful computations on streams of data. Unlike traditional batch processing, streaming engines can operate
on both bounded and unbounded sources, emitting results as soon as they are available.

In short: Arroyo lets you ask complex questions of high-volume real-time data with subsecond results.
![running job](https://raw.githubusercontent.com/ArroyoSystems/arroyo/760aabdbdb019d95f0c5ebb60933233aa735f830/images/header_image.png)
## Features
* 🦀 SQL and Rust pipelines
* 🚀 Scales up to millions of events per second
* 🪟 Stateful operations like windows and joins
* 🔥 State checkpointing for fault-tolerance and recovery of pipelines
* 🕒 Timely stream processing via the [Dataflow model](https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/)
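
To make "windows" and watermark-driven emission more concrete, here is a minimal Python sketch of a tumbling-window count in the spirit of the Dataflow model. It is an illustration only, not Arroyo's API or implementation: the function name `tumbling_counts`, the `(event_time, key)` event shape, and the simple "largest timestamp seen minus a fixed lateness" watermark are all assumptions made for this example.

```python
from collections import defaultdict


def tumbling_counts(events, window_size, max_lateness=0):
    """Yield (window_start, key, count) tuples as windows close.

    events: iterable of (event_time, key) pairs, possibly out of order.
    The watermark here is the largest event time seen so far minus
    max_lateness (an assumption of this sketch). A window
    [start, start + window_size) closes once the watermark reaches its end.
    Events that arrive for an already-closed window are dropped.
    """
    counts = defaultdict(int)  # (window_start, key) -> count; the operator's state
    watermark = float("-inf")

    for event_time, key in events:
        start = event_time - event_time % window_size
        if start + window_size <= watermark:
            continue  # late event: its window was already emitted
        counts[(start, key)] += 1
        watermark = max(watermark, event_time - max_lateness)
        # Emit every window whose end is at or before the watermark.
        ready = sorted(k for k in counts if k[0] + window_size <= watermark)
        for k in ready:
            yield (k[0], k[1], counts.pop(k))

    # Bounded input: flush windows that are still open at end of stream.
    for k in sorted(counts):
        yield (k[0], k[1], counts[k])
```

For example, with `window_size=10` the events `[(1, "a"), (3, "b"), (2, "a"), (11, "a"), (4, "a"), (12, "b"), (25, "a")]` produce `(0, "a", 2)` and `(0, "b", 1)` as soon as the event at time 11 arrives, and the late event `(4, "a")` is dropped. Raising `max_lateness` trades result latency for tolerance of out-of-order data.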
## Use cases
Some example use cases include:
* Detecting fraud and security incidents
* Real-time product and business analytics
* Real-time ingestion into your data warehouse or data lake
* Real-time ML feature generation

## Why Arroyo
There are already a number of existing streaming engines out there, including [Apache Flink](https://flink.apache.org),
[Spark Streaming](https://spark.apache.org/docs/latest/streaming-programming-guide.html), and
[Kafka Streams](https://kafka.apache.org/documentation/streams/). Why create a new one?

* _Serverless operations_: Arroyo pipelines are designed to run in modern cloud environments, supporting seamless scaling,
recovery, and rescheduling
* _High performance SQL_: SQL is a first-class concern, with consistently excellent performance
* _Designed for non-experts_: Arroyo cleanly separates the pipeline APIs from its internal implementation. You don't
need to be a streaming expert to build real-time data pipelines.

## Installing
Arroyo ships as a single binary. You can install it locally on macOS using Homebrew:
```shellsession
brew install arroyosystems/tap/arroyo
```

or on macOS or Linux with this script:
```shellsession
curl -LsSf https://arroyo.dev/install.sh | sh
```

or you can download a binary for your platform from the [releases page](https://github.com/ArroyoSystems/arroyo/releases).
Once you have Arroyo installed, start a cluster with
```shellsession
$ arroyo cluster
```

You can also run a cluster in Docker, with
```shellsession
docker run -p 5115:5115 \
ghcr.io/arroyosystems/arroyo:latest
```

Then, load the Web UI at http://localhost:5115.
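
If you start the cluster from a script, it can help to wait until the Web UI port accepts connections before opening it. The following is a small Python sketch of such a check, not something shipped with Arroyo: the helper name `wait_for_port` and its timeout defaults are assumptions, and the port `5115` is taken from the Docker command above.

```python
import socket
import time


def wait_for_port(host="localhost", port=5115, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; 5115 is the Web UI port used in
    the Docker example, the other defaults are arbitrary.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port()` with no arguments polls `localhost:5115` for up to 30 seconds. It only checks that the port is open, not that the UI has finished loading.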
For a more in-depth guide, see the [getting started guide](https://doc.arroyo.dev/getting-started).
Once you have Arroyo running, follow the [tutorial](https://doc.arroyo.dev/tutorial) to create your first real-time
pipeline.

## Developing Arroyo
We love contributions from the community! See the [developer setup](https://doc.arroyo.dev/developing/dev-setup) guide
to get started, and reach out to the team on [Discord](https://discord.gg/cjCr5rVmyR) or create an issue.

## Community
* [Discord](https://discord.gg/cjCr5rVmyR): support and project discussion
* [GitHub issues](https://github.com/ArroyoSystems/arroyo/issues): bugs and feature requests
* [Arroyo Blog](https://arroyo.dev/blog): updates from the Arroyo team

## Arroyo Enterprise
Running in production? Arroyo Systems provides [enterprise features and support](https://www.arroyo.dev/enterprise) for
Arroyo users. Get in touch at [[email protected]](mailto:[email protected]).