{"id":27306214,"url":"https://github.com/arkflow-rs/arkflow","last_synced_at":"2026-02-24T23:01:49.175Z","repository":{"id":280072180,"uuid":"940905405","full_name":"arkflow-rs/arkflow","owner":"arkflow-rs","description":"High performance Rust stream processing engine seamlessly integrates AI capabilities, providing powerful real-time data processing and intelligent analysis. ","archived":false,"fork":false,"pushed_at":"2026-02-16T16:29:02.000Z","size":4468,"stargazers_count":1249,"open_issues_count":27,"forks_count":41,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-02-16T23:56:17.150Z","etag":null,"topics":["ai","arkflow","datafusion","deep-learning","duckdb","flow","kafka","machine-learning","mysql","nats","postgresql","redis","rust","rust-lang","sql","sqlite","stream","tokio","tokio-rs","websocket"],"latest_commit_sha":null,"homepage":"https://arkflow-rs.com/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/arkflow-rs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-03-01T03:02:55.000Z","updated_at":"2026-02-16T16:28:59.000Z","dependencies_parsed_at":"2025-04-27T01:21:22.217Z","dependency_job_id":"4366b7bc-eff0-449f-84a5-5034ed9be4d0","html_url":"https://github.com/arkflow-rs/arkflow","commit_stats":null,"previous_names":["chenquan/rtflow","chenquan/rsflow","chenquan/xflow","chenquan/arkflow","ark-flow/arkflow","arkflow-rs/arkflow"],"tags_count":9,"template":false,"template_full_n
ame":null,"purl":"pkg:github/arkflow-rs/arkflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arkflow-rs%2Farkflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arkflow-rs%2Farkflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arkflow-rs%2Farkflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arkflow-rs%2Farkflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/arkflow-rs","download_url":"https://codeload.github.com/arkflow-rs/arkflow/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arkflow-rs%2Farkflow/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29804137,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-24T22:43:48.403Z","status":"ssl_error","status_checked_at":"2026-02-24T22:43:18.536Z","response_time":75,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","arkflow","datafusion","deep-learning","duckdb","flow","kafka","machine-learning","mysql","nats","postgresql","redis","rust","rust-lang","sql","sqlite","stream","tokio","tokio-rs","websocket"],"created_at":"2025-04-12T03:48:50.059Z","updated_at":"2026-02-24T23:01:49.169Z","avatar_url":"https://github.com/arkflow-rs.png","language":"Rust","readme":"# ArkFlow\n\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"center\" width=\"150px\" src=\"images/logo.svg\"\u003e\n\u003cp align=\"center\"\u003e\n\nEnglish | [中文](README_zh.md)\n\n[![Rust](https://github.com/arkflow-rs/arkflow/actions/workflows/rust.yml/badge.svg)](https://github.com/arkflow-rs/arkflow/actions/workflows/rust.yml)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)\n\n[Latest docs](https://arkflow-rs.com/docs/intro) | [Dev docs](https://arkflow-rs.com/docs/next/intro)\n\n\u003ca href=\"https://www.producthunt.com/posts/arkflow?embed=true\u0026utm_source=badge-featured\u0026utm_medium=badge\u0026utm_souce=badge-arkflow\" target=\"_blank\"\u003e\u003cimg src=\"https://api.producthunt.com/widgets/embed-image/v1/featured.svg?post_id=942804\u0026theme=light\u0026t=1743136262336\" alt=\"ArkFlow - High\u0026#0045;performance\u0026#0032;rust\u0026#0032;stream\u0026#0032;processing\u0026#0032;engine | Product Hunt\" style=\"width: 250px; height: 54px;\" width=\"250\" height=\"54\" /\u003e\u003c/a\u003e\n\nHigh performance Rust stream processing engine seamlessly integrates AI capabilities, 
\nproviding powerful real-time data processing and intelligent analysis.\nIt not only supports multiple input/output sources and processors, but also makes it easy to load and run machine learning models,\nenabling inference on streaming data, anomaly detection, and complex event processing.\n\n## Cloud Native Landscape\n\n\u003cp float=\"left\"\u003e\n\u003cimg src=\"images/cncf-logo.svg\" width=\"200\"/\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\n\u003cimg src=\"images/cncf-landscape-logo.svg\" width=\"150\"/\u003e\n\u003c/p\u003e\n\nArkFlow is listed in the [CNCF Cloud Native Landscape](https://landscape.cncf.io/?item=app-definition-and-development--streaming-messaging--arkflow).\n\n## Features\n\n- **High Performance**: Built on Rust and the Tokio async runtime, offering excellent performance and low latency\n- **Multiple Data Sources**: Support for Kafka, MQTT, HTTP, files, and other input/output sources\n- **Powerful Processing Capabilities**: Built-in SQL queries, Python scripts, JSON processing, Protobuf encoding/decoding, batch\n  processing, and other processors\n- **Extensible**: Modular design, easy to extend with new input, buffer, output, and processor components\n\n## Installation\n\n### Building from Source\n\n```bash\n# Clone the repository\ngit clone https://github.com/arkflow-rs/arkflow.git\ncd arkflow\n\n# Build the project\ncargo build --release\n\n# Run tests\ncargo test\n```\n\n## Quick Start\n\n1. Create a configuration file `config.yaml`:\n\n```yaml\nlogging:\n  level: info\nstreams:\n  - input:\n      type: \"generate\"\n      context: '{ \"timestamp\": 1625000000000, \"value\": 10, \"sensor\": \"temp_1\" }'\n      interval: 1s\n      batch_size: 10\n\n    pipeline:\n      thread_num: 4\n      processors:\n        - type: \"json_to_arrow\"\n        - type: \"sql\"\n          query: \"SELECT * FROM flow WHERE value \u003e= 10\"\n\n    output:\n      type: \"stdout\"\n    error_output:\n      type: \"stdout\"\n```\n\n2. 
Run ArkFlow:\n\n```bash\n./target/release/arkflow --config config.yaml\n```\n\n## Configuration Guide\n\nArkFlow is configured with YAML files. The main configuration items are:\n\n### Top-level Configuration\n\n```yaml\nlogging:\n  level: info  # Log level: debug, info, warn, error\n\nstreams: # Stream definition list\n  - input:      # Input configuration\n    # ...\n    pipeline:   # Processing pipeline configuration\n    # ...\n    output:     # Output configuration\n    # ...\n    error_output: # Error output configuration\n    # ...\n    buffer:     # Buffer configuration\n    # ...\n```\n\n### Input Components\n\nArkFlow supports multiple input sources:\n\n- **Kafka**: Read data from Kafka topics\n- **MQTT**: Subscribe to messages from MQTT topics\n- **HTTP**: Receive data via HTTP\n- **File**: Read data from files (CSV, JSON, Parquet, Avro, Arrow) using SQL\n- **Generator**: Generate test data\n- **Database**: Query data from databases (MySQL, PostgreSQL, SQLite, DuckDB)\n- **NATS**: Subscribe to messages from NATS topics\n- **Redis**: Subscribe to messages from Redis channels or lists\n- **WebSocket**: Subscribe to messages from WebSocket connections\n- **Modbus**: Read data from Modbus devices\n\nExample:\n\n```yaml\ninput:\n  type: kafka\n  brokers:\n    - localhost:9092\n  topics:\n    - test-topic\n  consumer_group: test-group\n  client_id: arkflow\n  start_from_latest: true\n```\n\n### Processors\n\nArkFlow provides multiple data processors:\n\n- **JSON**: JSON data processing and transformation\n- **SQL**: Process data using SQL queries\n- **Protobuf**: Protobuf encoding/decoding\n- **Batch Processing**: Process messages in batches\n- **VRL**: Process data using [VRL](https://vector.dev/docs/reference/vrl/)\n\nExample:\n\n```yaml\npipeline:\n  thread_num: 4\n  processors:\n    - type: json_to_arrow\n    - type: sql\n      query: \"SELECT * FROM flow WHERE value \u003e= 10\"\n```\n\n### Output Components\n\nArkFlow supports 
multiple output targets:\n\n- **Kafka**: Write data to Kafka topics\n- **MQTT**: Publish messages to MQTT topics\n- **HTTP**: Send data via HTTP\n- **Standard Output**: Output data to the console\n- **Drop**: Discard data\n- **NATS**: Publish messages to NATS topics\n\nExample:\n\n```yaml\noutput:\n  type: kafka\n  brokers:\n    - localhost:9092\n  topic:\n    type: value\n    value:\n      type: value\n      value: test-topic\n  client_id: arkflow-producer\n```\n\n### Error Output Components\n\nArkFlow supports multiple error output targets:\n\n- **Kafka**: Write error data to Kafka topics\n- **MQTT**: Publish error messages to MQTT topics\n- **HTTP**: Send error data via HTTP\n- **Standard Output**: Output error data to the console\n- **Drop**: Discard error data\n- **NATS**: Publish error messages to NATS topics\n\nExample:\n\n```yaml\nerror_output:\n  type: kafka\n  brokers:\n    - localhost:9092\n  topic:\n    type: value\n    value: error-topic\n  client_id: error-arkflow-producer\n```\n\n### Buffer Components\n\nArkFlow provides buffers to handle backpressure and temporary storage of messages:\n\n- **Memory Buffer**: An in-memory buffer for high-throughput scenarios and window aggregation.\n- **Session Window**: Groups messages into sessions based on gaps in activity. A session window closes after a\n  configurable period of inactivity.\n- **Sliding Window**: A time-based windowing mechanism for processing message batches. It implements a sliding window\n  algorithm with configurable window size, slide interval, and slide size.\n- **Tumbling Window**: The Tumbling Window buffer component provides a fixed-size, non-overlapping windowing mechanism\n  for processing message batches. 
It implements a tumbling window algorithm with configurable interval settings.\n\nExample:\n\n```yaml\nbuffer:\n  type: memory\n  capacity: 10000  # Maximum number of messages to buffer\n  timeout: 10s  # Maximum time to buffer messages\n```\n\n## Examples\n\n### Kafka to Kafka Data Processing\n\n```yaml\nstreams:\n  - input:\n      type: kafka\n      brokers:\n        - localhost:9092\n      topics:\n        - test-topic\n      consumer_group: test-group\n\n    pipeline:\n      thread_num: 4\n      processors:\n        - type: json_to_arrow\n        - type: sql\n          query: \"SELECT * FROM flow WHERE value \u003e 100\"\n\n    output:\n      type: kafka\n      brokers:\n        - localhost:9092\n      topic:\n        type: value\n        value: test-topic\n```\n\n### Generate Test Data and Process\n\n```yaml\nstreams:\n  - input:\n      type: \"generate\"\n      context: '{ \"timestamp\": 1625000000000, \"value\": 10, \"sensor\": \"temp_1\" }'\n      interval: 1ms\n      batch_size: 10000\n\n    pipeline:\n      thread_num: 4\n      processors:\n        - type: \"json_to_arrow\"\n        - type: \"sql\"\n          query: \"SELECT count(*) FROM flow WHERE value \u003e= 10 GROUP BY sensor\"\n\n    output:\n      type: \"stdout\"\n```\n\n## Users\n\n- Conalog (Country: South Korea)\n\n## ArkFlow Plugin\n\n[ArkFlow Plugin Examples](https://github.com/arkflow-rs/arkflow-plugin-examples)\n\n## License\n\nArkFlow is licensed under the [Apache License 2.0](LICENSE).\n\n## Community\n\nDiscord: https://discord.gg/CwKhzb8pux\n\nIf you like this project, or are using it to learn or to start your own solution, please give it a star ⭐. 
Thanks!","funding_links":[],"categories":["Table of Contents","Libraries","Recently Updated","\u003ca name=\"Rust\"\u003e\u003c/a\u003eRust"],"sub_categories":["Streaming Engine","Data streaming","[May 11, 2025](/content/2025/05/11/README.md)"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farkflow-rs%2Farkflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farkflow-rs%2Farkflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farkflow-rs%2Farkflow/lists"}