{"id":18489093,"url":"https://github.com/hachreak/stepflow","last_synced_at":"2025-08-10T06:43:46.055Z","repository":{"id":139147518,"uuid":"97038987","full_name":"hachreak/stepflow","owner":"hachreak","description":"Streaming Engine that implements Flume patterns.","archived":false,"fork":false,"pushed_at":"2018-01-25T14:18:49.000Z","size":97,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-09T15:57:14.134Z","etag":null,"topics":["aggregate","collection","computing","distributed","dsl","elasticsearch","erlang","erlang-libraries","erlang-library","flume","flume-patterns","mnesia","pipeline","rabbitmq","reliable","stepflow","streaming"],"latest_commit_sha":null,"homepage":"","language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hachreak.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-12T18:13:23.000Z","updated_at":"2019-06-04T00:18:00.000Z","dependencies_parsed_at":null,"dependency_job_id":"b7a1e779-0b43-4cbb-9eed-f149b5bfc851","html_url":"https://github.com/hachreak/stepflow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hachreak/stepflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hachreak%2Fstepflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hachreak%2Fstepflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hachreak%2Fstepflow/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hachreak%2Fstepflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hachreak","download_url":"https://codeload.github.com/hachreak/stepflow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hachreak%2Fstepflow/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269688010,"owners_count":24459398,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-10T02:00:08.965Z","response_time":71,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aggregate","collection","computing","distributed","dsl","elasticsearch","erlang","erlang-libraries","erlang-library","flume","flume-patterns","mnesia","pipeline","rabbitmq","reliable","stepflow","streaming"],"created_at":"2024-11-06T12:55:13.513Z","updated_at":"2025-08-10T06:43:46.012Z","avatar_url":"https://github.com/hachreak.png","language":"Erlang","funding_links":[],"categories":[],"sub_categories":[],"readme":"stepflow\n========\n\n[![Build Status](https://travis-ci.org/hachreak/stepflow.svg?branch=master)](https://travis-ci.org/hachreak/stepflow)\n\nAn OTP application that implements Flume patterns.\n\nIt can be useful if you need to collect, aggregate, transform, move large\namount of data from/to different sources/destinations.\n\nImplements ingest and real-time processing pipelines.\n\nYou can 
define `agents` that form a pipeline for events.\nAn event represents a unit of information.\nEvery `agent` is made of one source and one or more sinks.\n\nEach source-sink pair is connected by a `channel`.\nAfter a `source` and before every `sink` you can inject as many interceptors as\nyou want.\nAn `interceptor` can enrich, transform, aggregate, or reject events.\n\nThere are different channels: in RAM, on a mnesia table, on RabbitMQ.\n\nEvery channel is designed to take advantage of its underlying technology and\nto maximize the reliability of the system when something goes wrong, depending\non how persistent its storage is.\n\nAll the events are staged inside the channel until they are successfully stored\ninside the next agent or in a terminal repository (e.g. database, file, ...).\n\nBuild\n-----\n\n    $ rebar3 compile\n\nRun demo 1\n----------\n\nTwo agents connected:\n\n```\n  +-----------------------------+        +-----------------------------+\n  |         Agent 1             |        |            Agent 2          |\n  |                             |        |                             |\n  |Source \u003c--\u003e Channel \u003c--\u003e Sink| \u003c----\u003e |Source \u003c--\u003e Channel \u003c--\u003e Sink|\n  |                             |        |                             |\n  +-----------------------------+        +-----------------------------+\n```\n\n    $ rebar3 auto --sname pippo --apps stepflow --config priv/example.config\n\n    # Run Agent 1 and Agent 2\n\n    1\u003e [{_, {_, PidS1, _}}, {_, {_, PidS2, _}}] = stepflow_config:run(\"\n          interceptor Counter = stepflow_interceptor_counter#{}.\n          source FromMsg = stepflow_source_message[Counter]#{}.\n          channel Memory = stepflow_channel_memory#{}.\n          sink Echo = stepflow_sink_echo[Counter]#{}.\n\n          flow Agent2: FromMsg |\u003e Memory |\u003e Echo.\n\n          sink Connector = stepflow_sink_message[Counter]#{source =\u003e Agent2}.\n\n          flow Agent1: 
FromMsg |\u003e Memory |\u003e Connector.\n    \").\n\n    # Send a message from Agent 1 to Agent 2\n    2\u003e stepflow_source_message:append(\n        PidS1, [stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e)]).\n\nRun demo 2\n----------\n\nOne source and two sinks (passing from memory and rabbitmq):\n\n```\n  +-------------------------------------------+\n  |         Agent 1                           |\n  |                                           |\n  |Source \u003c--\u003e Channel1 (memory)   \u003c--\u003e Sink1 |\n  |        |                                  |\n  |        +-\u003e Channel2 (rabbitmq) \u003c--\u003e Sink2 |\n  +-------------------------------------------+\n```\n\n    $ rebar3 auto --sname pippo --apps stepflow --config priv/example.config\n\n    1\u003e [{_, {_, PidS, _}}] = stepflow_config:run(\"\n        \u003c\u003c\u003c\n          FilterFun = fun(Events) -\u003e\n            lists:any(fun(E) -\u003e E == \u003c\u003c\\\"filtered\\\"\u003e\u003e end, Events)\n          end.\n        \u003e\u003e\u003e\n\n        interceptor Filter = stepflow_interceptor_filter#{filter =\u003e FilterFun}.\n        interceptor Echo = stepflow_interceptor_echo#{}.\n        source FromMsg = stepflow_source_message[]#{}.\n        channel Memory = stepflow_channel_memory#{}.\n        channel Rabbitmq = stepflow_channel_rabbitmq#{}.\n        sink EchoMemory = stepflow_sink_echo[Echo]#{}.\n        sink EchoRabbitmq = stepflow_sink_echo[Filter]#{}.\n\n        flow Agent: FromMsg |\u003e Memory   |\u003e EchoMemory;\n                            |\u003e Rabbitmq |\u003e EchoRabbitmq.\n        \").\n\n    \u003e stepflow_source_message:append(PidS, [\u003c\u003c\"hello\"\u003e\u003e]).\n    \u003e % filtered message!\n    \u003e stepflow_source_message:append(PidS, [\u003c\u003c\"filtered\"\u003e\u003e]).\n\nRun demo 3\n----------\n\nCount events but skip body `\u003c\u003c\"found\"\u003e\u003e`:\n\n    1\u003e [{_, {_, PidS, _}}] = 
stepflow_config:run(\"\n        \u003c\u003c\u003c\n        FilterFun = fun(Events) -\u003e\n            lists:any(fun(Event) -\u003e\n                stepflow_event:body(Event) == \u003c\u003c\\\"found\\\"\u003e\u003e\n              end, Events)\n          end.\n        \u003e\u003e\u003e\n\n        interceptor Counter = stepflow_interceptor_counter#{\n          header =\u003e mycounter, eval =\u003e FilterFun\n        }.\n        interceptor Show = stepflow_interceptor_echo#{}.\n        source FromMsg = stepflow_source_message[]#{}.\n        channel Rabbitmq = stepflow_channel_rabbitmq#{}.\n        sink Echo = stepflow_sink_echo[Counter, Show]#{}.\n\n        flow Agent: FromMsg |\u003e Rabbitmq |\u003e Echo.\n        \").\n\n    # One event that is counted\n    stepflow_source_message:append(PidS, [stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e)]).\n\n    # One event that is NOT counted\n    stepflow_source_message:append(PidS, [stepflow_event:new(#{}, \u003c\u003c\"found\"\u003e\u003e)]).\n\nRun demo 4\n----------\n\nHandle a bulk of 7 events with a window of 10 seconds:\n\n    1\u003e [{_, {_, PidS, _}}] = stepflow_config:run(\"\n        interceptor Counter = stepflow_interceptor_counter#{}.\n        source FromMsg = stepflow_source_message[Counter]#{}.\n        channel Buffer = stepflow_channel_mnesia#{\n            flush_period =\u003e 5000, capacity =\u003e 7, table =\u003e mytable\n        }.\n        sink Echo = stepflow_sink_echo[]#{}.\n        flow Squeeze: FromMsg |\u003e Buffer |\u003e Echo.\n    \").\n\n    # Send multiple messages quickly to fill the buffer!\n    # You will see that they all arrive together.\n    7\u003e stepflow_source_message:append(PidS, [stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e)]).\n    8\u003e stepflow_source_message:append(PidS, [stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e)]).\n    9\u003e stepflow_source_message:append(PidS, [stepflow_event:new(#{}, 
\u003c\u003c\"hello\"\u003e\u003e)]).\n\nRun demo 5\n----------\n\nAggregate events in a single one:\n\n    1\u003e [{_, {_, PidS, _}}] = stepflow_config:run(\"\n        \u003c\u003c\u003c\n        SqueezeFun = fun(Events) -\u003e\n                 BodyNew = lists:foldr(fun(Event, Acc) -\u003e\n                     Body = stepflow_event:body(Event),\n                     \u003c\u003c Body/binary, \u003c\u003c\\\" \\\"\u003e\u003e/binary, Acc/binary \u003e\u003e\n                   end, \u003c\u003c\\\"\\\"\u003e\u003e, Events),\n                 {ok, [stepflow_event:new(#{}, BodyNew)]}\n               end.\n        \u003e\u003e\u003e\n\n        interceptor Squeezer = stepflow_interceptor_transform#{\n          eval =\u003e SqueezeFun\n        }.\n        source FromMsg = stepflow_source_message[Squeezer]#{}.\n        channel Mnesia = stepflow_channel_mnesia#{\n          flush_period =\u003e 10, capacity =\u003e 2, table =\u003e pippo\n        }.\n        sink Echo = stepflow_sink_echo[]#{}.\n\n        flow Aggretator: FromMsg |\u003e Mnesia |\u003e Echo.\n        \").\n\n    8\u003e stepflow_source_message:append(PidS, [\n         stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e),\n         stepflow_event:new(#{}, \u003c\u003c\" world\"\u003e\u003e)\n       ]).\n\nRun demo 6\n----------\n\nIndex events in ElasticSearch.\n\n```\n          +------------------------------------------------------------------+\n          |                              Agent 1                             |\nUser      |                                                                  |\n |        |     Source \u003c---------------\u003e Channel \u003c--------\u003e Sink             |\n +-------\u003e| (erlang message)             (memory)       (index inside ES)    |\n   SEND   |                                                                  |\n   Event  +------------------------------------------------------------------+\n \u003c\u003c\"hello\"\u003e\u003e\n```\n\n    $ rebar3 
shell --apps stepflow_sink_elasticsearch\n\n    1\u003e [{_, {_, PidS, _}}] = stepflow_config:run(\"\n          interceptor Counter = stepflow_interceptor_counter#{}.\n          source FromMsg = stepflow_source_message[Counter]#{}.\n          channel Memory = stepflow_channel_memory#{}.\n          sink Elasticsearch = stepflow_sink_elasticsearch[]#{\n            host =\u003e \u003c\u003c\\\"localhost\\\"\u003e\u003e, port =\u003e 9200, index =\u003e \u003c\u003c\\\"myindex\\\"\u003e\u003e\n          }.\n\n          flow Agent: FromMsg |\u003e Memory |\u003e Elasticsearch.\n       \").\n\n    2\u003e stepflow_source_message:append(\n          PidS, [stepflow_event:new(#{}, \u003c\u003c\"hello\"\u003e\u003e)]).\n\n\nNote\n----\n\nYou can run `RabbitMQ` with Docker:\n\n    $ docker run --rm --hostname my-rabbit --name some-rabbit -p 5672:5672 -p 15672:15672 rabbitmq:3-management\n\nThen open the web interface:\n\n    $ firefox http://0.0.0.0:15672/#/\n\nYou can run `ElasticSearch` with Docker:\n\n    $ docker pull docker.elastic.co/elasticsearch/elasticsearch:5.5.0\n    $ docker run -p 9200:9200 -e \"http.host=0.0.0.0\" -e \"transport.host=127.0.0.1\" -e \"xpack.security.enabled=false\" docker.elastic.co/elasticsearch/elasticsearch:5.5.0\n\nStatus\n------\n\nThe module is still quite unstable because of heavy development.\nThe API may change until at least v0.1.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhachreak%2Fstepflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhachreak%2Fstepflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhachreak%2Fstepflow/lists"}