{"id":13807491,"url":"https://paragroup.github.io/WindFlow/","last_synced_at":"2025-05-14T00:31:38.851Z","repository":{"id":68999863,"uuid":"154855323","full_name":"ParaGroup/WindFlow","owner":"ParaGroup","description":"A C++17 Data Stream Processing Parallel Library for Multicores and GPUs","archived":false,"fork":false,"pushed_at":"2025-03-06T07:11:49.000Z","size":51326,"stargazers_count":81,"open_issues_count":12,"forks_count":19,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-08T04:03:09.968Z","etag":null,"topics":["cuda","gpu","gpu-acceleration","gpu-computing","gpu-programming","multi-core","multicore","multithreading","parallel-computing","parallel-patterns","parallel-programming","parallelism","sliding-windows","stream","stream-api","stream-processing","streaming","streaming-api","streaming-data","streams"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ParaGroup.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.LGPL","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-26T15:26:30.000Z","updated_at":"2025-03-06T06:53:14.000Z","dependencies_parsed_at":"2023-03-01T07:31:13.953Z","dependency_job_id":"3451b1ec-5679-4992-9b69-622d4600ea7a","html_url":"https://github.com/ParaGroup/WindFlow","commit_stats":null,"previous_names":[],"tags_count":29,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParaGroup%2FWindFlow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParaGroup%2FWindFlow/tags","re
leases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParaGroup%2FWindFlow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ParaGroup%2FWindFlow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ParaGroup","download_url":"https://codeload.github.com/ParaGroup/WindFlow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254046332,"owners_count":22005573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","gpu","gpu-acceleration","gpu-computing","gpu-programming","multi-core","multicore","multithreading","parallel-computing","parallel-patterns","parallel-programming","parallelism","sliding-windows","stream","stream-api","stream-processing","streaming","streaming-api","streaming-data","streams"],"created_at":"2024-08-04T01:01:26.040Z","updated_at":"2025-05-14T00:31:33.833Z","avatar_url":"https://github.com/ParaGroup.png","language":"C++","funding_links":["https://paypal.me/GabrieleMencagli"],"categories":["Table of Contents"],"sub_categories":["Streaming Engine"],"readme":"[![License: LGPL v3](https://img.shields.io/badge/License-LGPL%20v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0)\n[![License: 
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Release](https://img.shields.io/github/release/paragroup/windflow.svg)](https://github.com/paragroup/windflow/releases/latest)\n[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FParaGroup%2FWindFlow\u0026count_bg=%2379C83D\u0026title_bg=%23555555\u0026icon=\u0026icon_color=%23E7E7E7\u0026title=hits\u0026edge_flat=false)](https://hits.seeyoufarm.com)\n[![Say Thanks!](https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg)](https://saythanks.io/to/mencagli@di.unipi.it)\n[![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://paypal.me/GabrieleMencagli)\n\n\u003cp align=\"center\"\u003e\u003cimg src=\"https://paragroup.github.io/WindFlow/img/logo_white.png\" width=\"400\" title=\"WindFlow Logo\"\u003e\u003c/p\u003e\n\n# Introduction\nWindFlow is a C++17 header-only library for parallel data stream processing targeting heterogeneous shared-memory architectures equipped with multi-core CPUs and NVIDIA GPUs. The library provides traditional stream processing operators like map, flatmap, filter, and reduce, as well as window-based operators. The API allows building streaming applications through the \u003cb\u003eMultiPipe\u003c/b\u003e and the \u003cb\u003ePipeGraph\u003c/b\u003e programming constructs. The first is used to create parallel pipelines (with shuffle connections), while the second allows several \u003cb\u003eMultiPipe\u003c/b\u003e instances to be interconnected through \u003cb\u003emerge\u003c/b\u003e and \u003cb\u003esplit\u003c/b\u003e operations, in order to create complex directed acyclic graphs of interconnected operators.\n\nLike popular stream processing engines such as Apache Storm and Flink, WindFlow supports general-purpose streaming applications by enabling operators to run user-defined code. 
The WindFlow runtime system has been designed to be suitable for embedded architectures equipped with low-power multi-core CPUs and integrated NVIDIA GPUs (like the Jetson family of NVIDIA boards). However, it also works well on traditional multi-core servers equipped with discrete NVIDIA GPUs.\n\nAt the moment, WindFlow supports single-node execution only. We are working on a distributed implementation.\n\nThe web site of the library is available at: https://paragroup.github.io/WindFlow/.\n\n# Dependencies\nThe library requires the following dependencies:\n* \u003cstrong\u003ea C++ compiler\u003c/strong\u003e with full support for C++17 (WindFlow tests have been successfully compiled with both GCC and CLANG)\n* \u003cstrong\u003eFastFlow\u003c/strong\u003e version \u003e= 3.0 (https://github.com/fastflow/fastflow)\n* \u003cstrong\u003eCUDA\u003c/strong\u003e (version \u003e= 11.5 is preferred for using operators targeting GPUs)\n* \u003cstrong\u003elibtbb-dev\u003c/strong\u003e required by GPU operators only\n* \u003cstrong\u003elibgraphviz-dev\u003c/strong\u003e and \u003cstrong\u003erapidjson-dev\u003c/strong\u003e when compiling with -DWF_TRACING_ENABLED to report statistics and to use the Web Dashboard for monitoring purposes\n* \u003cstrong\u003elibrdkafka-dev\u003c/strong\u003e for using the integration with Kafka (special Kafka_Source and Kafka_Sink operators)\n* \u003cstrong\u003elibrocksdb-dev\u003c/strong\u003e for using the suite of persistent operators keeping their internal state in RocksDB KVS\n* \u003cstrong\u003edoxygen\u003c/strong\u003e (to generate the documentation)\n\n\u003cb\u003eImportant note about the FastFlow dependency\u003c/b\u003e -\u003e after downloading FastFlow, the user needs to configure the library for the underlying multi-core environment. By default, FastFlow pins its threads onto the cores of the machine. 
To make FastFlow aware of the ordering of cores, and their correspondence in CPUs and NUMA regions, it is important to run (just once) the script \u003cstrong\u003e\"mapping_string.sh\"\u003c/strong\u003e in the folder \u003ctt\u003efastflow/ff\u003c/tt\u003e before compiling your WindFlow programs.\n\n# Macros\nWindFlow and the underlying FastFlow layer come with some important macros that can be used during compilation to enable specific behaviors. Some of them are reported below:\n* \u003cstrong\u003e-DWF_TRACING_ENABLED\u003c/strong\u003e -\u003e enables tracing (logging) at the WindFlow level (operator replicas), and allows streaming applications to continuously report statistics to a Web Dashboard (which is a separate sub-project). Outputs are also written in log files at the end of the processing\n* \u003cstrong\u003e-DTRACE_FASTFLOW\u003c/strong\u003e -\u003e enables tracing (logging) at the FastFlow level (raw threads and FastFlow nodes). Outputs are written in log files at the end of the processing\n* \u003cstrong\u003e-DFF_BOUNDED_BUFFER\u003c/strong\u003e -\u003e enables the use of bounded lock-free queues for pointer passing between threads. Otherwise, queues are unbounded (no backpressure mechanism)\n* \u003cstrong\u003e-DDEFAULT_BUFFER_CAPACITY=VALUE\u003c/strong\u003e -\u003e sets the capacity of the lock-free queues. The default capacity is 2048 entries\n* \u003cstrong\u003e-DNO_DEFAULT_MAPPING\u003c/strong\u003e -\u003e if this macro is enabled, FastFlow threads are not pinned onto CPU cores, but they are scheduled by the Operating System\n* \u003cstrong\u003e-DBLOCKING_MODE\u003c/strong\u003e -\u003e if this macro is enabled, FastFlow queues use the blocking concurrency mode (pushing to a full queue or polling from an empty queue might suspend the underlying thread). 
If not set, waiting conditions are implemented by busy-waiting spin loops.\n\nSome macros are useful to configure the runtime system when GPU operators are utilized in your application. The default version of the GPU support is based on explicit CUDA memory management and overlapped data transfers, which is a version suitable for a wide range of NVIDIA GPU models. However, the developer might want to switch to a different implementation that makes use of the CUDA unified memory support. This can be done by compiling with the macro \u003cstrong\u003e-DWF_GPU_UNIFIED_MEMORY\u003c/strong\u003e. Alternatively, the user can configure the runtime system to use pinned memory on NVIDIA System-on-Chip devices (e.g., Jetson Nano and Jetson Xavier), where pinned memory is directly accessed by CPU and GPU without extra copies. This can be done by compiling with the macro \u003cstrong\u003e-DWF_GPU_PINNED_MEMORY\u003c/strong\u003e.\n\n# Build the Examples\nWindFlow is a header-only template library. To build your applications you have to include the main header of the library (\u003ctt\u003ewindflow.hpp\u003c/tt\u003e). For using the operators targeting GPUs, you further have to include the \u003ctt\u003ewindflow_gpu.hpp\u003c/tt\u003e header file and compile using the \u003ccode\u003envcc\u003c/code\u003e CUDA compiler (or through \u003ccode\u003eclang\u003c/code\u003e with CUDA support). The source code in this repository includes several examples that can be used to understand the use of the API and the advanced features of the library. The examples can be found in the \u003ctt\u003etests\u003c/tt\u003e folder. 
To compile them:\n```\n    $ cd \u003cWINDFLOW_ROOT\u003e\n    $ mkdir ./build\n    $ cd build\n    $ cmake ..\n    $ make -j\u003cno_cores\u003e # compile all the tests (not the doxygen documentation)\n    $ make all_cpu -j\u003cno_cores\u003e # compile only CPU tests\n    $ make all_gpu -j\u003cno_cores\u003e # compile only GPU tests\n    $ make docs # generate the doxygen documentation (if doxygen has been installed)\n```\n\nIn order to use the Kafka integration, consisting of special Source and Sink operators, the developer has to include the additional header \u003ctt\u003ekafka/windflow_kafka.hpp\u003c/tt\u003e and properly link the library \u003ctt\u003elibrdkafka-dev\u003c/tt\u003e. Analogously, to use persistent operators, you need to include the header \u003ctt\u003epersistent/windflow_rocksdb.hpp\u003c/tt\u003e and link the library \u003ctt\u003elibrocksdb-dev\u003c/tt\u003e.\n\n# Docker Images\nTwo Docker images are available in the WindFlow GitHub repository. The images contain all the synthetic tests compiled and ready to be executed. To build the first image (the one without tests using GPU operators) execute the following commands:\n```\n    $ cd \u003cWINDFLOW_ROOT\u003e\n    $ cd dockerimages\n    $ docker build -t windflow_nogpu -f Dockerfile_nogpu .\n    $ docker run windflow_nogpu ./bin/graph_tests/test_graph_1 -r 1 -l 10000 -k 10\n```\nThe last command executes one of the synthetic experiments (test_graph_1). You can execute any of the compiled tests in the same manner.\n\nThe second image contains all synthetic tests with GPU operators. To use your GPU device with Docker, please follow the guidelines on the following page (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). 
Then, you can build the image and run the container as follows:\n```\n    $ cd \u003cWINDFLOW_ROOT\u003e\n    $ cd dockerimages\n    $ docker build -t windflow_gpu -f Dockerfile_gpu .\n    $ docker run --gpus all windflow_gpu ./bin/graph_tests_gpu/test_graph_gpu_1 -r 1 -l 10000 -k 10\n```\nAgain, the last command executes one of the synthetic experiments (test_graph_gpu_1). You can execute any of the compiled tests in the same manner.\n\n# Web Dashboard\nWindFlow has its own Web Dashboard that can be used to profile and monitor the execution of running WindFlow applications. The dashboard code is in the sub-folder \u003ctt\u003eWINDFLOW_ROOT/dashboard\u003c/tt\u003e. It is a Java package based on Spring (for the Web Server) and developed using React for the front-end part. To start the Web Dashboard run the following commands:\n```\n    cd \u003cWINDFLOW_ROOT\u003e/dashboard/Server\n    mvn spring-boot:run\n```\nThe web server listens on the default port \u003ctt\u003e8080\u003c/tt\u003e of the machine. To change the port, and other configuration parameters, users can modify the configuration file \u003ctt\u003eWINDFLOW_ROOT/dashboard/Server/src/main/resources/application.properties\u003c/tt\u003e for the Spring server (e.g., to change the HTTP port), and the file \u003ctt\u003eWINDFLOW_ROOT/dashboard/Server/src/main/java/com/server/CustomServer/Configuration/config.json\u003c/tt\u003e for the internal server receiving reports of statistics from the WindFlow applications (e.g., to change the port used by applications to report statistics to the dashboard).\n\nWindFlow applications compiled with the macro \u003cstrong\u003e-DWF_TRACING_ENABLED\u003c/strong\u003e try to connect to the Web Dashboard and report statistics to it every second. By default, the applications assume that the dashboard is running on the local machine. 
To change the hostname and the port number, developers can use the macros \u003cstrong\u003eWF_DASHBOARD_MACHINE=hostname/ip_addr\u003c/strong\u003e and \u003cstrong\u003eWF_DASHBOARD_PORT=port_number\u003c/strong\u003e.\n\n# About the License\nSince version 3.1.0, WindFlow is released under a dual license: \u003cstrong\u003eLGPL-3\u003c/strong\u003e and \u003cstrong\u003eMIT\u003c/strong\u003e. Programmers should check the licenses of the other libraries used as dependencies.\n\n# Cite our Work\nIn order to cite our work, we kindly ask interested people to use the following references:\n```\n@article{WindFlow,\n author={Mencagli, Gabriele and Torquati, Massimo and Cardaci, Andrea and Fais, Alessandra and Rinaldi, Luca and Danelutto, Marco},\n journal={IEEE Transactions on Parallel and Distributed Systems},\n title={WindFlow: High-Speed Continuous Stream Processing With Parallel Building Blocks},\n year={2021},\n volume={32},\n number={11},\n pages={2748-2763},\n doi={10.1109/TPDS.2021.3073970}\n}\n```\n\n```\n@article{WindFlow-GPU,\n title = {General-purpose data stream processing on heterogeneous architectures with WindFlow},\n journal = {Journal of Parallel and Distributed Computing},\n volume = {184},\n pages = {104782},\n year = {2024},\n issn = {0743-7315},\n doi = {https://doi.org/10.1016/j.jpdc.2023.104782},\n url = {https://www.sciencedirect.com/science/article/pii/S0743731523001521},\n author = {Gabriele Mencagli and Massimo Torquati and Dalvan Griebler and Alessandra Fais and Marco Danelutto},\n}\n```\n\n# Requests for Modifications\nIf you are using WindFlow for your purposes and you are interested in specific modifications of the API (or of the runtime system), please send an email to the maintainer.\n\n# Contributors\nThe main developer and maintainer of WindFlow is [Gabriele Mencagli](mailto:gabriele.mencagli@unipi.it) (Department of Computer Science, University of Pisa, 
Italy).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/paragroup.github.io%2FWindFlow%2F","html_url":"https://awesome.ecosyste.ms/projects/paragroup.github.io%2FWindFlow%2F","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/paragroup.github.io%2FWindFlow%2F/lists"}