{"id":15941124,"url":"https://github.com/hvass-labs/parallel-pipelines","last_synced_at":"2025-04-03T21:15:00.355Z","repository":{"id":129938075,"uuid":"536091748","full_name":"Hvass-Labs/Parallel-Pipelines","owner":"Hvass-Labs","description":"Convert serial computations into parallel pipelines","archived":false,"fork":false,"pushed_at":"2022-10-02T10:09:49.000Z","size":155,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-09T09:12:47.681Z","etag":null,"topics":["audio","cpp","parallel-computing"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Hvass-Labs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-13T11:24:51.000Z","updated_at":"2023-04-26T20:55:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"8f769473-946a-427b-b644-5a7c7f595ba7","html_url":"https://github.com/Hvass-Labs/Parallel-Pipelines","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hvass-Labs%2FParallel-Pipelines","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hvass-Labs%2FParallel-Pipelines/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hvass-Labs%2FParallel-Pipelines/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Hvass-Labs%2FParallel-Pipelines/manifests","owner_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/owners/Hvass-Labs","download_url":"https://codeload.github.com/Hvass-Labs/Parallel-Pipelines/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247078863,"owners_count":20879952,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio","cpp","parallel-computing"],"created_at":"2024-10-07T07:02:26.952Z","updated_at":"2025-04-03T21:15:00.333Z","avatar_url":"https://github.com/Hvass-Labs.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Parallel Pipelines for Streaming Data\n\n[Original repository on GitHub](https://github.com/Hvass-Labs/Parallel-Pipelines)\n\nOriginal author is [Magnus Erik Hvass Pedersen](http://www.hvass-labs.org)\n\n\n## Introduction\n\nThis is a demonstration using C++ source-code of a little-known method for parallelizing the computation of serially dependent functions on streaming data, which is a particular kind of Parallel Pipeline. It can be used to turn a serial computation into a parallel computation whenever you have serially chained or nested functions that are working on streaming data.\n\nFor example, in audio processing this can be used to make audio effects that are connected in series instead run in parallel on multiple CPU cores. 
This can be used to greatly improve the multi-core CPU efficiency of Digital Audio Workstations (DAWs).\n\nAll of this is explained in more detail in the [paper](https://github.com/Hvass-Labs/Parallel-Pipelines/raw/main/pedersen2022parallel-pipelines.pdf).\n\n\n## Example\n\nConsider the expression `y[i] = G(F(x[i]))` where `F` and `G` are some functions, and `x[i]` is the input data and `y[i]` is the output data for index `i`. Because the output of the function `F` is used as the input to the function `G`, and because the computations must be made in the correct order on the index `i`, the two functions are said to be serially dependent and it may seem impossible to calculate them in parallel.\n\nBut when `x[i]` is just one element in a stream of data, e.g. with `i` going from `0` to some value `n-1`, then we can parallelize the nested computation of the functions `F` and `G` by first computing `F(x[i])` in one thread and saving the result in a variable called `F_buffer`, and in the second thread we then use the buffer that was written in the previous iteration to compute `y[i-1] = G(F_buffer)`. Once the two threads are both finished with their computations, we update `F_buffer` with the new result from the function `F`. This can be written as simplified pseudo-code:\n\n    for (int i=0; i\u003cn+1; i++)\n    {\n        Use thread 1 to calculate F(x[i]);\n        Use thread 2 to calculate G(F_buffer);\n        Wait for both threads to finish;\n        Get final result from thread 2: y[i-1] = G(F_buffer);\n        Update buffer with result from thread 1: F_buffer = F(x[i]);\n    }\n\nThe only drawback of doing the computation as a Parallel Pipeline is that it needs 1 extra iteration to finish the entire stream of input-data, because the output for iteration `i` is `y[i-1]` due to the buffering. If this is being used in a real-time application, then it would mean an extra iteration of latency / delay before the final result is available, which may be undesirable. 
If the input data `x[i]` is actually an array or block of numbers, as is the case with audio processing, then the latency can be reduced by using mini-blocks instead, as explained in the [paper](https://github.com/Hvass-Labs/Parallel-Pipelines/raw/main/pedersen2022parallel-pipelines.pdf).\n\n\n## Source-Code in C++\n\nThe source-code in C++ gives 4 examples of different expressions that can be calculated in a Parallel Pipeline. From these examples you can understand the underlying idea of buffering the output of each thread, and create a Parallel Pipeline for your own particular problem.\n\nFor simplicity, these examples use \"dummy\" functions with strings as input and output data. The dummy functions \"sleep\" their execution thread for 100 msec to simulate heavy processing. The summation `+` operator does not \"sleep\" the thread in these examples.\n\n- `main1.cpp` shows how to calculate `y[i] = G(F(x[i]))` using 2 parallel threads.\n- `main2.cpp` shows how to calculate `y[i] = H(G(F(x[i])))` using 3 parallel threads.\n- `main3.cpp` shows how to calculate `y[i] = F(x[i]) + G(F(x[i]))` using 2 parallel threads.\n- `main4.cpp` shows how to calculate `y[i] = H(F(x[i]) + G(z[i]))` using 3 parallel threads.\n\n\n## How To Run\n\nThe easiest way to download and install this is by using `git` from the command-line:\n\n    git clone https://github.com/Hvass-Labs/Parallel-Pipelines.git\n\nThis creates the directory `Parallel-Pipelines` and downloads all the files to it.\n\nYou can also [download](https://github.com/Hvass-Labs/Parallel-Pipelines/archive/refs/heads/main.zip) the contents of the GitHub repository as a Zip-file and extract it manually.\n\nTo build the C++ source-code into executable files, run the following commands in a shell. 
This was implemented on Linux and uses the `g++` compiler and `make` tools:\n\n    cd Parallel-Pipelines\n    make -B\n\nThis should have created four executable files named `main1` to `main4` which can be run as follows:\n\n    ./main1\n\nThis prints the following output to the screen. Note that the parallel execution has finished in nearly half the time of the serial execution, except for approximately 100 msec, which is the extra latency of one iteration, as described above. This can also be seen by the `G(--)` in the first iteration and `F(--)` in the last iteration, which means that the functions are called with empty input at the start and end of the input stream.\n\n    Serial:\n    Step 0:  Thread 1: G(F(x_0))\n    Step 1:  Thread 1: G(F(x_1))\n    Step 2:  Thread 1: G(F(x_2))\n    Step 3:  Thread 1: G(F(x_3))\n    Step 4:  Thread 1: G(F(x_4))\n    Step 5:  Thread 1: G(F(x_5))\n    Step 6:  Thread 1: G(F(x_6))\n    Step 7:  Thread 1: G(F(x_7))\n    Step 8:  Thread 1: G(F(x_8))\n    Step 9:  Thread 1: G(F(x_9))\n    Elapsed time: 2005.375444ms\n\n    Parallel:\n    Step 0:  Thread 1: F(x_0)  Thread 2: G(--)\n    Step 1:  Thread 1: F(x_1)  Thread 2: G(F(x_0))\n    Step 2:  Thread 1: F(x_2)  Thread 2: G(F(x_1))\n    Step 3:  Thread 1: F(x_3)  Thread 2: G(F(x_2))\n    Step 4:  Thread 1: F(x_4)  Thread 2: G(F(x_3))\n    Step 5:  Thread 1: F(x_5)  Thread 2: G(F(x_4))\n    Step 6:  Thread 1: F(x_6)  Thread 2: G(F(x_5))\n    Step 7:  Thread 1: F(x_7)  Thread 2: G(F(x_6))\n    Step 8:  Thread 1: F(x_8)  Thread 2: G(F(x_7))\n    Step 9:  Thread 1: F(x_9)  Thread 2: G(F(x_8))\n    Step 10:  Thread 1: F(--)  Thread 2: G(F(x_9))\n    Elapsed time: 1107.676666ms\n\n\n## License (MIT)\n\nThis is published under the [MIT License](https://github.com/Hvass-Labs/Parallel-Pipelines/blob/main/LICENSE) which allows very broad use for both academic and commercial purposes.\n\nYou are very welcome to modify and use this source-code in your own project. 
Please keep a link to the [original repository](https://github.com/Hvass-Labs/Parallel-Pipelines).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhvass-labs%2Fparallel-pipelines","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhvass-labs%2Fparallel-pipelines","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhvass-labs%2Fparallel-pipelines/lists"}