{"id":19881551,"url":"https://github.com/chifisource/parametricprocesses.jl","last_synced_at":"2025-03-01T02:47:01.508Z","repository":{"id":216646225,"uuid":"740532171","full_name":"ChifiSource/ParametricProcesses.jl","owner":"ChifiSource","description":"manage different types of processes using declarative syntax","archived":false,"fork":false,"pushed_at":"2024-12-25T15:45:16.000Z","size":84,"stargazers_count":7,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-25T18:52:27.706Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Julia","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ChifiSource.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"github":["emmaccode","UnformalPenguin"]}},"created_at":"2024-01-08T14:37:02.000Z","updated_at":"2025-02-05T11:58:54.000Z","dependencies_parsed_at":"2024-01-11T16:22:44.852Z","dependency_job_id":"f46400c6-49d4-4414-8c47-05a6d51d0668","html_url":"https://github.com/ChifiSource/ParametricProcesses.jl","commit_stats":null,"previous_names":["chifisource/parametricprocesses.jl"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FParametricProcesses.jl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FParametricProcesses.jl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ChifiSource%2FParametricProcesses.jl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/ChifiSource%2FParametricProcesses.jl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ChifiSource","download_url":"https://codeload.github.com/ChifiSource/ParametricProcesses.jl/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241309104,"owners_count":19941725,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T17:14:35.928Z","updated_at":"2025-03-01T02:47:01.483Z","avatar_url":"https://github.com/ChifiSource.png","language":"Julia","funding_links":["https://github.com/sponsors/emmaccode","https://github.com/sponsors/UnformalPenguin"],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/ChifiSource/image_dump/blob/main/parametricprocesses/parproc.png\" width=\"375\"\u003e\u003c/img\u003e\n\n  [![version](https://juliahub.com/docs/General/ParametricProcesses/stable/version.svg)](https://juliahub.com/ui/Packages/General/ParametricProcesses)\n  \n\u003c/div\u003e\n\n\n`ParametricProcesses` offers a parametric `Worker` type and a `ProcessManager` API capable of facilitating multiple forms of parallel processing and high-level declarative `Distributed` worker management.\n```julia\nusing Pkg; Pkg.add(\"ParametricProcesses\")\n# Unstable:\nusing Pkg; Pkg.add(name = \"ParametricProcesses\", rev = \"Unstable\")\n```\n---\n- [usage](#usage)\n  -  [workers](#workers)\n  -  [jobs](#jobs)\n-  [examples](#examples)\n- [contributing](#contributing)\n  - [adding workers](#adding-workers)\n  - [contributing 
guidelines](#guidelines)\n### usage\nBefore trying to use threaded `Workers` (`Workers{Threaded}`), make sure to start **julia with multiple threads**!\n```julia\njulia --threads 6\n```\n- For a **full** list of exports, try `?ParametricProcesses`\n```julia\nusing ParametricProcesses\nprocs = processes(5)\nx = 5\nfirstjob = new_job(x) do x::Int64\n   for n in 1:x\n       println(\"hello\")\n       sleep(2)\n   end \nend\nsecondjob = new_job(x) do x::Int64\n   sleep(1)\n   for n in 1:x\n       println(\"world\")\n       sleep(2)\n   end \nend\n\ndistribute!(procs, firstjob, secondjob)\n```\n```julia\njulia\u003e distribute!(procs, firstjob, secondjob)\n2-element Vector{Int64}:\n 7\n 8\n\njulia\u003e       From worker 7:\thello\n      From worker 8:\tworld\n      From worker 7:\thello\n      From worker 8:\tworld\n      From worker 7:\thello\n      From worker 8:\tworld\n      From worker 7:\thello\n      From worker 8:\tworld\n      From worker 7:\thello\n      From worker 8:\tworld\n\n```\n##### workers\nThe *typical* `ParametricProcesses` workflow involves creating a process manager with workers, then creating jobs and distributing them amongst those workers using `assign!` and `distribute!`.  To get started, we can create a `ProcessManager` by using the `processes` Function. This `Function` will take an `Int64` and optionally, a `Process` type. 
The default process type will be `Threaded`, so ensure you have multiple threads for the following example:\n```julia\nprocs = processes(5)\n```\nWe can create a process manager with workers of any type using this same `Function`, `processes`.\n```julia\nasync_procs = processes(2, Async)\n```\n`Workers` are held in the `ProcessManager.workers` field. We can also add workers directly with the `add_workers!` function, or create workers manually and `push!` them.\n```julia\njulia\u003e add_workers!(pm, 1, Threaded, \"emma the worker \u003c3\")\n2 |Threaded process: emma the worker \u003c3 (inactive)\n\njulia\u003e w = Worker{Async}(\"steve the worker\", 20)\n20 |Async process: steve the worker (inactive)\n\n\njulia\u003e push!(pm, w)\n2 |Threaded process: emma the worker \u003c3 (inactive)\n20 |Async process: steve the worker (inactive)\n\n\njoin(\"$(w.name)\\n\" for w in pm.workers)\n\"emma the worker \u003c3\nsteve the worker\n\"\n\n```\n`Workers` can be indexed by their name or their pid.\n```julia\njulia\u003e pm[\"steve the worker\"]\n20 |Async process: steve the worker (inactive)\n\n\njulia\u003e pm[2]\n2 |Threaded process: emma the worker \u003c3 (inactive)\n\n\n```\nHere is a list of other functions used to manage workers.\n- `close(pm::ProcessManager)` - closes **all** active `Workers` in `pm`.\n- `delete!(pm::ProcessManager, pid::Int64)` - closes `Worker` by `pid`.\n- `delete!(pm::ProcessManager, name::String)` - closes `Worker` by `name`.\n- `worker_pids(pm::ProcessManager)`  - returns worker process identifiers for all `Workers` in `pm.workers`.\n- `waitfor(pm::ProcessManager, pids::Any ...)` - waits for `pids` to finish, then returns their return values in a `Vector{Any}`.\n- `put!(pm::ProcessManager, pids::Vector{Int64}, vals ...)` - serializes data and defines it in the `Main` of each process in `pids`.\n\nThere is also `@everywhere`, used to define functions and modules across all workers, as well as `@distribute` to use all available workers for 
iteration.\n```julia\n@time @distribute for x in 1:5\n    sleep(3)\nend\n@time for x in 1:5\n    sleep(3)\nend\n```\n`@everywhere` is the more important of the two. `put!` can be used to transmit data, but this will not work for functions or modules -- **`@everywhere` must be used for this, after the workers are open.**\n```julia\nusing ParametricProcesses\n\n# make workers first\npm = processes(2)\n\n# using a `Module`\n@everywhere using JSON\n\n# using a `Function`\n@everywhere function sample()\n    println(\"sample\")\nend\n\njbs = (new_job(JSON.parse, \"{\\\"x\\\":5}\"), new_job(sample))\n\n\npids = distribute!(pm, jbs ...)\n# -- v output\nFrom worker 3:\tsample\n2-element Vector{Int64}:\n 2\n 3\n# --\n\nrets = waitfor(pm, pids ...)\nprintln(\"x is $(rets[1][\"x\"])\")\n# - v output\nx is 5\n```\n- For a **full** list of exports, try `?ParametricProcesses`\n##### jobs\nIn order to use our threads to complete tasks, we will need to construct a sub-type of `AbstractJob`. The running type for this is `ProcessJob`, which may be called from the `new_job` binding. We provide this with a `Function` that takes arguments, as well as the arguments we seek to provide to that `Function` (if any).\n```julia\nnew_job(f::Function, args ...; keyargs ...)\n```\n```julia\nmyjob = new_job(readdir, \".\")\n```\nFrom here, we have access to the following functions to distribute our jobs amongst our `Workers`.\n```julia\ndistribute!\nassign!\nassign_open!\ndistribute_open!\n```\n`waitfor` is used to wait for certain workers to finish their tasks, getting their returns as they complete.  
\nConsider the following `waitfor` example:\n```julia\npm = processes(4)\n\njb = new_job() do \n    sleep(10)\n    @info \"hello world!\"\n    return 55\nend\n\nassign!(pm, 2, jb)\n\nret = waitfor(pm, 2); println(\"worker 2 completed, it returned: \", ret[1])\n\n# From worker 2:\t[ Info: hello world!\n# worker 2 completed, it returned: 55\n```\nFeasibly, you can pass the `ProcessManager` to all workers and manage processes from different workers by using `@everywhere`.\n### examples\n###### css property parsing\nThis simple example shows how jobs (which ideally would be more CPU-intensive and less memory-intensive than this) can easily be distributed amongst workers -- especially for simple `Function` calls like `parse_props` below:\n```julia\nusing ParametricProcesses\nusing Test\nprocs = processes(2)\n\n@everywhere function parse_props(s::String)\n    propkeys = split(s, \";\")\n    filter!(t -\u003e ~(isnothing(t)), [begin \n        splts = split(kp, \":\")\n        if length(splts) \u003c 2\n            nothing\n        else\n            splts[1] =\u003e splts[2]\n        end\n    end for kp in propkeys])\nend\n\nfirstset = join(\"$(rand(500:5000)):$(rand(500:5000));\" for n in 1:5000)\nsecondset = join(\"$(rand(500:5000)):$(rand(500:5000));\" for n in 1:50000)\nthirdset = join(\"$(rand(500:5000)):$(rand(500:5000));\" for n in 1:5000)\nfourthset = join(\"$(rand(500:5000)):$(rand(500:5000));\" for n in 1:100000)\nfifthset = join(\"$(rand(500:5000)):$(rand(500:5000));\" for n in 1:50000)\nsets = (firstset, secondset, thirdset, fourthset, fifthset)\nret = vcat([parse_props(set) for set in sets] ...)\njbs = (new_job(parse_props, set) for set in sets)\nids = distribute!(procs, worker_pids(procs), jbs ...)\nmret = vcat(waitfor(procs, ids ...) ...)\n@test length(ret) == length(mret)\n```\nIn the above example, `distribute!` is used to spread the 5 jobs across the workers in `procs` instead of running them all on one thread. 
While this does not necessarily offer a huge benefit to performance, as parsing CSS is pretty simple and it is more CPU work to serialize the data for the thread, this example does show pretty well how to easily replicate tasks across several workers.\n ---\n### contributing\nThere are several ways to contribute to the `ParametricProcesses` package.\n- submitting [issues](https://github.com/ChifiSource/ParametricProcesses.jl/issues) ([guidelines](#guidelines))\n- [creating `Worker` extensions](#adding-workers).\n- forking and submitting pull requests ([guidelines](#guidelines))\n- trying other [chifi](https://github.com/ChifiSource) projects.\n- contributing to other [chifi](https://github.com/ChifiSource) projects (gives more attention here).\n##### adding workers\nAdding your own `Workers` is pretty straightforward. We can create new functionality by creating a new \u003c: `Process` or a new \u003c: `AbstractWorker`. A `Process` is used to change the functionality of a `Worker`; an `AbstractWorker` extension usually means we need to facilitate different types of `Worker` data or `ProcessManager` functionality. Creating a `Process` is very simple, as a `Process` is simply an `abstract` type.\n```julia\nabstract type CUDA \u003c: Process end\n```\nFrom here, we have a few bindings which will need to be defined:\n```julia\nclose(w::Worker{Process})\ncreate_workers(n::Int64, of::Type{Process}, \n    names::Vector{String} = [\"$e\" for e in 1:n])\nassign!(assigned_worker::Worker{Process}, job::AbstractJob)\n```\nPretty simple; these are the main pieces of functionality that change when we are using different hardware -- allocating jobs, creating workers to do the jobs, and closing the workers will all be different depending on what `Process` we are using. Fortunately, a `Worker` will fit entirely into the API by simply extending these three, so with these simple functions we can easily create high-level bindings to distribute our jobs over a myriad of different worker types. 
If we wanted to create our own `Worker`, things would get a little more complicated. It is also possible to make your own sub-type of `AbstractProcessManager` or `AbstractJob` and extend that way. All of the information needed to stay consistent with these super-types is available in the documentation.\n##### guidelines\nWe are not super picky about contributions, as the goal of [chifi](https://github.com/ChifiSource) is to get more people involved in computing. However, if you want your code merged there are definitely a few things to be aware of before contributing to this package.\n- If there is no issue for what you want to do, [create an issue](https://github.com/ChifiSource/ParametricProcesses.jl)\n- If you have multiple issues, **submit multiple issues** rather than combining them all into one issue.\n- Make sure the issue you are solving or feature you want to implement is still feasible on `Unstable` -- this is the top-level development branch which represents the latest unstable changes.\n- Please format your documentation using the technique presented in the rest of the file.\n- Make sure `Pkg.test(\"ParametricProcesses\")` works with your version of `ParametricProcesses` before making a pull-request.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchifisource%2Fparametricprocesses.jl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchifisource%2Fparametricprocesses.jl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchifisource%2Fparametricprocesses.jl/lists"}