{"id":22823163,"url":"https://github.com/okeuday/cpg","last_synced_at":"2025-04-09T10:07:44.715Z","repository":{"id":3909947,"uuid":"4998751","full_name":"okeuday/cpg","owner":"okeuday","description":"CloudI Process Groups","archived":false,"fork":false,"pushed_at":"2023-11-29T22:05:17.000Z","size":616,"stargazers_count":91,"open_issues_count":0,"forks_count":19,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-02T02:40:40.355Z","etag":null,"topics":["crdt","data-structures","erlang","erlang-process-pool"],"latest_commit_sha":null,"homepage":null,"language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/okeuday.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-07-12T04:00:00.000Z","updated_at":"2024-12-11T22:01:38.000Z","dependencies_parsed_at":"2025-01-02T19:15:47.594Z","dependency_job_id":null,"html_url":"https://github.com/okeuday/cpg","commit_stats":{"total_commits":203,"total_committers":2,"mean_commits":101.5,"dds":"0.0049261083743842304","last_synced_commit":"e3ed9117ee87bb49b3d4480e8406f2f3918dcd71"},"previous_names":[],"tags_count":40,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/okeuday%2Fcpg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/okeuday%2Fcpg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/okeuday%2Fcpg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/okeuday%2Fcpg/manifests","owner_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/owners/okeuday","download_url":"https://codeload.github.com/okeuday/cpg/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248018060,"owners_count":21034048,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crdt","data-structures","erlang","erlang-process-pool"],"created_at":"2024-12-12T16:14:39.998Z","updated_at":"2025-04-09T10:07:44.683Z","avatar_url":"https://github.com/okeuday.png","language":"Erlang","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CPG ([CloudI](https://cloudi.org) Process Groups)\n\n[![Build Status](https://app.travis-ci.com/okeuday/cpg.svg?branch=master)](https://app.travis-ci.com/okeuday/cpg)\n\n## Purpose\n\ncpg provides a process group interface that is focused on\navailability and partition tolerance (in the CAP theorem).\nThe pg process group implementation added in Erlang/OTP 23 by\nWhatsApp Inc. (Meta Platforms Inc. / Facebook Inc.) is based on cpg.\nThe cpg interface is compatible with pg2 (removed in Erlang/OTP 24).\n\n## Features (Compare and Contrast)\n\n### cpg\n\n* By default, cpg utilizes Erlang strings for group names (list of integers) and provides the ability to set a pattern string as a group name.  A pattern string is a string that includes the `\"*\"` or `\"?\"` wildcard characters (equivalent to a \".+\" regex while `\"**\"`, `\"??\"`, `\"*?\"`, and `\"?*\"` are forbidden).  
When a group name is a pattern string, a process can be retrieved by matching the pattern (more information at the [CloudI FAQ](https://cloudi.org/faq.html#4_URLregex)).  To not use this approach for group names, refer to the [Usage](#usage) section below.\n* cpg provides its internal state for usage in separate Erlang processes as cached data with the `cpg_data` module.  That approach is more efficient than usage of ets.\n* Each cpg scope is an atom used as a locally registered process name for the cpg scope Erlang process.  Separate cpg scopes may be used to keep group memberships entirely separate.\n* cpg data lookups are done based on the Erlang process being local or remote, or the relative age of the local membership to the group, or with random selection (using the terminology `closest`, `furthest`, `random`, `local`, `remote`, `oldest`, `newest`).  `closest` prefers local processes if they are present while `furthest` prefers remote processes if they are present.  The `oldest` process in a group is naturally the most stable process.\n* cpg provides an interface for `via` process registry use (examples are provided in the [tests](https://github.com/okeuday/cpg/blob/master/test/cpg_tests.erl)).\n* cpg [supports](#usage) hidden node connections (hidden node connections are a way to avoid distributed Erlang node connection limitations by not creating a fully-connected network topology).\n\n### pg (\u003e= Erlang/OTP 23) (https://github.com/max-au/spg)\n\n* pg uses one monitor per remote node (it takes longer to update a group after an Erlang process dies and may never remove remote group members).\n* pg uses ets while cpg does not (cpg instead provides cached data for more efficient access to the process group data).\n\n### pg2 (\u003c Erlang/OTP 24)\n\n* pg2 uses global:trans/2 which is unable to handle network or node failures.\n* pg2 uses ets while cpg does not (cpg instead provides cached data for more efficient access to the process group data).\n\n### 
[gproc](https://github.com/uwiger/gproc/) / [syn](https://github.com/ostinelli/syn)\n\n* Both are focused on consistency with leader election and are unable to be available when suffering network or node failures.  Failures can cause unpredictable conflict resolution, in an attempt to achieve consistency.\n\n## Design\n\ncpg is a Commutative/Convergent Replicated Data-Type (CRDT) that uses\nnode ownership of Erlang processes to ensure a set of keys has\nadd and remove operations that commute with an internal map data structure.\nThe cpg module provides add and remove operations with the function names\njoin and leave, that may only be called on the node that owns the\nErlang process which is the value for the join or leave operation.\nThe key is the process group name which represents a list of Erlang processes\n(with a single Erlang process being able to be added or removed any\nnumber of times).\n\nAll cpg join and leave operations change global state as a\nCommutative Replicated Data-Type (CmRDT) by sending the operation to the\nassociated cpg Erlang process as a distributed Erlang message to all remote\nnodes after the operation successfully completes on the local node.\n\ncpg also uses distributed Erlang node monitoring to handle netsplits as a\nConvergent Replicated Data-Type (CvRDT) by sending all of the internal\ncpg state to remote nodes that have recently connected.  
The associated\ncpg Erlang process on the remote node then performs a merge operation to\nmake sure the count of each Erlang pid is consistent with the internal\ncpg state it received.\n\nThe CRDT functionality in cpg is most similar to the\n[POLog (Partially Ordered Log of operations)](#references)\nthough the cpg approach would instead be called an\n\"Ordered Log of operations\" because it is depending on Erlang messaging on\na local node to have causal ordering (no vclocks are necessary to establish\ncausality on the local node with the cpg scope Erlang process message queue\nproviding an \"Ordered Log of operations\").  After a cpg operation\ncompletes successfully on the local node, it is sent to all remote nodes which\nact as read-only views of the local node.\n\nThe cpg scope process on the local node enforces causality by existing as the\nonly read/write store of the local process memberships (i.e., serialized\nmutability similar to a mutex lock) while the remote nodes obtain the\nprocess memberships as soon as possible.  
If a remote node is down due to a\nnetsplit, it will obtain the local node's state once it reconnects as\ndescribed above.\n\n## Build\n\n    rebar get-deps\n    rebar compile\n\n## Usage\n\nIf you need non-string (not a list of integers) group names,\nset the cpg application `group_storage` env value to a module name that\nprovides a dict module interface\n(e.g., use `dict` or [`mapsd`](https://github.com/okeuday/mapsd)).\n\nNode names that have a prefix of `NODE_script_process`\n(where NODE is the current node name) are automatically ignored\nbecause they are assumed to be release scripts (e.g., nodetool).\nTo process hidden node membership data, set the cpg application `node_type`\nenv value to `all` (instead of `visible`).\n\n## Example\n\n    $ erl -sname cpg@localhost -pz ebin/ -pz deps/*/ebin/\n\n    (cpg@localhost)1\u003e reltool_util:application_start(cpg).\n    ok\n    (cpg@localhost)2\u003e cpg:join(groups_scope1, \"Hello\", self()).\n    ok\n    (cpg@localhost)3\u003e cpg:join(groups_scope1, \"World!\", self()).\n    ok\n    (cpg@localhost)4\u003e cpg:get_local_members(groups_scope1, \"Hello\").\n    {ok,\"Hello\",[\u003c0.39.0\u003e]}\n    (cpg@localhost)5\u003e cpg:get_local_members(groups_scope1, \"World!\").\n    {ok,\"World!\",[\u003c0.39.0\u003e]}\n    (cpg@localhost)6\u003e cpg:which_groups(groups_scope1).\n    [\"Hello\",\"World!\"]\n    (cpg@localhost)7\u003e cpg:which_groups(groups_scope2).\n    []\n\nWhat does this example mean?  The cpg interface allows you to define groups of\nErlang processes and each group exists within a scope.  A scope is represented\nas an atom which is used to locally register a cpg Erlang process using\n`start_link/1`.  For a given cpg scope, any Erlang process can join or leave\na group.  The group name is a string (list of integers) due to the default\nusage of the trie data structure, but that can be changed\n(see the [Usage](#usage) section above).  
If the scope is not specified, the default\nscope is used: `cpg_default_scope`.\n\nIn the example, both the process group \"Hello\" and the process group \"World!\"\nare created within the `groups_scope1` scope.  Within both process groups,\na single Erlang process is added once.  If more scopes were required, they\ncould be created automatically by being provided within the cpg application\nscope list.  There is no restriction on the number of process groups that\ncan be created within a scope, and there is nothing limiting the number\nof Erlang processes that can be added to a single group.  A single Erlang\nprocess can be added to a single process group in a single scope multiple times\nto change the probability of returning a particular Erlang process, when\nonly a single process is requested from the cpg interface (e.g., from\nthe `get_closest_pid` function).\n\n## Tests\n\n    rebar get-deps\n    rebar compile\n    ERL_LIBS=\"/path/to/proper\" rebar eunit\n\n## Author\n\nMichael Truog (mjtruog at protonmail dot com)\n\n## License\n\nMIT License\n\n## References\n\n1. Carlos Baquero, Paulo Sérgio Almeida, Ali Shoker.  Making operation-based CRDTs operation-based. In Proceedings of the First Workshop on Principles and Practice of Eventual Consistency, page 7. ACM, 2014. [http://haslab.uminho.pt/ashoker/files/opbaseddais14.pdf](http://haslab.uminho.pt/ashoker/files/opbaseddais14.pdf)\n1. Carlos Baquero, Paulo Sérgio Almeida, Ali Shoker.  Pure Operation-Based Replicated Data Types. 2017. [https://arxiv.org/abs/1710.04469](https://arxiv.org/abs/1710.04469)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fokeuday%2Fcpg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fokeuday%2Fcpg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fokeuday%2Fcpg/lists"}