{"id":28563690,"url":"https://github.com/googlielmo/fastpuss","last_synced_at":"2025-10-19T21:33:01.400Z","repository":{"id":184863403,"uuid":"659799928","full_name":"googlielmo/fastpuss","owner":"googlielmo","description":"A proof of concept for a fast pub-sub system that can scale to millions of topics and subscribers.","archived":false,"fork":false,"pushed_at":"2024-10-23T07:02:54.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-03T10:38:48.590Z","etag":null,"topics":["distributed-systems","proof-of-concept","pubsub","scalable"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/googlielmo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-28T15:28:34.000Z","updated_at":"2023-08-28T21:16:02.000Z","dependencies_parsed_at":"2024-10-23T07:00:18.246Z","dependency_job_id":null,"html_url":"https://github.com/googlielmo/fastpuss","commit_stats":null,"previous_names":["googlielmo/fastpuss"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlielmo%2Ffastpuss","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlielmo%2Ffastpuss/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlielmo%2Ffastpuss/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlielmo%2Ffastpuss/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/googlielmo","download_url":"https://codeload.github.com/googlielmo/fastpuss/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlielmo%2Ffastpuss/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259081038,"owners_count":22802404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed-systems","proof-of-concept","pubsub","scalable"],"created_at":"2025-06-10T13:07:44.546Z","updated_at":"2025-10-19T21:32:56.377Z","avatar_url":"https://github.com/googlielmo.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FASTPUSS (🐾)\n\nA proof of concept for a fast pub-sub system that can scale to millions of topics and subscribers.\n\n## Theory of operation\n\nA Fastpuss (🐾) system consists of one or more message brokers and an unlimited number of clients.\n\nClients communicate with brokers over UDP.\n\nIn a distributed configuration (see below), brokers communicate with one another over TCP.\n\nEach client can subscribe to any number of topics by sending a subscription request to a broker.\n\nTopics are completely _dynamic_, meaning that they don't have to be instantiated beforehand, and _ephemeral_, meaning\nthey exist only as long as they are in use by one or more clients.\n\nA client can publish a message to a specific topic by sending it to a broker. One-shot publishing to more than one topic\nis not supported. 
Upon receiving the message, the broker will forward the message contents to the subscribed clients for\nthat topic.\n\n## LICENSE\n\n\n\n## PoC implementation\n\nSubscription data mapping topics to client IDs is implemented within\nthe [ThreadSafeSubscriberManager](src/main/java/io/github/googlielmo/fastpuss/ThreadSafeSubscriberManager.java) class.\n\nThe single-node [MessageBroker](src/main/java/io/github/googlielmo/fastpuss/MessageBroker.java),\n[MessagePublisher](src/main/java/io/github/googlielmo/fastpuss/MessagePublisher.java), and\n[MessageSubscriber](src/main/java/io/github/googlielmo/fastpuss/MessageSubscriber.java)\nclasses use this subscriber manager.\n\nAs a proof of concept, I implemented a [LocalRunner](src/main/java/io/github/googlielmo/fastpuss/LocalRunner.java)\nclass that runs a few hundred threads: one broker, some publishers and a number of subscribers.\nThis proved useful for manual testing and debugging.\n\nYou can run it directly with Maven:\n\n  ```shell\n  mvn compile exec:java -Dexec.mainClass=io.github.googlielmo.fastpuss.LocalRunner\n  ```\n\n### Design decisions\n\nFor the Subscriber Manager:\n\n- A Map holds the subscription state: the keys are topic filters (strings) and the values are Collections of client IDs.\n- Client IDs are strings in the form \"/host:port\".\n- A `ConcurrentHashMap` is used to allow concurrent, thread-safe operations.\n- The collection type chosen to hold client IDs is `ConcurrentLinkedDeque`, which offers concurrent, thread-safe\n  operations with some useful properties, such as _weakly consistent_ iterators, which allow iterating over the\n  collection itself even in the face of concurrent modification. This avoids expensive copy operations\n  of (potentially) millions of elements to an immutable temporary copy, which would be necessary to avoid concurrent\n  modification exceptions, had we used a regular collection. 
One caveat of this implementation is that `size()`\n  is NOT a constant-time operation and may return inaccurate results; as a result, we couldn't check the collection size in\n  tests.\n\nFor the Broker:\n\n- UDP is used as the transport protocol for the individual messages.\n- Wildcards will be implemented in the future; at the moment, topic filters are in fact just topic names.\n- Topic names cannot contain spaces, so that parsing messages becomes trivial (see [Message format](#message-format)\n  below).\n\n### Message format\n\nThe messages are UTF-8 strings in the format specified informally by the following EBNF grammar:\n\n```ebnf\nmessage     = verb, S, topic, S, body ;\nverb        = 'PUB' | 'SUB' | 'MSG' ;\nS           = { white space } ;\ntopic       = ? a valid, non-empty sequence of utf-8 characters, excluding white space ? ;\nbody        = ? a valid, possibly empty sequence of utf-8 characters, including white space ? ;\nwhite space = ? white space as per regexp /\\s/ ? ;\n```\n\n#### Message types\n\nMessages are distinguished by their _verb_.\nThe following verbs are used in messages sent by clients to the broker:\n\n- **PUB** publish a message to a topic. E.g.\n  ```\n  PUB topic1\n  41.8874314503 12.4886930452\n  ```\n- **SUB** subscribe to a topic. E.g. (please note the LF after the topic name)\n  ```\n  SUB topic1\n    \n  ```\n\nA different verb is used in messages sent by the broker to the clients:\n\n- **MSG** represents a message published to a topic. 
E.g.\n  ```\n  MSG topic1\n  41.8874314503 12.4886930452\n  ```\n\n## Distributed broker implementation (Work in progress)\n\nThe [DistributedBroker](src/main/java/io/github/googlielmo/fastpuss/DistributedBroker.java) class implements a\ndistributed message broker as an extension of the standalone broker.\n\n### Design decisions\n\nMy assumptions for the distributed broker are:\n\n- _Shared-nothing architecture_ where each node contains a copy of all the data.\n- The data itself coincides with the state of the topic subscriptions.\n- The number of nodes is fixed and their addresses are constant; they are known beforehand to each client and broker\n  node via configuration.\n- Each client can talk to any broker node via UDP, using the same protocol detailed above. In case of a network failure\n  detected when sending a PUB or SUB message to a broker node, it is up to the client to switch to another node and\n  retry.\n- In case of a node crash, the node will eventually restart via some external mechanism.\n- The broker nodes keep in sync with one another by communicating via TCP.\n- As soon as a node starts, possibly after a crash, it picks one of the other nodes randomly and asks for a fail-over\n  initial (\"massive\") data transfer in \"pull\" fashion.\n- Whenever a client subscribes to a topic, it will do so by sending a SUB message to one of the nodes as described for\n  the non-distributed PoC. 
After updating its internal state, the broker will update all the other nodes, in \"push\"\n  fashion.\n\nWhen _node1_ connects to _node2_, it sends a command string, either \"PULL\" or \"PUSH\", followed by a newline.\nOn receiving the command string, _node2_ does the following:\n\n- On `PULL`, it transmits a full copy of its data to the requesting node.\n- On `PUSH`, it receives one subscription and updates its state accordingly.\n\nThe data sync protocol itself is very simple.\nIt consists of lines of UTF-8 text in the form:\n\n    topic-name\n    client-id-1\n    client-id-2\n    ...\n    client-id-n\n    \u003cempty line\u003e\n\nAn empty line signals that the next non-empty line starts a new topic with the list of client IDs.\n\n### Traffic sizing\n\nSuppose we have a cluster of _n_ broker nodes.\n\nA SUB message sent from a client to one broker node _N(i)_ will use:\n\n- 1 UDP datagram from client to _N(i)_\n- _n - 1_ TCP PUSH messages, each carrying a state update (topic + client ID)\n\nNow suppose that for a particular topic, we have _k_ (with _k_ \u003e\u003e _n_) subscribed clients.\nA PUB message sent for that topic from a client to one broker node _N(i)_ will use:\n\n- 1 UDP datagram from client to _N(i)_\n- _k_ UDP datagrams from _N(i)_ to the subscribed clients\n\nA node fail-over in the conditions above will then transfer the full state, which we can assume to be _O_( _k_ * _t_ )\nwhere _t_ is the maximum number of topics any client is subscribed to and _k_ is the maximum number of clients subscribed to any\ntopic.\n\n## Possible extensions\n\nData partitioning:\n\n- Each node could in principle hold only a partition for _1/n_ of the topics.\n- For redundancy, each node should have at least 2 such partitions: a primary and a secondary for backup of another\n  node. 
Which primary and secondary partitions are associated with a specific topic can be determined by a hash function\n  which would give an integer that is then reduced modulo _n_ to the node holding the primary. The next node (mod _n_)\n  could be the secondary.\n- The application protocol between pairs of nodes would change, as only the secondary partitions should be updated when a\n  new subscription is made.\n- The client itself should change in order to contact exactly the node with the primary partition for a topic, not \"any\"\n  node.\n- In case of failure of one node, it is up to the clients to keep the secondary partition up to date.\n- When a node is restarted, it shall retrieve its primary partition from the secondary of another node, plus its own\n  secondary from the primary of another node.\n\nDynamic addition and removal of nodes:\n\n- The configuration is no longer immutable; instead, there is a static startup config and a dynamic one that overrides\n  the former.\n- Possibly one or more node discovery techniques can be employed to autoconfigure a new node based on the particular\n  environment constraints (LAN, specific cloud such as AWS, Kubernetes, etc.).\n- The partitioning (if implemented) needs to be independent of the number of nodes. This can be achieved by creating a\n  high number of partitions which are then assigned to the current cluster nodes via a consistent hash function.\n- A rebalancing operation should happen when the number of active nodes changes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgooglielmo%2Ffastpuss","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgooglielmo%2Ffastpuss","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgooglielmo%2Ffastpuss/lists"}