{"id":18421943,"url":"https://github.com/spcl/faaskeeper","last_synced_at":"2025-08-25T06:04:43.176Z","repository":{"id":135158736,"uuid":"276915114","full_name":"spcl/faaskeeper","owner":"spcl","description":"A fully serverless implementation of the ZooKeeper coordination protocol.","archived":false,"fork":false,"pushed_at":"2024-08-20T19:41:28.000Z","size":240,"stargazers_count":20,"open_issues_count":29,"forks_count":13,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-07T14:39:13.303Z","etag":null,"topics":["aws-lambda","faas","serverless","zookeeper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spcl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-03T14:10:34.000Z","updated_at":"2025-02-18T22:26:04.000Z","dependencies_parsed_at":"2024-05-02T17:48:58.961Z","dependency_job_id":"2033cd95-df09-4002-a5cd-e555f21e6ae4","html_url":"https://github.com/spcl/faaskeeper","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/spcl/faaskeeper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Ffaaskeeper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Ffaaskeeper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Ffaaskeeper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Ffaaskeeper/manifests","owner_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/owners/spcl","download_url":"https://codeload.github.com/spcl/faaskeeper/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spcl%2Ffaaskeeper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272013535,"owners_count":24858474,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-25T02:00:12.092Z","response_time":1107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-lambda","faas","serverless","zookeeper"],"created_at":"2024-11-06T04:27:23.812Z","updated_at":"2025-08-25T06:04:43.158Z","avatar_url":"https://github.com/spcl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FaaSKeeper\n\n**The ZooKeeper-like serverless coordination service FaaSKeeper.**\n\n[![GH Actions](https://github.com/spcl/faaskeeper/actions/workflows/build.yml/badge.svg)](https://github.com/spcl/faaskeeper/actions/workflows/build.yml)\n\nThe main implementation of the serverless FaaSKeeper service.\nAt the moment we support an AWS deployment only.\nAdding support for other commercial clouds (Azure, GCP)\nis planned in the future.\n\nTo use a deployed FaaSKeeper instance, check our Python client library: [spcl/faaskeeper-python](https://github.com/spcl/faaskeeper-python).\n\n### Paper\n\nWhen using FaaSKeeper, please cite our upcoming HPDC'24 paper. 
You can find more details about the research work [in this paper summary](https://mcopik.github.io/projects/faaskeeper/)\nand [the paper preprint on arXiv](https://arxiv.org/abs/2203.14859).\n\n```\n@inproceedings{copik2024faaskeeper,\n  author = {Copik, Marcin and Calotoiu, Alexandru and Zhou, Pengyu and Taranov, Konstantin and Hoefler, Torsten},\n  title = {FaaSKeeper: Learning from Building Serverless Services with ZooKeeper as an Example},\n  year = {2024},\n  series = {HPDC '24},\n  booktitle = {Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing}\n}\n```\n\n## Dependencies\n\n* Python \u003e= 3.7\n* Node.js **\u003c= 15.4.0**\n* Cloud credentials to deploy the service\n\nWe use the [serverless](https://www.serverless.com/) framework to manage the deployment of serverless\nfunctions and cloud resources. The framework with all dependencies and plugins is installed automatically.\nConfigure the provider credentials according to instructions provided in the serverless framework,\ne.g., [instructions for AWS](https://www.serverless.com/framework/docs/providers/aws/guide/credentials/).\n\n\nCurrently, serverless has a bug causing it to generate empty deployment packages ([bug 1](https://github.com/serverless-heaven/serverless-webpack/issues/682), [bug 2](https://github.com/serverless/serverless/issues/8794)),\nand Node.js `15.4.0` is confirmed to work - any newer version is NOT guaranteed to work.\n\n## Configuration\n\nThe FaaSKeeper architecture can be customized with the following settings:\n* **User storage** - the permitted values are *key-value* (DynamoDB on AWS), and *persistent* (S3 on AWS).\n* **Worker queue** - there are two types of *writer* and *distributor* queues - *dynamodb* using DynamoDB streams, and *sqs* using the SQS queue.\n* **Client channel** - functions deliver the client notification with a *tcp* or *sqs* channel. 
The former requires the client to accept incoming TCP connections, i.e., it needs to have a public IP.\nSee the JSON config example in `config/user_config.json` for details.\n\n## Installation \u0026 Deployment\n\nTo install the local development environment with all necessary packages, please use the `install.py`\nscript. The script takes one optional argument `--venv` with a path to the Python and Node.js virtual\nenvironment, and the default path is `python-venv`. Use `source {venv-path}/bin/activate` to activate it.\n\nThe deployment with the `serverless` framework is wrapped with a helper executable `fk.py`.\nUse the JSON config example in `config/user_config.json` to change the deployment name and parameters.\nUse the following command to deploy the service with all functions, storage, and queue services:\n\n```\n./fk.py deploy service config/user_config_final.json --provider aws --config config/user_config.json\n```\n\nThe script will generate a new version of the config in `config/user_config_final.json`, which includes\ndata needed for clients, such as the location of the S3 data bucket - its name is partially randomized to\ngenerate unique S3 bucket names.\nTo update functions, they can be redeployed separately:\n\n```\n./fk.py deploy functions --provider aws --config config/user_config.json\n```\n\nThe existing deployment can be cleared by removing the entire service before redeployment:\n\n```\n./fk.py deploy service --provider aws --clean --config config/user_config.json\n```\n\nTo enable verbose debugging of functions, set the flag `verbose` in the config.\n\n## Using CLI\n\nA CLI for FaaSKeeper is available in `bin/fkCli.py`. 
It runs an interactive FaaSKeeper session,\nand it includes history and command suggestions.\n\n```console\nbin/fkCli.py \u003cconfig-file\u003e\n[fk: aws:faaskeeper-dev(CONNECTED) session:f3c1ba70 0] create /root/test1 \"test_data\" false false\n```\n\n## Design\n\n### Storage\n\nFaaSKeeper uses two types of storage - system and user.\nFor system storage, we require strong consistency, as different writer functions must be able to safely\nmodify data in parallel.\nFor user data storage, we require strong consistency as well to prevent stale reads that would\nviolate ZooKeeper consistency principles.\n\n### Components\n\nAll resources are allocated with the help of the Serverless framework. See `config/aws.yml` for an example on the AWS platform.\n\n#### Storage\n\nWe use key-value tables to store system data and user data, and to function as queues for incoming requests. Furthermore, we allocate persistent object storage buckets to store user data as well.\n\n#### Functions\n\nWe create four basic functions: `heartbeat`, `watch`, `writer` and `distributor`. Users invoke `writer` indirectly via a queue or a table to process a write request, and this function in turn invokes `distributor` and `watch` to process updated data and new watch events. `heartbeat` is invoked periodically by the system.\n\n#### Communication\n\nWe use queues to process incoming requests and invoke functions. 
Queues must uphold FIFO ordering in FaaSKeeper.\n\n## Functionalities\n\n#### Watches\n\nTo register a watch, the client performs the following sequence of operations: read node, add\nwatch in storage, read node again.\nIf the node has not been updated, then the watch has been inserted correctly.\nThis is guaranteed by the fact that the system first updates the data, and then it sends watch notifications.\n\nIf there's a concurrent update happening in the background, then interleaving between update and watch setting can happen.\nIf the client manages to add the watch before the update, it will be notified.\nHowever, if the data is updated, then the client has no guarantee that it managed to create the watch before the `watch` function started delivering notifications - thus, watch creation failed.\nThe watch might have been created correctly, but this case is managed by adding timestamps to let the `watch` function detect when the watch is new enough that\nit shouldn't be triggered.\n\nBad interleaving - the entire watch process happens between the system writing data and starting notifications. The system should not trigger the watch and\nshould retain it for future usage.\n\n##### GetData\n\nThe watch is triggered by `set_data` and `delete` operations on the node.\nTo detect if the watch is not set on an older version of the node, the timestamp\nis compared.\n\n##### Exists\n\nThe watch is triggered by `create`, `set_data`, and `delete` calls on the node.\nTo detect if the watch is not set on an older version of the node, the timestamp\nis compared.\nWhen the node does not exist, a \"none\" timestamp is set - such a watch is\nretained when the update was `delete` and triggered when the update was `create`.\n\n##### GetChildren\n\nTriggered by `delete` on the node and `create` and `delete` on its children.\nTo detect if the watch is not set on an older version of the node, we\nuse the `children` timestamp.\n\n## Development\n\nWe use `black` and `flake8` for code linting. 
Before committing and pushing changes,\nplease run `tools/linting.py functions` to verify that there are no issues with your code.\n\nWe use Python type hints ([PEP](https://www.python.org/dev/peps/pep-0484/), [docs](https://docs.python.org/3/library/typing.html))\nto enhance the readability of our code. When adding new interfaces and types, please use type hints.\nThe linting helper `tools/linting.py` includes a call to `mypy` to check for static typing errors.\n\n## Authors\n\n* [Marcin Copik (ETH Zurich)](https://github.com/mcopik/) - main author.\n* [Pengyu Zhou (University of Toronto)](https://github.com/EricPyZhou) - Google Cloud port and various fixes.\n* [Ziad Hany](https://github.com/ziadhany) - SQS client channel.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Ffaaskeeper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspcl%2Ffaaskeeper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspcl%2Ffaaskeeper/lists"}