{"id":18470433,"url":"https://github.com/ds2-lab/wukong","last_synced_at":"2025-07-09T18:33:25.603Z","repository":{"id":44348928,"uuid":"212924906","full_name":"ds2-lab/Wukong","owner":"ds2-lab","description":"Wukong: A scalable and locality-enhanced serverless parallel framework (ACM SoCC'20)","archived":false,"fork":false,"pushed_at":"2024-11-02T21:46:49.000Z","size":15301,"stargazers_count":74,"open_issues_count":1,"forks_count":16,"subscribers_count":6,"default_branch":"socc2020","last_synced_at":"2025-05-19T18:11:25.187Z","etag":null,"topics":["analytics","aws","aws-lambda","cloud-computing","dask","data-analytics","faas","linear-algebra","machine-learning","parallel-computing","python","serverless","serverless-computing"],"latest_commit_sha":null,"homepage":"https://ds2-lab.github.io/Wukong/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ds2-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-05T00:55:13.000Z","updated_at":"2025-05-05T01:18:35.000Z","dependencies_parsed_at":"2023-02-17T07:16:09.731Z","dependency_job_id":"1599c00e-fe6d-4cd3-b878-25211c054846","html_url":"https://github.com/ds2-lab/Wukong","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ds2-lab/Wukong","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ds2-lab%2FWukong","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ds2-lab%2FWukong/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ds2-lab%
2FWukong/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ds2-lab%2FWukong/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ds2-lab","download_url":"https://codeload.github.com/ds2-lab/Wukong/tar.gz/refs/heads/socc2020","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ds2-lab%2FWukong/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264502387,"owners_count":23618587,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","aws","aws-lambda","cloud-computing","dask","data-analytics","faas","linear-algebra","machine-learning","parallel-computing","python","serverless","serverless-computing"],"created_at":"2024-11-06T10:13:58.887Z","updated_at":"2025-07-09T18:33:25.546Z","avatar_url":"https://github.com/ds2-lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Wukong\n\n![Logo](https://github.com/ds2-lab/ds2-lab.github.io/blob/master/docs/images/wukong_logo.png)\n\n[![Documentation Status](https://readthedocs.org/projects/leap-wukong/badge/?version=latest)](https://leap-wukong.readthedocs.io/en/latest/?badge=latest)\n\nWukong is a high-performance and highly scalable locality-aware, serverless workflow and DAG engine. Wukong uses serverless computing to accelerate the execution of DAG-based scientific, linear algebra, machine learning, and data analytics workloads. \n\n## What is Wukong?\n\nWukong is a serverless parallel computing framework attuned to FaaS platforms such as AWS Lambda. 
Wukong provides decentralized scheduling using a combination of static and dynamic scheduling. Wukong supports general Python data analytics workloads at any scale. \n\n![Architecture](https://i.imgur.com/QDqMiFs.png \"Wukong's Architecture\")\n\n## Publications\n\nFirst Paper: In Search of a Fast and Efficient Serverless DAG Engine (Appeared at PDSW '19)\nhttps://arxiv.org/abs/1910.05896\n\nLatest Paper: Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing (Appeared at ACM SoCC '20) https://arxiv.org/abs/2010.07268 .\nIf you use our source code for a publication or project, please cite the paper using this [bibtex](#to-cite-wukong).\n\nThis branch contains the source code of Wukong corresponding to the SoCC 2020 publication, which is a later version than the PDSW paper referenced above.\n\n## Installation \n\n**Required Python version**: Python3.12\n\nA majority of the required AWS infrastructure can be created using the provided `aws_setup.py` script in `Wukong/Static Scheduler/install/` directory. Please be sure to read through the `wukong_setup_config.yaml` configuration file located in the same directory prior to running the script. In particular, your public IP address should be added to the configuration file if you'd like SSH to be enabled from your machine to VMs created in the Wukong VPC. \n\nIn addition, there is documentation in the `setup/` directory for additional/supplementary instructions concerning the creation of the AWS infrastructure for Wukong.\n\n### Verifying Your Installation\n\nThere is a sample Redis configuration file available at `Static Scheduler/install/redis.conf`.\n\nSimilarly, there is a simple test application available at `Static Scheduler/simple-test-app.py`. You can use these to test your installation.\n\nStart the proxy by navigating to the `KV Store Proxy` directory and executing the command:\n``` sh\npython3.12 proxy.py --redis 127.0.0.1\n```\n\nYou'll also need to start Redis on your machine. 
In another terminal window, execute the command:\n``` sh\nredis-server \u003cpath/to/redis/config/file\u003e\n```\nYou can use the provided sample Redis configuration file (`Static Scheduler/install/redis.conf`).\n\nFinally, to test your installation, you may run:\n``` sh\npython3.12 simple-test-app.py\n```\nfrom the `Static Scheduler` directory. \n\nIf your installation is working, then you should see, among other things, the following in the program's output:\n```\n...\n[CLIENT] Obtained value for key incr-\u003ctask id\u003e from Redis.\n[CLIENT] Returning {'status': 'OK', 'data': {'incr-\u003ctask id\u003e': 6}} to the user...\nResult: 6\n...\n```\n\n## Code Overview/Explanation \n\n*This section is currently under development...*\n\n### The Static Scheduler\n\nGenerally speaking, a user submits a job by calling the `.compute()` function on an underlying Dask collection. Support for Dask's asynchronous `client.compute()` API is coming soon.\n\nWhen the `.compute()` function is called, the `update_graph()` function is called within the static Scheduler, specifically in **scheduler.py**. This function is responsible for adding computations to the Scheduler's internal graph. It's triggered whenever a Client calls `.submit()`, `.map()`, `.get()`, or `.compute()`. The depth-first search (DFS) method is defined within the `update_graph` function, and the DFS also occurs during the `update_graph` function's execution. \n\nOnce the DFS has completed, the Scheduler will serialize all of the generated paths and store them in the KV Store (Redis). Next, the Scheduler will begin the computation by submitting the leaf tasks to the `BatchedLambdaInvoker` object (which is defined in **batched_lambda_invoker.py**). The \"Leaf Task Invoker\" processes are defined within the `BatchedLambdaInvoker` class as the `invoker_polling_process` function. 
Additionally, the `_background_send` function runs asynchronously on an interval (using [Tornado](https://www.tornadoweb.org/en/stable/)). This function takes whatever tasks have been submitted by the Scheduler and divides them up among itself and the Leaf Task Invoker processes, which then invoke the leaf tasks. \n\nThe Scheduler listens for results from Lambda using a \"Subscriber Process\", which is defined by the `poll_redis_process` function. This process is created in the Scheduler's `start` function. (All of this is defined in **scheduler.py**.) The Scheduler is also executing the `consume_redis_queue()` function asynchronously (i.e., on the [Tornado IOLoop](https://www.tornadoweb.org/en/stable/ioloop.html)). This function processes whatever messages were received by the aforementioned \"Subscriber Process(es)\". Whenever a message is processed, it is passed to the `result_from_lambda()` function, which begins the process of recording the fact that a \"final result\" is available. \n\n### The KV Store Proxy\n\nThis component is used to parallelize Lambda function invocations in the middle of a workload's execution. This is advantageous, as invoking individual AWS Lambda functions has a relatively high overhead (~50ms per invocation). For large fan-outs, this overhead can bottleneck Wukong's performance. The KV Store Proxy alleviates this by leveraging a VM with many cores to parallelize such invocations.\n\n### The AWS Lambda Task Executor\n\nThe Task Executors are responsible for executing tasks and performing dynamic scheduling. Executors cooperate with one another to decide who should execute downstream tasks during fan-ins. Executors communicate through intermediate storage (e.g., Redis). \n\n### Developer Setup Notes\n\nWhen setting up Wukong, make sure to update the variables referencing the name of the AWS Lambda function used as the Wukong Task Executor. 
For example, in \"AWS Lambda Task Executor/function.py\", this is the variable *lambda_function_name*, whose value should be the same as the name of the Lambda function as defined in AWS Lambda itself.\n\nThere is also a variable referencing the function's name in \"Static Scheduler/wukong/batched_lambda_invoker.py\" (as a keyword argument to the constructor of the BatchedLambdaInvoker object) and in \"KV Store Proxy/proxy_lambda_invoker.py\" (also as a keyword argument to the constructor of ProxyLambdaInvoker).\n\nBy default, Wukong is configured to run within the us-east-1 region. If you would like to use a different region, then you need to pass the \"region_name\" parameter to the Lambda Client objects created in \"Static Scheduler/wukong/batched_lambda_invoker.py\", \"KV Store Proxy/proxy_lambda_invoker.py\", \"KV Store Proxy/proxy.py\", \"AWS Lambda Task Executor/function.py\", and \"Static Scheduler/wukong/scheduler.py\". \n\n## Code Examples\n\nIn the following examples, modifying the value of the *chunks* parameter changes the granularity of the tasks generated in the DAG. *chunks* specifies how the initial input data is partitioned. Increasing the size of *chunks* will yield fewer individual tasks, and each task will operate over a larger portion of the input data. Decreasing the size of *chunks* will result in a greater number of individual tasks, with each task operating on a smaller portion of the input data. \n\n### LocalCluster Overview\n```\nLocalCluster(object):\n  host : string\n    The public DNS IPv4 address associated with the EC2 instance on which the Scheduler process is executing, along with the port on \n    which the Scheduler is listening. The format of this string should be \"IPv4:port\". \n  n_workers : int,\n    Artifact from Dask. 
Leave this at zero.\n  proxy_address : string,\n    The public DNS IPv4 address associated with the EC2 instance on which the KV Store Proxy process is executing.\n  proxy_port : 8989,\n    The port on which the KV Store Proxy process is listening.\n  redis_endpoints : list of tuples of the form (string, int)\n    List of the public DNS IPv4 addresses and ports on which KV Store (Redis) instances are listening. The format\n    of this list should be [(\"IP_1\", port_1), (\"IP_2\", port_2), ..., (\"IP_n\", port_n)] \n  num_lambda_invokers : int\n    This value specifies how many 'Initial Task Executor Invokers' should be created by the Scheduler. The 'Initial Task \n    Executor Invokers' are processes that are used by the Scheduler to parallelize the invocation of Task Executors\n    associated with leaf tasks. These are particularly useful for large workloads with a large number of leaf tasks.\n  max_task_fanout : int\n    This specifies the size of a \"fanout\" required for a Task Executor to utilize the KV Store Proxy for parallelizing downstream\n    task invocation. The principle here is the same as with the initial task invokers. Our tests found that invoking Lambda functions\n    takes about 50ms on average. As a result, if a given Task T has a large fanout (i.e., there are a large number of downstream tasks \n    directly dependent on T), then it may be advantageous to parallelize the invocation of these downstream tasks.\n  use_fargate : bool\n    If True, then Wukong will attempt to use AWS Fargate for its intermediate storage. This requires that the AWS Fargate infrastructure\n    already exists and that Wukong has been correctly configured to use it (i.e., passing the required information to the 'LocalCluster'\n    instance). 
This defaults to False, in which case Wukong simply uses a single Redis instance for all intermediate storage.\n  use_local_proxy : bool\n    If True, automatically deploy the KV Store Proxy locally on the same VM as the Static Scheduler.\n    Note that the user should pass a value for the `local_proxy_path` property if setting `use_local_proxy` to True.\n    If not, then Wukong will attempt to locate the proxy in \"../KV Store Proxy/\", which may or may not work\n    depending on where Wukong is being executed from. \n  local_proxy_path: str\n    Fully-qualified path to the KV Store Proxy source code, specifically the proxy.py file. \n    This is only used when `use_local_proxy` is True.\n    Example: \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\"\n```\n\n### Single-Node DAG Example\n```python\nimport dask.array as da\nfrom dask import delayed\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\n# Define a function.\n\ndef incr(x):\n  return x + 1\n  \nexample_computation = delayed(incr)(5)\n\n# Start the workload. 
\nresult = example_computation.compute(scheduler = client.get)\nprint(\"Result: %d\" % result)  \n```\n\n### 3-Node DAG Examples\n```python\nimport dask.array as da\nfrom dask import delayed\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\n# Define some functions.\n\ndef incr(x):\n  return x + 1\n\ndef decr(x):\n  return x - 1\n\ndef double(x):\n  return x * 2\n\ndef add_values(x, y):\n  return x + y \n  \n# Linear 3-Node DAG\na = delayed(incr)(5)\nb = delayed(decr)(a)\nc = delayed(double)(b)\n\nresult1 = c.compute(scheduler = client.get)\nprint(\"Result: %d\" % result1)  \n\n# 3-Node DAG with a Fan-In\nx = delayed(incr)(3)\ny = delayed(decr)(7)\nz = delayed(add_values)(x,y)\n\nresult2 = z.compute(scheduler = client.get)\nprint(\"Result: %d\" % result2)  \n\n```\n\n### Tree Reduction\n``` python\nfrom dask import delayed \nimport operator \nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. 
Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\nL = range(1024)\nwhile len(L) \u003e 1:\n  L = list(map(delayed(operator.add), L[0::2], L[1::2]))\n\n# Start the computation.\nL[0].compute(scheduler = client.get)\n```\n\n### SVD of 'Tall-and-Skinny' Matrix \n```python\nimport dask.array as da\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\n# Compute the SVD of 'Tall-and-Skinny' Matrix \nX = da.random.random((200000, 1000), chunks=(10000, 1000))\nu, s, v = da.linalg.svd(X)\n\n# Start the computation.\nv.compute(scheduler = client.get)\n```\n\n### SVD of Square Matrix with Approximation Algorithm\n```python\nimport dask.array as da\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. 
Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\n# Compute a rank-5 approximate SVD of a square matrix \nX = da.random.random((10000, 10000), chunks=(2000, 2000))\nu, s, v = da.linalg.svd_compressed(X, k=5)\n\n# Start the computation.\nv.compute(scheduler = client.get)\n```\n\n### GEMM (Matrix Multiplication) \n``` python\nimport dask.array as da\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. 
Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\nx = da.random.random((10000, 10000), chunks = (1000, 1000))\ny = da.random.random((10000, 10000), chunks = (1000, 1000))\nz = da.matmul(x, y)\n\n# Start the computation.\nz.compute(scheduler = client.get) \n```\n\n### Parallelizing Prediction (sklearn.svm.SVC)\n``` python\nimport pandas as pd\nimport seaborn as sns\nimport sklearn.datasets\nfrom sklearn.svm import SVC\n\nimport dask_ml.datasets\nfrom dask_ml.wrappers import ParallelPostFit\nfrom wukong import LocalCluster, Client\nlocal_cluster = LocalCluster(host='\u003cprivate IPv4 of Static Scheduler VM\u003e:8786',\n                  proxy_address = '\u003cprivate IPv4 of KV Store Proxy VM\u003e', \n                  num_lambda_invokers = 4,\n                  # Automatically create proxy locally. 
Pass same IPv4 for `host` and `proxy_address`\n                  use_local_proxy = True, \n                  # Path to `proxy.py` file.\n                  local_proxy_path = \"/home/ec2-user/Wukong/KV Store Proxy/proxy.py\",\n                  redis_endpoints = [(\"\u003cprivate IPv4 of Static Scheduler VM\u003e\", 6379)],\n                  use_fargate = False) \nclient = Client(local_cluster)\n\nX, y = sklearn.datasets.make_classification(n_samples=1000)\nclf = ParallelPostFit(SVC(gamma='scale'))\nclf.fit(X, y)\n\nX, y = dask_ml.datasets.make_classification(n_samples=800000,\n                                            random_state=800000,\n                                            chunks=800000 // 20)\n\n# Start the computation.\nclf.predict(X).compute(scheduler = client.get)\n\n```\n\n## To Cite Wukong\n\n```\n@inproceedings{socc20-wukong,\nauthor = {Carver, Benjamin and Zhang, Jingyuan and Wang, Ao and Anwar, Ali and Wu, Panruo and Cheng, Yue},\n  title = {Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing},\n  year = {2020},\n  isbn = {9781450381376},\n  publisher = {Association for Computing Machinery},\n  url = {https://doi.org/10.1145/3419111.3421286},\n  doi = {10.1145/3419111.3421286},\n  series = {SoCC '20}\n}\n```\n\n```\n@INPROCEEDINGS {pdsw19-wukong,\nauthor = {B. Carver and J. Zhang and A. Wang and Y. Cheng},\n  booktitle = {2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW)},\n  title = {In Search of a Fast and Efficient Serverless DAG Engine},\n  year = {2019},\n  doi = {10.1109/PDSW49588.2019.00005},\n  url = {https://doi.ieeecomputersociety.org/10.1109/PDSW49588.2019.00005},\n  publisher = {IEEE Computer Society}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fds2-lab%2Fwukong","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fds2-lab%2Fwukong","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fds2-lab%2Fwukong/lists"}