{"id":13528161,"url":"https://github.com/excamera/mu","last_synced_at":"2025-10-25T10:37:22.723Z","repository":{"id":78126643,"uuid":"62818036","full_name":"excamera/mu","owner":"excamera","description":"Framework to Run General-Purpose Parallel Computations on AWS Lambda","archived":false,"fork":false,"pushed_at":"2018-04-18T23:04:59.000Z","size":17277,"stargazers_count":94,"open_issues_count":6,"forks_count":23,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-04-04T08:11:28.332Z","etag":null,"topics":["aws-lambda","serverless"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/excamera.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2016-07-07T15:32:12.000Z","updated_at":"2024-03-18T19:52:37.000Z","dependencies_parsed_at":"2023-03-06T08:30:24.606Z","dependency_job_id":null,"html_url":"https://github.com/excamera/mu","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/excamera%2Fmu","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/excamera%2Fmu/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/excamera%2Fmu/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/excamera%2Fmu/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/excamera","download_url":"https://codeload.github.com/excamera/mu/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"
github","repositories_count":250937229,"owners_count":21510923,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws-lambda","serverless"],"created_at":"2024-08-01T06:02:15.573Z","updated_at":"2025-10-25T10:37:17.683Z","avatar_url":"https://github.com/excamera.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"[![Build Status](https://travis-ci.org/excamera/mu.svg?branch=master)](https://travis-ci.org/excamera/mu)\n\n# Example (WIP) #\n\nIn this example, we are going to run lambdas that grab PNG files stored on S3 as\n`mybucket:sintel-1k-png16/%08d.png`, encode them 6 frames at a time as Y4M files,\nand upload them to `mybucket:sintel-1k-y4m_06/%08d.y4m`.\n\nIf you want more information on running xc-enc, see\n[src/lambdaize/README\\_xc-enc.md](https://github.com/excamera/mu/tree/master/src/lambdaize/README_xc-enc.md).\n\n## Prerequisites ##\n\nI assume that you've already got the `mybucket:sintel-1k-png16/%08d.png` files. 
You should\nget these [from Xiph](http://media.xiph.org/sintel/sintel-1k-png16/) and upload them to S3.\n\nI also assume you're using a Debian-ish system of recent vintage (I'm running Debian testing\nas of September 2016).\n\nYou will need the following packages:\n\n    apt-get install build-essential g++-5 automake pkg-config \\\n                    python-dev python-boto3 libssl-dev python-openssl \\\n                    libpng-dev zlib1g-dev libtool libtool-bin awscli\n\nYou'll also need an AWS ID, both for the\n[AWS CLI](http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html)\nand for the mu scripts (after you've run `aws configure`, your credentials will be in `~/.aws/credentials`).\nYou will also need a lambda\n[execution role](http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example-create-iam-role.html).\nPut these in your environment now so that you don't forget!\n\n    export AWS_ACCESS_KEY_ID=xxxxxx\n    export AWS_SECRET_ACCESS_KEY=yyyyyy\n    export AWS_ROLE=arn:aws:iam::0123456789:role/somerole\n\n## Getting started: building binaries ##\n\nTo start, let's build the [mu](https://github.com/excamera/mu) repository:\n\n    mkdir -p /tmp/mu_example\n    cd /tmp/mu_example\n    git clone https://github.com/excamera/mu\n    cd mu\n    ./autogen.sh\n    ./configure\n    make -j$(nproc)\n\nThe other thing we'll need is the [daala\\_tools](https://github.com/excamera/daala_tools) repo,\nwhich contains the `png2y4m` tool we are going to run on each lambda worker.\n\n**Important:** note `STATIC=1` in the `make` invocation. 
The\n[lambda environment](http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html)\nprobably does not have the same system libraries as our machine, so to be safe, we should only\nuse statically linked binaries on lambda workers.\n\n    cd /tmp/mu_example\n    git clone https://github.com/excamera/daala_tools\n    cd daala_tools\n    make -j$(nproc) STATIC=1\n\n## Assembling the lambda function ##\n\nThe next step is preparing a lambda function. Our goal is for the lambda to execute a command\nlike `./png2y4m -o /tmp/somefile.y4m /tmp/%08d.png`, which will convert PNGs to a Y4M.  (Don't\nworry, we'll figure out how the PNGs get downloaded below.)\n\nTo do this, we'll invoke the `lambdaize.sh` script in the `mu` repo:\n\n    cd /tmp/mu_example\n    MEM_SIZE=1536 TIMEOUT=180 ./mu/src/lambdaize/lambdaize.sh \\\n        ./daala_tools/png2y4m \\\n        '' \\\n        '-i -d -o ##OUTFILE## ##INFILE##'\n\n`MEM_SIZE` and `TIMEOUT` are configuration options for the lambda function.  Note that this\ncommand will use `AWS_ROLE` (see above) as the role for executing the lambda function we've\njust created. The command's output looks something like:\n\n    {\n        \"CodeSize\": 3996942,\n        \"LastModified\": \"2016-09-01T00:00:00.000+0000\",\n        \"MemorySize\": 1536,\n        \"CodeSha256\": \"yv+mJC0/2hsjTcu3BpFwWyhix1YVRimph8O1y8Oy/Lw=\",\n        \"Description\": \"png2y4m\",\n        \"FunctionName\": \"png2y4m_cP4Mf5pn\",\n        \"Role\": \"arn:aws:iam::0123456789:role/somerole\",\n        \"Handler\": \"lambda_function.lambda_handler\",\n        \"Runtime\": \"python2.7\",\n        \"Timeout\": 180,\n        \"Version\": \"1\",\n        \"FunctionArn\": \"arn:aws:lambda:us-east-1:0123456789:function:png2y4m_cP4Mf5pn\"\n    }\n\nYour new lambda function's name is `png2y4m_cP4Mf5pn`, and you will find a correspondingly-named\nzipfile in `/tmp/mu_example`. 
`lambdaize.sh` generates a random suffix and appends it to the\nlambda function name to avoid collisions with existing functions.  If you forget the name\nof your function, you can invoke `aws lambda list-functions`.\n\n## Coordinating server ##\n\nFinally, we will run a server to launch and coordinate the lambda instances. The full script is in\n[mu/src/lambdaize/png2y4m\\_server.py](https://github.com/excamera/mu/blob/master/src/lambdaize/png2y4m_server.py).\n\n    Usage: ./png2y4m_server.py [args ...]\n\n    You must also set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY envvars.\n\n      switch         description                                     default\n      --             --                                              --\n      -h:            show this message\n      -D:            enable debug                                    (disabled)\n      -O oFile:      state machine times output file                 (None)\n      -P pFile:      profiling data output file                      (None)\n\n      -n nParts:     launch nParts lambdas                           (1)\n      -f nFrames:    number of frames to process in each chunk       (6)\n      -o nOffset:    skip this many input chunks when processing     (0)\n\n      -v vidName:    video name                                      ('sintel-1k')\n      -b bucket:     S3 bucket in which videos are stored            ('excamera-us-east-1')\n      -i inFormat:   input format ('png16', 'y4m_06', etc)           ('png16')\n\n      -t portNum:    listen on portNum                               (13579)\n      -l fnName:     lambda function name                            ('png2y4m')\n      -r r1,r2,...:  comma-separated list of regions                 ('us-east-1')\n\n      -c caCert:     CA certificate file                             (None)\n      -s srvCert:    server certificate file                         (None)\n      -k srvKey:     server key file                                 (None)\n         (hint: 
you can generate new keys with \u003cmu\u003e/bin/genkeys.sh)\n         (hint: you can use CA_CERT, SRV_CERT, SRV_KEY envvars instead)\n\nWe will need to generate SSL certs:\n\n    mkdir -p /tmp/mu_example/ssl\n    cd /tmp/mu_example/ssl\n    /tmp/mu_example/mu/bin/genkeys.sh\n\nNow we're ready to go!\n\n    /tmp/mu_example/mu/src/lambdaize/png2y4m_server.py \\\n        -n 5 \\\n        -l png2y4m_cP4Mf5pn \\\n        -b mybucket \\\n        -c /tmp/mu_example/ssl/ca_cert.pem \\\n        -s /tmp/mu_example/ssl/server_cert.pem \\\n        -k /tmp/mu_example/ssl/server_key.pem\n\nThat's it! You're encoding files.\n\n## In more detail... ##\n\n### pylaunch ###\n\nCoordinating servers use the `pylaunch` module to launch many lambdas at once in parallel.\nThis module is an interface to [liblaunch](https://github.com/excamera/mu/tree/master/src/launch).\nUsage:\n\n    pylaunch.launchpar(num_to_launch, lambda_function_name, \\\n                       access_key_id, secret_access_key, \\\n                       json_payload, [ region1, region2, ... ])\n\n### `machine_state.py` overview ###\n\n[libmu/machine\\_state.py](https://github.com/excamera/mu/tree/master/src/lambdaize/libmu/machine_state.py)\nprovides general functionality for building coordinating servers.\n\nAt a high level, the idea is that we can build a state machine out of these generic classes, and\nthat state machine drives the computation for each worker. Each state in the machine represents\na pair, (expected client message, server command); the client always \"goes first\". Client\nresponses depend on the prior command; all responses indicating success begin with \"OK\".\n(For more information on commands and responses, see\n[libmu/handler.py](https://github.com/excamera/mu/tree/master/src/lambdaize/libmu/handler.py).)\n\nWe represent state machines as subclasses of `MachineState`, which is itself a subclass of\n`SocketNB`. 
`SocketNB` is a wrapper around socket-like objects that handles non-blocking reads\nand writes, a simple chunking protocol, etc.\n\n`MachineState` defines the general state transition framework, but one should probably not inherit\ndirectly from `MachineState`. Instead, most of the time a state will inherit from classes like\n`TerminalState`, `CommandListState`, or `ForLoopState`. These are the three subclasses we\nuse in `png2y4m_server.py`;\n[xcenc\\_server.py](https://github.com/excamera/mu/tree/master/src/lambdaize/xcenc_server.py) encodes\na more complex state machine that makes use of several other subclasses.\n\nImmediately below I give a bit more background on each of the parent classes we use in building the\n`png2y4m_server.py` state machine; further below, I discuss the state machine classes themselves.\n\n#### `TerminalState` ####\n\n`TerminalState` is simple: it's a state from which the machine never transitions. In\n`png2y4m_server.py`, we have `FinalState`, which simply overrides the `extra` attribute to make\nthe string representation of the state more comprehensible in debug mode.\n\nAnother important subclass of `TerminalState` is `ErrorState`. If a state machine enters this\nstate, the server will report a corresponding error after execution.\n\n#### `CommandListState` ####\n\nA `CommandListState` comprises a list of (client response, server command) pairs and tracks progress\nthrough this command list. (One can think of a `CommandListState` as a straight-line sequence\nof independent states.)\n\nThe `commandlist` attribute is a list of strings or tuples from which the `CommandListState`\nbuilds the set of expected responses and the resulting commands. If an entry in `commandlist`\nis a string, this is interpreted as the command that the server will send. 
The state will\nautomatically decide an expected response based on the previous command (or just \"OK\" for the\nfirst command).\n\nIf an entry in `commandlist` is a tuple, this is interpreted as `(client_response, server_command)`.\nThis allows more explicit control over the client's expected response. A special case for both\n`client_response` and `server_command` is `None`. In the case of `client_response`, `None` means\nthat the state machine should immediately send the command and transition to the next state.\nFor `server_command`, `None` means that the server sends no command after the response is received.\nWe will see how both of these are useful later.\n\nAfter a `CommandListState` sends its last command, it transitions to the state whose constructor\nis specified in the `nextState` property.\n\n#### `ForLoopState` ####\n\nA `ForLoopState` encodes a loop with an incrementing counter. `iterKey` is a dictionary key\nassociated with the iteration counter; the counter is stored in the dictionary `self.info`, which\nis always carried from one state to the next. `iterInit` is the first value given to the counter,\nand `iterFin` is the final value. If the value in `self.info` corresponding to the key specified\nby `breakKey` is not `None`, iteration ends the next time the machine reaches the `ForLoopState`.\n\nEach time the state machine enters the `ForLoopState`, it consults the loop counter and decides\nwhether to transition to `loopState` (continue looping) or `exitState` (finish looping).\n\nMost of the time, the `expect` and `command` properties are both `None` for a `ForLoopState`,\ni.e., the state machine transitions to the next state immediately.\n\n### Coordinating png2y4m ###\n\nIn this case, our state machine is pretty simple:\n\n1. Configure the lambda with instance-specific settings.\n2. Retrieve each input PNG from S3.\n3. Run the command on the retrieved files.\n4. 
Upload the resulting Y4M.\n\nBecause each state has to refer to the state that comes after it, the classes corresponding to each\nstate need to be defined in reverse order in the source file. Let's start with `PNG2Y4MConfigState`,\nwhich is the state machine's entry point.\n\n#### `PNG2Y4MConfigState` ####\n\nThis state is a subclass of the `CommandListState` (described above) that sets a few variables\nin the worker. Its constructor first invokes the `CommandListState` constructor, then computes\nthe commands to send based on the worker number and the video being transcoded.\n\nNote that the final command is `None`; the state machine will wait for the response from the\npenultimate command (`seti:nonblock:0`) and immediately transition to the next state.\n\n#### `PNG2Y4MRetrieveLoopState` ####\n\nThis state is a subclass of the `ForLoopState` that controls the number of frames that are\ndownloaded. (Note that the constructor is overridden here because the ServerInfo object might\nbe changed at run time.)\n\nIf the looping is not yet finished, this state goes to `PNG2Y4MRetrieveAndRunState`, else it goes to\n`PNG2Y4MUploadState`.\n\n#### `PNG2Y4MRetrieveAndRunState` ####\n\nThis is once again a `CommandListState` subclass. It sets variables that determine which S3 object\nto retrieve and the corresponding output filename, then retrieves the object. 
Here again we add\na final `None` command to delay transition back to the loop header until the `retrieve:` command\nis complete.\n\nNote that the first `expect` is `None` because every path leading to this state has already\nwaited for outstanding responses from the client; similarly, the final command is `None`,\nwhich makes this state wait for the client's response before transitioning back to the loop header.\n\nNote also that we override the `nextState` property *after* `PNG2Y4MRetrieveLoopState` is defined\nto prevent use-before-define errors.\n\n#### `PNG2Y4MUploadState` ####\n\nAnother `CommandListState`: it runs the `png2y4m` conversion command, uploads the result,\nand then transitions to `FinalState`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexcamera%2Fmu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexcamera%2Fmu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexcamera%2Fmu/lists"}