{"id":30294460,"url":"https://github.com/linkedin/dynoyarn","last_synced_at":"2025-08-17T01:35:05.790Z","repository":{"id":46188348,"uuid":"399942319","full_name":"linkedin/dynoyarn","owner":"linkedin","description":"DynoYARN is a framework to run simulated YARN clusters and workloads for YARN scale testing.","archived":false,"fork":false,"pushed_at":"2023-03-06T13:28:14.000Z","size":127,"stargazers_count":58,"open_issues_count":2,"forks_count":6,"subscribers_count":7,"default_branch":"main","last_synced_at":"2024-04-15T02:00:46.319Z","etag":null,"topics":["hadoop","hadoop-yarn"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/linkedin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-08-25T19:59:29.000Z","updated_at":"2024-02-04T02:07:14.000Z","dependencies_parsed_at":"2022-09-07T08:11:08.397Z","dependency_job_id":null,"html_url":"https://github.com/linkedin/dynoyarn","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/linkedin/dynoyarn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fdynoyarn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fdynoyarn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fdynoyarn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fdynoyarn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/linkedin","download_url":"https://codeload.github.com/linkedin/dynoyarn/tar.g
z/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/linkedin%2Fdynoyarn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270796217,"owners_count":24647319,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-16T02:00:11.002Z","response_time":91,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hadoop","hadoop-yarn"],"created_at":"2025-08-17T01:35:05.070Z","updated_at":"2025-08-17T01:35:05.784Z","avatar_url":"https://github.com/linkedin.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DynoYARN\nDynoYARN is a tool to spin up on-demand YARN clusters and run simulated YARN workloads for scale testing.\nIt can simulate 10,000 node YARN cluster performance on a 100 node Hadoop cluster.\n\nDynoYARN was created to address the following:\n1. Evaluate YARN features and Hadoop version upgrades on resource manager performance\n2. Forecast resource manager performance on large YARN clusters\n\nDynoYARN consists of a \"driver\" application and \"workload\" application. The driver is responsible for spinning up\nthe simulated YARN cluster. 
The driver assumes the resource manager uses the capacity scheduler.\nThe workload is responsible for replaying a trace on the simulated cluster in real-time.\n\nThe driver and workload can be configured to spin up a cluster and replay workloads of arbitrary size, meaning\nDynoYARN can simulate a wide range of scenarios, from replaying previous production performance issues, to\npredicting resource manager performance of future clusters and workloads.\n\nBoth the driver and workload are implemented as YARN applications, so you need a functional Hadoop cluster\nto run the simulation.\n\n## Build\nTo build DynoYARN jars needed to run the simulation, run `./gradlew build` from the root directory.\nThe required jars are in `dynoyarn-driver/build/libs/dynoyarn-driver-*-all.jar` and\n`dynoyarn-generator/build/libs/dynoyarn-generator-*-all.jar`.\n\n## Run\n\nDynoYARN simulations can be run from the command line by manually running the driver and workload applications,\nor through Azkaban (which packages these applications into a single Azkaban job).\n\n### Command Line\n\n#### Prerequisites\n\nOn a machine with Hadoop access, add the following into a directory:\n1. `dynoyarn-driver-*-all.jar` jar\n2. `dynoyarn-generator-*-all.jar` jar\n3. Create a `dynoyarn-site.xml` file. This contains properties which will be added to the simulated cluster daemons\n   (resource manager and node managers). A base config is provided [here](dynoyarn-site.xml).\n4. Create a `dynoyarn.xml` file. This contains properties which will be used for the simulation itself (e.g. number\n   of node managers to spin up, resource capability of each node manager, etc). A base config is provided [here](dynoyarn.xml).\n\nNext, you need a workload trace to replay (see [Workload Spec Format](#workload-spec-format) for more info).\nAn example workload trace is provided [here](workload-example.json). 
Copy the workload trace to be replayed to HDFS:\n\n    hdfs dfs -copyFromLocal workload-example.json /tmp/workload-example.json\n\nIt's useful to run the simulated resource manager on the same node across each simulation. Furthermore, we want\nto ensure the resource manager is running in an isolated environment to accurately reproduce resource manager behavior.\nTo do this, configure `dynoyarn.resourcemanager.node-label` in `dynoyarn.xml` to `dyno` (or any label name you choose),\npick a node in your cluster where you want the simulated resource manager to run (e.g. `hostname:8041`), then run\n`yarn rmadmin -addToClusterNodeLabels dyno; yarn rmadmin -replaceLabelsOnNode hostname:8041=dyno` so that the\nsimulated resource manager will run on `hostname:8041` for each simulation.\n\n#### Running the simulation\n\n1. To run the driver application, run from the directory:\n\n    ```\n    CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob):./:./* java com.linkedin.dynoyarn.DriverClient -hadoop_binary_path /hdfs/path/to/hadoop.tarball.tar.gz -conf dynoyarn.xml -capacity_scheduler_conf /hdfs/path/to/capacity-scheduler.xml\n    ```\n\n  where the `hadoop_binary_path` argument contains the Hadoop binary and conf which the driver components (RM and NMs) will use (you can use the\n  same tarball that you would use when configuring `mapreduce.application.framework.path` for MapReduce jobs),\n  and the `capacity_scheduler_conf` argument contains the capacity scheduler configuration which the driver's RM will use.\n\n  The driver application lifetime is controlled by `dynoyarn.driver.simulation-duration-ms`, after which the\n  application (and simulated cluster) will terminate, and RM app summary and GC logs will be uploaded to HDFS\n  (to `dynoyarn.driver.rm-log-output-path`).\n\n2. 
To run the workload application, run from the directory:\n\n    ```\n    CLASSPATH=$(${HADOOP_HDFS_HOME}/bin/hadoop classpath --glob):./:./* java com.linkedin.dynoyarn.workload.WorkloadClient -workload_spec_location /tmp/workload-example.json -conf dynoyarn.xml -driver_app_id application_1615840027285_57002\n    ```\n\n  where `workload_spec_location` is the location on HDFS containing the trace to rerun,\n  and `driver_app_id` is the YARN app id for the driver app submitted previously.\n\n### Azkaban\n\nTo run a DynoYARN simulation via Azkaban, run `./gradlew build` from the root directory, and upload the resulting\nzip to Azkaban at `dynoyarn-azkaban/build/distributions/dynoyarn-azkaban-*`.\n\n## Workload Spec Format\nThe workload trace is in json format, one app per line. An example app:\n\n    {\n      \"amResourceRequest\": {\n        \"memoryMB\": 2048,\n        \"vcores\": 1\n      },\n      \"appId\": \"application_1605737660848_3450869\",\n      \"appType\": \"MAPREDUCE\",\n      \"queue\": \"default\",\n      \"user\": \"user2\",\n      \"submitTime\": 1607151674623,\n      \"resourceRequestSpecs\": [\n        {\n          \"runtimes\": [13262, 41329],\n          \"numInstances\": 2,\n          \"resource\": {\n            \"memoryMB\": 4096,\n            \"vcores\": 1\n          },\n          \"priority\": 20\n        },\n        {\n          \"runtimes\": [13292],\n          \"numInstances\": 1,\n          \"resource\": {\n            \"memoryMB\": 8192,\n            \"vcores\": 2\n          },\n          \"priority\": 10\n        }\n      ]\n    }\n\nThis was taken from a `MAPREDUCE` app that ran on a production cluster, which ran with id `application_1605737660848_3450869` and\nwas submitted at `1607151674623`. When replaying this app, it will be submitted as user `user2`, to queue `default`. 
The AM will\nrun in a `\u003c2GB, 1 vcore\u003e` container; it will first request two `\u003c4GB, 1 vcore\u003e` containers with priority `20` that run for\nabout 13 and 41 seconds, respectively. Once both containers finish, the app will request a single `\u003c8GB, 2 vcore\u003e` container\nwith priority `10` that runs for about 13 seconds.\n\nThe apps in the trace are submitted to the simulated cluster in relative real-time; in the [example](workload-example.json),\nthe first app was submitted at `1607151674543` and marks the start of the simulation; the second app was submitted at `1607151674623`, and will be\nsubmitted `1607151674623 - 1607151674543 = 80` milliseconds after the first app.\n\nTo generate a trace, you can combine production RM app summary logs with audit logs containing information\non when containers (e.g. mappers/reducers for MapReduce, or executors for Spark) for each application were requested.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fdynoyarn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flinkedin%2Fdynoyarn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flinkedin%2Fdynoyarn/lists"}