{"id":18851994,"url":"https://github.com/cubefs/shuttle","last_synced_at":"2025-10-24T09:31:48.592Z","repository":{"id":38410713,"uuid":"483961902","full_name":"cubefs/shuttle","owner":"cubefs","description":"Shuttle：High Available, High Performance Remote Shuffle Service","archived":false,"fork":false,"pushed_at":"2023-03-28T12:21:20.000Z","size":448,"stargazers_count":153,"open_issues_count":5,"forks_count":26,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-05-31T22:07:51.803Z","etag":null,"topics":["distributed","hadoop","remote","shuffle","spark"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cubefs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-21T08:07:57.000Z","updated_at":"2025-05-13T06:14:45.000Z","dependencies_parsed_at":"2024-11-08T03:37:33.496Z","dependency_job_id":"e702fdc8-047a-42a3-9976-ba33f427bd4f","html_url":"https://github.com/cubefs/shuttle","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/cubefs/shuttle","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubefs%2Fshuttle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubefs%2Fshuttle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubefs%2Fshuttle/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubefs%2Fshuttle/manifests","owner_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/owners/cubefs","download_url":"https://codeload.github.com/cubefs/shuttle/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cubefs%2Fshuttle/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271579431,"owners_count":24784250,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-22T02:00:08.480Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed","hadoop","remote","shuffle","spark"],"created_at":"2024-11-08T03:37:28.671Z","updated_at":"2025-10-24T09:31:48.522Z","avatar_url":"https://github.com/cubefs.png","language":"Java","funding_links":[],"categories":["大数据"],"sub_categories":[],"readme":"# Shuttle: High Available, High Performance Remote Shuffle Service\n![thumbnail_1870B2B9@1C5FB502 ECC56662](https://user-images.githubusercontent.com/3745064/165201635-fe39b5ae-8b82-4404-80c4-6626c45f01b3.jpg)\n\n\nShuttle provides a remote shuffle capability that groups shuffle data by partition and dumps it into a distributed file system. \n\nThe goal of Shuttle is to convert small, random I/O into sequential I/O, improving the performance and stability of applications.\nFor more details, see the [design document](docs/server-high-level-design.md).  
\n\n[Detailed introduction](https://mp.weixin.qq.com/s/FMvKGvVYcxNG4dNOFQlF0g)  \nOur email: bigdata-arch@oppo.com\n\nWeChat group QR code:\n\n![qr_download](https://user-images.githubusercontent.com/3745064/171109511-8b8f5db1-3641-42a4-ad01-dfc3a490875d.png)\n\n\n\n\nPlease contact us if you have any questions or suggestions.\n\n## Architecture diagram of the Shuttle system\n\n![shuttle-arch-diagram](https://user-images.githubusercontent.com/3745064/167382634-0557234e-dd15-46fd-be3b-bf7c8a943833.png)\n\n\n## Spark version matching\n\n| Branch name | Spark version |\n| ----------- | ------------- |\n| sp24 | Spark 2.4.x |\n| sp31 | \u003e= Spark 3.0 |\n| master | \u003e= Spark 3.0 |\n\nShuttle supports AQE in Spark 2.x.  \nShuttle supports AQE in Spark 3.x except for local shuffle reads. Therefore, in Spark 3.x, add the following configuration to disable the AQE local shuffle reader:\n```\nspark.sql.adaptive.localShuffleReader.enabled=false\n```\n\n## Build Guide\nUse JDK 8+ and Maven 3+ to build.\n\n## Build the Shuttle server distribution [ shuffle master/worker ]\n`\nsh build/build.sh\n`\n\nThis command generates build/dist/shuttle-rss.zip. 
The directory structure is:\n`\nconf bin lib client\n`\n\n## Build for Docker\n+ First, build the zip package: `sh build/build.sh`.\n+ Create a Docker image: `docker build -t shuttle-rss:1.0 .`\n+ Prepare the config file directory.\n+ Run the service:\n\n  ```\n  docker run \\\n    -d \\\n    -p 19189:19189 \\\n    -p 19188:19188 \\\n    -p 19191:19191 \\\n    -p 19190:19190 \\\n    --env SHUTTLE_HOST_IP=\"share the host ip\" \\\n    -v \"your conf dir\":/usr/local/shuttle-rss/conf \\\n    shuttle-rss:1.0 \\\n    all\n  ```\n\n\n## Build Shuttle servers [ shuffle master/worker ]\n`\nmvn clean package -Pserver -DskipTests\n`\n\nThis command generates the following jars:\n+ shuttle-rss-xxx-master.jar for the RSS Shuffle Master\n+ shuttle-rss-xxx-worker.jar for the RSS Shuffle Worker\n\n## Build Shuttle clients [ shuffle manager/writer/reader ]\n`\nmvn clean package -Pclient -DskipTests\n`\n\nThis command generates the client jar:\n+ shuttle-rss-xxx-client.jar for the RSS Shuffle Clients\n\n## Other dependencies: ZooKeeper and a distributed file system [ HDFS / CFS / Alluxio ]\n\n## Necessary configuration\nModify the following options in conf/rss_env.sh:\n```\n# Define Shuttle rss cluster name\nexport RSS_DATA_CENTER=dc1\nexport RSS_CLUSTER=cluster1\n\n# The root directory where shuffle data is saved\nexport RSS_ROOT_DIR=hdfs://user/rss-data\n\nexport RSS_ZK_SERVERS=10.10.10.1:2181\n```\n\nIf you use HDFS, add core-site.xml and hdfs-site.xml to the conf directory.\n\nIf you use CFS, add cfs-site.xml to the conf directory.\n\nIf you use Alluxio, add alluxio-site.xml to the conf directory.\n\n## How to Run\n### ShuffleMaster\nTo start the ShuffleMaster server from the distribution package as a Java application, run:\n\n`\nsh bin/run_master.sh start\n`\n\nShuffleMaster is an HA service, so you can start additional masters on other machines.\n\n### ShuffleWorker\nTo start the ShuffleWorker server from the distribution package as a Java application, run:\n\n`\nsh bin/run_worker.sh 
start\n`\n\n### Prometheus-compatible metrics support\nTo enable Prometheus metrics support, start the worker or master with the parameter:\n`\n-Dmetrics.export.port=xxxx\n`\nMetrics data URL:\n`\nip:port/metric\n`\n\nRemarks:\nEach ShuffleWorker is configured with its own ports, so you can start multiple shuffle workers on one host by giving them different ports.\n\n### ShuffleClients\nThe client does not need the server-side HDFS, CFS, or similar configuration files. It obtains these configurations automatically from the shuffle master.\n### Static resource allocation\n1. Deploy shuttle-rss-xxx-client.jar to HDFS. Then add the following configuration to your Spark application (adjust the values for your environment):\n```\nspark.dynamicAllocation.enabled                       false\nspark.shuffle.service.enabled                         false\nspark.executor.extraClassPath                         rss-xxx-client.jar\nspark.shuffle.manager                                 org.apache.spark.shuffle.Ors2ShuffleManager\nspark.shuffle.rss.serviceManager.type                 master\nspark.shuffle.rss.serviceRegistry.zookeeper.servers   10.10.10.1:2181\nspark.shuffle.rss.dataCenter                          dc1\nspark.shuffle.rss.cluster                             cluster1\n```\n2. Run your Spark application.\n\nThis setup is convenient for quickly testing RSS, but it is not recommended for production environments.\n\n### Dynamic resource allocation\nSupporting Spark dynamic resource allocation with a remote shuffle service requires modifying the spark-core source code and recompiling it.  For more details, see the Spark community document: [[SPARK-25299][DISCUSSION] Improving Spark Shuffle Reliability](https://docs.google.com/document/d/1uCkzGGVG17oGC6BJ75TpzLAZNorvrAU3FRd2X-rVHSM/edit?ts=5e3c57b8).\n\nThe source code includes spark-core patches for Spark 2.x and 3.x. 
Merge the patch, compile Spark, and then replace spark-core-xx.jar.\n\nChange the configuration of your Spark application:\n```\nspark.dynamicAllocation.enabled                       true\nspark.shuffle.service.enabled                         true\n```\nWith the patch applied, Spark does not connect to the port of the YARN external shuffle service.\n\n## Configuration\n### Environment variables\n\n| Name | Description |\n| :----: | :--------: |\n|RSS_CONF_DIR | Server configuration file directory. Default: \"$PWD/conf\" |\n|RSS_DATA_CENTER | Service data center name |\n|RSS_CLUSTER | Service cluster name |\n|RSS_ROOT_DIR | Root directory to which shuffle data is written |\n|RSS_REGISTER_TYPE | Shuffle worker registration type. Default: \"master\" |\n|RSS_MASTER_NAME | Shuffle master service name |\n|RSS_ZK_SERVERS | ZooKeeper address for service registration and management |\n|RSS_MASTER_JVM_OPTS | Master JVM options |\n|RSS_WORKER_JVM_OPTS | Worker JVM options |\n|RSS_MASTER_MEMORY | Master JVM memory, for example: \"-Xms2g -Xmx2g\" |\n|RSS_WORKER_MEMORY | Worker JVM memory, for example: \"-Xms6g -Xmx6g\" |\n|RSS_MASTER_SERVER_OPTS | Master startup parameters, for example: \"-masterPort 19189\" |\n|RSS_WORKER_SERVER_OPTS | Worker startup parameters, for example: \"-buildConnectionPort 19191 -port 19190\" |\n\n### Master startup parameters\n\n| Name | Description |\n| :----: | :--------: |\n|masterName | Master name |\n|masterPort | Shuffle master RPC port |\n|httpPort | Shuffle master HTTP port |\n|zooKeeperServers | ZooKeeper address for service registration and management |\n|dataCenter | Data center name that the master assigns to workers by default |\n|cluster | Cluster name that the master assigns to workers by default |\n\n### Worker startup parameters\n\n| Name | Description |\n| :----: | :--------: |\n|serviceRegistry | Shuffle worker registration type. Default: master |\n|masterName | Name of the master that the worker service is 
registered to |\n|zooKeeperServers | ZooKeeper address for service registration and management |\n|dataCenter | Name of the data center where the worker is registered |\n|cluster | Name of the cluster where the worker is registered |\n|workerLoadWeight | Worker distribution weight, typically used for heterogeneous clusters. Default: 1 |\n|rootDir | Root directory to which shuffle data is written |\n|buildConnectionPort | Connection control port |\n|port | Data transfer port |\n|dumperThreads | Number of data-writing threads. Default: the number of CPU cores |\n|dumperQueueSize | Maximum queue size of each writing thread. Default: 100 |\n|nettyWorkerThreads | Number of Netty threads. Default: 16 |\n|memoryControlSizeThreshold | Maximum amount of memory used by shuffle data. Default: half of the JVM memory |\n|baseConnections | Base number of tokens for flow control |\n|totalConnections | Maximum number of tokens for flow control |\n|appStorageRetentionMillis | By default, data is deleted immediately after the task ends. If the worker does not receive the task-end notification, the master force-deletes the data directory after this retention time. Default: 8h. 
|\n|appObjRetentionMillis | Maximum lifetime of app, stage, and other data kept in shuffle worker memory. Default: 6h |\n\n### Spark client\n| Name | Description |\n| :----: | :--------: |\n|spark.shuffle.rss.masterName | Master name. The master address is looked up in ZooKeeper by this name |\n|spark.shuffle.rss.serviceManager.type | Shuffle worker registration and management method. Default: master |\n|spark.shuffle.rss.serviceRegistry.zookeeper.servers | ZooKeeper address for service registration and management |\n|spark.shuffle.rss.writer.blockSize | Block size for shuffle write data. Default: 1 MB |\n|spark.shuffle.rss.writer.maxRequestSize | Maximum size of a single network request. Default: 2 MB |\n|spark.shuffle.rss.writer.maxFlyingPackageNum | Maximum number of outstanding network requests. Default: 16. Under high network latency, this limit bounds memory usage |\n|spark.shuffle.rss.memory.threshold | Maximum amount of data buffered in off-heap memory when shuffle write runs in unsafe mode. Default: 512 MB. The buffer reduces network transmission latency and balances network I/O |\n|spark.shuffle.rss.network.ioThreads | Number of network transmission threads used during shuffle write. Default: 2 |\n|spark.shuffle.rss.writer.bufferSpill | Maximum amount of data buffered in memory during shuffle write. Default: 128 MB |\n|spark.shuffle.rss.writer.type | Shuffle write type: auto, bypass, unsafe, or sort. Default: auto, which selects a write type automatically using largely the same logic as Spark sort shuffle |\n|spark.shuffle.rss.partitionCountPerShuffleWorker | Number of partitions each shuffle worker can be allocated. Default: 5. 
|\n|spark.shuffle.rss.read.io.threads | Number of shuffle read threads. Default: 2 |\n|spark.shuffle.rss.read.max.size | Maximum amount of data buffered in memory during shuffle read. Default: 128 MB. This limit reduces the probability of OOM during shuffle read |\n|spark.shuffle.rss.read.merge.size | Amount of data read from the network after which the data is packaged and added to the shuffle read pending queue |\n|spark.shuffle.rss.deleteShuffleDir | Whether to delete the shuffle data directory automatically after the task ends. Default: true |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcubefs%2Fshuttle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcubefs%2Fshuttle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcubefs%2Fshuttle/lists"}