{"id":13571345,"url":"https://github.com/neoremind/kraps-rpc","last_synced_at":"2025-04-09T08:08:58.399Z","repository":{"id":57737489,"uuid":"98746822","full_name":"neoremind/kraps-rpc","owner":"neoremind","description":"A RPC framework leveraging Spark RPC module","archived":false,"fork":false,"pushed_at":"2019-03-13T13:47:08.000Z","size":119,"stargazers_count":210,"open_issues_count":6,"forks_count":104,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-04-02T05:08:21.913Z","etag":null,"topics":["rpc","spark"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neoremind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-07-29T16:57:33.000Z","updated_at":"2025-02-21T06:27:29.000Z","dependencies_parsed_at":"2022-08-24T14:57:21.392Z","dependency_job_id":null,"html_url":"https://github.com/neoremind/kraps-rpc","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neoremind%2Fkraps-rpc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neoremind%2Fkraps-rpc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neoremind%2Fkraps-rpc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neoremind%2Fkraps-rpc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neoremind","download_url":"https://codeload.github.com/neoremind/kraps-rpc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"gith
ub","repositories_count":247999860,"owners_count":21031046,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["rpc","spark"],"created_at":"2024-08-01T14:01:01.191Z","updated_at":"2025-04-09T08:08:58.381Z","avatar_url":"https://github.com/neoremind.png","language":"Scala","funding_links":[],"categories":["Scala","开发框架"],"sub_categories":["RPC框架"],"readme":"# kraps-rpc\n[![Build Status](https://travis-ci.org/neoremind/kraps-rpc.svg?branch=master)](https://travis-ci.org/neoremind/kraps-rpc)\n[![Maven Central](https://maven-badges.herokuapp.com/maven-central/net.neoremind/kraps-rpc_2.11/badge.svg)](https://maven-badges.herokuapp.com/maven-central/net.neoremind/kraps-rpc_2.11)\n[![codecov](https://codecov.io/gh/neoremind/kraps-rpc/branch/master/graph/badge.svg)](https://codecov.io/gh/neoremind/kraps-rpc)\n[![Hex.pm](https://img.shields.io/hexpm/l/plug.svg)](http://www.apache.org/licenses/LICENSE-2.0)\n\n\nKraps-rpc is an RPC framework split from [Spark](https://github.com/apache/spark); you can regard it as `spark-rpc` with the word *spark* reversed. \n\nThis module is mainly for studying how RPC works in Spark. Spark consists of many distributed components, such as the driver, master, executor, block manager, etc., and they communicate with each other through RPC. In the Spark project this functionality is sealed in the `Spark-core` module. 
Kraps-rpc separates the core RPC part from it, excluding the security and streaming download features.\n\nThe module is based on Spark 2.1, which eliminated [Akka](http://akka.io/) due to [SPARK-5293](https://issues.apache.org/jira/browse/SPARK-5293).\n\n- [0. Dependency](#0-dependency)\n- [1. How to run](#1-how-to-run)\n  - [1.1 Create an endpoint](#11-create-an-endpoint)\n  - [1.2 Run server](#12-run-server)\n  - [1.3 Client call](#13-client-call)\n- [2. About RpcConf](#2-about-rpcconf)\n- [3. More examples](#3-more-examples)\n- [4. Performance test](#4-performance-test)\n  - [4.1 Test environment](#41-test-environment)\n  - [4.2 Test case](#42-test-case) \n  - [4.3 Test result](#43-test-result) \n- [5. Dependency tree](#5-dependency-tree)\n  \n\n## 0. Dependency\n\nYou can configure your project by including the dependency below; currently it only works with **Scala 2.11**.\n\nMaven:\n\n```\n\u003cdependency\u003e\n    \u003cgroupId\u003enet.neoremind\u003c/groupId\u003e\n    \u003cartifactId\u003ekraps-rpc_2.11\u003c/artifactId\u003e\n    \u003cversion\u003e1.0.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nSBT:\n\n```\n\"net.neoremind\" % \"kraps-rpc_2.11\" % \"1.0.0\"\n```\n\nTo learn more about the dependencies, see the *Dependency tree* section.\n\n## 1. How to run\n\nThe following examples can be found in [kraps-rpc-example](https://github.com/neoremind/kraps-rpc/tree/master/kraps-rpc-example/src/main/scala)\n\n### 1.1 Create an endpoint\n\nCreate an endpoint that contains the business logic you would like to provide as an RPC service. 
Below is a simple example of a hello world echo service.\n\n```\nimport net.neoremind.kraps.rpc._\n\nclass HelloEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {\n\n  override def onStart(): Unit = {\n    println(\"start hello endpoint\")\n  }\n\n  // Handles messages sent with ask; each case must reply through the context.\n  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {\n    case SayHi(msg) =\u003e {\n      println(s\"receive $msg\")\n      context.reply(s\"hi, $msg\")\n    }\n    case SayBye(msg) =\u003e {\n      println(s\"receive $msg\")\n      context.reply(s\"bye, $msg\")\n    }\n  }\n\n  override def onStop(): Unit = {\n    println(\"stop hello endpoint\")\n  }\n}\n\n\ncase class SayHi(msg: String)\n\ncase class SayBye(msg: String)\n\n```\n\n`RpcEndpoint` is where requests are received and handled, like the `actor` notion in Akka. `RpcEndpoint` differentiates `need-not-reply` messages from `need-reply` messages. The former is much like a UDP message (send and forget); the latter follows the TCP way, waiting for one response.\n\n```\n  /**\n   * Process messages from [[RpcEndpointRef.send]] or [[RpcCallContext.reply)]]. If receiving a\n   * unmatched message, [[SparkException]] will be thrown and sent to `onError`.\n   */\n  def receive: PartialFunction[Any, Unit] = {\n    case _ =\u003e throw new SparkException(self + \" does not implement 'receive'\")\n  }\n\n  /**\n   * Process messages from [[RpcEndpointRef.ask]]. If receiving a unmatched message,\n   * [[SparkException]] will be thrown and sent to `onError`.\n   */\n  def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {\n    case _ =\u003e context.sendFailure(new SparkException(self + \" won't reply anything\"))\n  }\n```\n\nAn `RpcCallContext` is provided to the endpoint to separate endpoint logic from the message network transport process. 
It provides functions to reply with a result or send failure information:\n\n- reply(response: Any) : reply one message\n- sendFailure(e: Throwable) : send failure\n\nA series of status callbacks is also provided:\n\n- onError\n- onConnected\n- onDisconnected\n- onNetworkError\n- onStart\n- onStop\n- stop\n\n### 1.2 Run server\n\nThere are a few steps to create an RPC server that provides the `HelloEndpoint` service.\n\n1. Create `RpcEnvServerConfig`. `RpcConf` is where you can specify parameters for the server, as discussed in the section below; `hello-server` is just a simple name with no real use later. Host and port must be specified. Note that if the server cannot bind to the specified port, it will increase the port value by one and try again.\n2. Create `RpcEnv`, which launches the server on a TCP socket at localhost on port 52345.\n3. Create `HelloEndpoint` and set it up with the identifier `hello-service`; clients use this name to call and route to the correct service.\n4. `awaitTermination` blocks the thread and keeps the server running without exiting the JVM.\n\n```\nimport net.neoremind.kraps.RpcConf\nimport net.neoremind.kraps.rpc._\nimport net.neoremind.kraps.rpc.netty.NettyRpcEnvFactory\n\nobject HelloworldServer {\n\n  def main(args: Array[String]): Unit = {\n    val config = RpcEnvServerConfig(new RpcConf(), \"hello-server\", \"localhost\", 52345)\n    val rpcEnv: RpcEnv = NettyRpcEnvFactory.create(config)\n    val helloEndpoint: RpcEndpoint = new HelloEndpoint(rpcEnv)\n    // Register the endpoint under the name clients will look up.\n    rpcEnv.setupEndpoint(\"hello-service\", helloEndpoint)\n    rpcEnv.awaitTermination()\n  }\n}\n```\n\n### 1.3 Client call\n\n#### 1.3.1 Asynchronous invocation\n\nCreating `RpcEnv` is the same as above; here `setupEndpointRef` is used to create a stub that calls the remote server at localhost on port 52345 and routes to `hello-service`.\n\n`Future` is used here for asynchronous invocation.\n\n```\nimport net.neoremind.kraps.RpcConf\nimport net.neoremind.kraps.rpc.{RpcAddress, RpcEndpointRef, 
RpcEnv, RpcEnvClientConfig}\nimport net.neoremind.kraps.rpc.netty.NettyRpcEnvFactory\nimport scala.concurrent.{Await, Future}\nimport scala.concurrent.duration.Duration\nimport scala.concurrent.ExecutionContext.Implicits.global\n\nobject HelloworldClient {\n\n  def main(args: Array[String]): Unit = {\n    val rpcConf = new RpcConf()\n    val config = RpcEnvClientConfig(rpcConf, \"hello-client\")\n    val rpcEnv: RpcEnv = NettyRpcEnvFactory.create(config)\n    // The endpoint name must match the one registered on the server.\n    val endPointRef: RpcEndpointRef = rpcEnv.setupEndpointRef(RpcAddress(\"localhost\", 52345), \"hello-service\")\n    val future: Future[String] = endPointRef.ask[String](SayHi(\"neo\"))\n    future.onComplete {\n      case scala.util.Success(value) =\u003e println(s\"Got the result = $value\")\n      case scala.util.Failure(e) =\u003e println(s\"Got error: $e\")\n    }\n    // Block until the reply arrives so the JVM does not exit early.\n    Await.result(future, Duration.apply(\"30s\"))\n  }\n}\n```\n\n#### 1.3.2 Synchronous invocation\n\nCreating `RpcEnv` is the same as above; here `setupEndpointRef` is used to create a stub that calls the remote server at localhost on port 52345 and routes to `hello-service`.\n\nUse `askWithRetry` instead of `ask` to make the call synchronously. \n\n*Note that in the latest Spark version the method has changed to `askSync`.*\n\n```\nimport net.neoremind.kraps.RpcConf\nimport net.neoremind.kraps.rpc.{RpcAddress, RpcEndpointRef, RpcEnv, RpcEnvClientConfig}\nimport net.neoremind.kraps.rpc.netty.NettyRpcEnvFactory\n\nobject HelloworldClient {\n\n  def main(args: Array[String]): Unit = {\n    import scala.concurrent.ExecutionContext.Implicits.global\n    val rpcConf = new RpcConf()\n    val config = RpcEnvClientConfig(rpcConf, \"hello-client\")\n    val rpcEnv: RpcEnv = NettyRpcEnvFactory.create(config)\n    val endPointRef: RpcEndpointRef = rpcEnv.setupEndpointRef(RpcAddress(\"localhost\", 52345), \"hello-service\")\n    // Blocks the calling thread until the reply is received.\n    val result = endPointRef.askWithRetry[String](SayBye(\"neo\"))\n    println(result)\n  }\n}\n```\n\n## 2. About RpcConf\n\n`RpcConf` is simply `SparkConf` in Spark; there are a couple of parameters that can be adjusted. 
They are listed below; for most of them you can refer to [Spark Configuration](http://spark.apache.org/docs/2.1.0/configuration.html). For example, you can specify a parameter in the following way.\n\n```\nval rpcConf = new RpcConf()\nrpcConf.set(\"spark.rpc.lookupTimeout\", \"2s\") \n```\n\nThe parameters can also be set in VM options like:\n```\n-Dspark.rpc.netty.dispatcher.numThreads=16 -Dspark.rpc.io.threads=8\n```\n\n\n| Configuration                          | Description                              |\n| -------------------------------------- | ---------------------------------------- |\n| spark.rpc.lookupTimeout                | Timeout for RPC remote endpoint lookup. Whenever a call is made, the client always asks the server whether the specific endpoint exists; this is the timeout for that query. Default is 120s. |\n| spark.rpc.askTimeout                   | Timeout for RPC ask operations, default is 120s |\n| spark.rpc.numRetries                   | Number of times to retry connecting, default is 3 |\n| spark.rpc.retry.wait                   | Time to wait between retries, default is 3s |\n| spark.rpc.io.numConnectionsPerPeer     | Number of concurrent connections between two nodes for fetching data. Spark RPC maintains an array of clients and randomly picks one to use. Used on the client side to build a reusable client pool; please always set it to 1, default is 1. |\n| spark.rpc.netty.dispatcher.numThreads  | On the server side, the thread pool size of the actor Inbox dispatcher, which is where endpoint business logic runs. If the number of stalled endpoints reaches this value, new RPC messages can still be accepted, but the server cannot handle them in the endpoint because of the limit. Default is 8. 
|\n| spark.rpc.io.threads                   | For the server and client side netty eventloop, this number is the reactor thread pool size. These threads are responsible for accepting and closing connections, serializing and deserializing byte arrays to RpcMessage objects, and pushing each RpcMessage to the actor pattern based Inbox for the dispatcher to pick up and process. The dispatcher concurrency level is set by `spark.rpc.netty.dispatcher.numThreads`. Default is CPU cores * 2, min is 1. |\n\n## 3. More examples\n\nPlease find more in [test cases](https://github.com/neoremind/kraps-rpc/blob/master/kraps-core/src/test/scala/com/neoremind/kraps/RpcTest.scala).\n\n## 4. Performance test\n\n### 4.1 Test environment\n\nOne server and one client are set up for testing in the same rack, hosted in VMs on different physical machines. The test environment is listed below.\n\n```\nCPU: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz 4 cores\nMemory: 8G\nOS: Linux ap-inf01 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux\nJDK: \nOpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)\nOpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)\n```\n\n### 4.2 Test case\n\nAll performance related test cases can be found in [kraps-rpc-example](https://github.com/neoremind/kraps-rpc/tree/master/kraps-rpc-example/src/main/scala). All parameters are kept at their default values.\n\n[Click here](https://github.com/neoremind/kraps-rpc/blob/master/kraps-rpc-example/src/main/scala/HelloworldServer.scala) to see the server test case. The server run command is \n\n```\njava -server -Xms4096m -Xmx4096m -cp kraps-rpc-example_2.11-1.0.1-SNAPSHOT-jar-with-dependencies.jar HelloworldServer \u003cip\u003e\n```\n\n[Click here](https://github.com/neoremind/kraps-rpc/blob/master/kraps-rpc-example/src/main/scala/PerformanceTestClient.scala) to see the client test case. 
The client run command is\n\n```\njava -server -Xms2048m -Xmx2048m -cp kraps-rpc-example_2.11-1.0.1-SNAPSHOT-jar-with-dependencies.jar PerformanceTestClient \u003cip\u003e \u003cinvocation number\u003e \u003cconcurrent calls\u003e\n```\n\n### 4.3 Test result\n\nPeak QPS reaches more than 18k as the concurrency level goes up.\n\n![](https://github.com/neoremind/mydoc/blob/master/image/kraps_rpc_performance_QPS.png?raw=true)\n\nBelow is the CPU usage of the server VM while the performance tests run.\n\n![](https://github.com/neoremind/mydoc/blob/master/image/kraps_rpc_performance_test_server_cpu_usage.png?raw=true)\n\nBelow is the CPU usage of the client VM while the performance tests run.\n\n![](https://github.com/neoremind/mydoc/blob/master/image/kraps_rpc_performance_test_client_cpu_usage.png?raw=true)\n\nAs shown above, the server workload is not very high during the testing phase, so I think there is still room for higher QPS if more concurrent client calls could be made.\n\n## 5. 
Dependency tree\n\n```\n[INFO] +- org.apache.spark:spark-network-common_2.11:jar:2.1.0:compile\n[INFO] |  +- io.netty:netty-all:jar:4.0.42.Final:compile\n[INFO] |  +- org.apache.commons:commons-lang3:jar:3.5:compile\n[INFO] |  +- org.fusesource.leveldbjni:leveldbjni-all:jar:1.8:compile\n[INFO] |  +- com.fasterxml.jackson.core:jackson-databind:jar:2.6.5:compile\n[INFO] |  +- com.fasterxml.jackson.core:jackson-annotations:jar:2.6.5:compile\n[INFO] |  +- com.google.code.findbugs:jsr305:jar:1.3.9:compile\n[INFO] |  +- org.apache.spark:spark-tags_2.11:jar:2.1.0:compile\n[INFO] |  \\- org.spark-project.spark:unused:jar:1.0.0:compile\n[INFO] +- de.ruedigermoeller:fst:jar:2.50:compile\n[INFO] |  +- com.fasterxml.jackson.core:jackson-core:jar:2.8.8:compile\n[INFO] |  +- org.javassist:javassist:jar:3.21.0-GA:compile\n[INFO] |  +- org.objenesis:objenesis:jar:2.5.1:compile\n[INFO] |  \\- com.cedarsoftware:java-util:jar:1.9.0:compile\n[INFO] |     +- commons-logging:commons-logging:jar:1.1.1:compile\n[INFO] |     \\- com.cedarsoftware:json-io:jar:2.5.1:compile\n[INFO] +- org.scala-lang:scala-library:jar:2.11.8:compile\n[INFO] +- org.slf4j:slf4j-api:jar:1.7.7:compile\n[INFO] +- org.slf4j:slf4j-log4j12:jar:1.7.7:compile\n[INFO] |  \\- log4j:log4j:jar:1.2.17:compile\n[INFO] +- com.google.guava:guava:jar:15.0:compile\n```\n\n## 6. Acknowledgement\n\nThe development of Kraps-rpc is inspired by Spark. Kraps-rpc with Apache2.0 Open Source License retains all copyright, trademark, author’s information from Spark.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneoremind%2Fkraps-rpc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneoremind%2Fkraps-rpc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneoremind%2Fkraps-rpc/lists"}