{"id":13701391,"url":"https://github.com/tobegit3hub/simple_tensorflow_serving","last_synced_at":"2025-10-25T17:18:24.150Z","repository":{"id":50292909,"uuid":"118567143","full_name":"tobegit3hub/simple_tensorflow_serving","owner":"tobegit3hub","description":"Generic and easy-to-use serving service for machine learning models","archived":false,"fork":false,"pushed_at":"2025-03-20T19:08:18.000Z","size":27322,"stargazers_count":757,"open_issues_count":29,"forks_count":191,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-05-11T08:47:28.987Z","etag":null,"topics":["client","deep-learning","http","machine-learning","savedmodel","serving","tensorflow","tensorflow-models"],"latest_commit_sha":null,"homepage":"https://stfs.readthedocs.io","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tobegit3hub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-01-23T06:21:02.000Z","updated_at":"2025-05-04T20:16:16.000Z","dependencies_parsed_at":"2025-04-08T12:01:16.617Z","dependency_job_id":"44850cfb-8bd0-416e-9dfe-3ee61fbdff1c","html_url":"https://github.com/tobegit3hub/simple_tensorflow_serving","commit_stats":null,"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobegit3hub%2Fsimple_tensorflow_serving","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobegit3hub%2Fsimple_tensorflow_serving/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/tobegit3hub%2Fsimple_tensorflow_serving/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobegit3hub%2Fsimple_tensorflow_serving/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tobegit3hub","download_url":"https://codeload.github.com/tobegit3hub/simple_tensorflow_serving/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254140760,"owners_count":22021219,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["client","deep-learning","http","machine-learning","savedmodel","serving","tensorflow","tensorflow-models"],"created_at":"2024-08-02T20:01:34.983Z","updated_at":"2025-10-25T17:18:19.107Z","avatar_url":"https://github.com/tobegit3hub.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Simple TensorFlow Serving\n\n![](./images/simple_tensorflow_serving_introduction.jpeg)\n\n## Introduction\n\nSimple TensorFlow Serving is the generic and easy-to-use serving service for machine learning models. 
Read more in \u003chttps://stfs.readthedocs.io\u003e.\n\n* [x] Support distributed TensorFlow models\n* [x] Support the general RESTful/HTTP APIs\n* [x] Support inference with accelerated GPU\n* [x] Support `curl` and other command-line tools\n* [x] Support clients in any programming language\n* [x] Support code-gen client by models without coding\n* [x] Support inference with raw file for image models\n* [x] Support statistical metrics for verbose requests\n* [x] Support serving multiple models at the same time\n* [x] Support dynamic online and offline for model versions\n* [x] Support loading new custom op for TensorFlow models\n* [x] Support secure authentication with configurable basic auth\n* [x] Support multiple models of TensorFlow/MXNet/PyTorch/Caffe2/CNTK/ONNX/H2o/Scikit-learn/XGBoost/PMML/Spark MLlib\n\n## Installation\n\nInstall the server with [pip](https://pypi.python.org/pypi/simple-tensorflow-serving).\n\n```bash\npip install simple_tensorflow_serving\n```\n\nOr install from [source code](https://github.com/tobegit3hub/simple_tensorflow_serving).\n\n```bash\npython ./setup.py install\n\npython ./setup.py develop\n\nbazel build simple_tensorflow_serving:server\n```\n\nOr use the [docker image](https://hub.docker.com/r/tobegit3hub/simple_tensorflow_serving/).\n\n```bash\ndocker run -d -p 8500:8500 tobegit3hub/simple_tensorflow_serving\n\ndocker run -d -p 8500:8500 tobegit3hub/simple_tensorflow_serving:latest-gpu\n\ndocker run -d -p 8500:8500 tobegit3hub/simple_tensorflow_serving:latest-hdfs\n\ndocker run -d -p 8500:8500 tobegit3hub/simple_tensorflow_serving:latest-py34\n```\n\n````bash\ndocker-compose up -d\n````\n\nOr deploy in [Kubernetes](https://kubernetes.io/).\n\n```bash\nkubectl create -f ./simple_tensorflow_serving.yaml\n```\n\n## Quick Start\n\nStart the server with the TensorFlow [SavedModel](https://www.tensorflow.org/programmers_guide/saved_model).\n\n```bash\nsimple_tensorflow_serving 
--model_base_path=\"./models/tensorflow_template_application_model\"\n```\n\nCheck out the dashboard at [http://127.0.0.1:8500](http://127.0.0.1:8500) in a web browser.\n \n![dashboard](./images/dashboard.png)\n\nGenerate a Python client and access the model with test data without writing any code.\n\n```bash\ncurl http://localhost:8500/v1/models/default/gen_client?language=python \u003e client.py\n```\n\n```bash\npython ./client.py\n```\n\n## Advanced Usage\n\n### Multiple Models\n\nIt supports serving multiple models and multiple versions of each model. You can run the server with this configuration.\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"tensorflow_template_application_model\",\n      \"base_path\": \"./models/tensorflow_template_application_model/\",\n      \"platform\": \"tensorflow\"\n    }, {\n      \"name\": \"deep_image_model\",\n      \"base_path\": \"./models/deep_image_model/\",\n      \"platform\": \"tensorflow\"\n    }, {\n      \"name\": \"mxnet_mlp_model\",\n      \"base_path\": \"./models/mxnet_mlp/mx_mlp\",\n      \"platform\": \"mxnet\"\n    }\n  ]\n}\n```\n\n```bash\nsimple_tensorflow_serving --model_config_file=\"./examples/model_config_file.json\"\n```\n\nAdded or removed model versions are detected automatically, and the latest files are reloaded in memory. 
You can choose a specific model and version for inference.\n\n```python\nimport requests\n\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1],\n                   [1, 1, 1, 1, 1, 1, 1, 1, 1]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\n```\n\n### GPU Acceleration\n\nIf you want to use a GPU, use the docker image with the GPU tag and put the CUDA files in `/usr/cuda_files/`.\n\n```bash\nexport CUDA_SO=\"-v /usr/cuda_files/:/usr/cuda_files/\"\nexport DEVICES=$(\\ls /dev/nvidia* | xargs -I{} echo '--device {}:{}')\nexport LIBRARY_ENV=\"-e LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/cuda_files\"\n\ndocker run -it -p 8500:8500 $CUDA_SO $DEVICES $LIBRARY_ENV tobegit3hub/simple_tensorflow_serving:latest-gpu\n```\n\nYou can set the session config and GPU options with a command-line parameter or in the model config file.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/tensorflow_template_application_model\" --session_config='{\"log_device_placement\": true, \"allow_soft_placement\": true, \"allow_growth\": true, \"per_process_gpu_memory_fraction\": 0.5}'\n```\n\n```json\n{\n  \"model_config_list\": [\n    {\n      \"name\": \"default\",\n      \"base_path\": \"./models/tensorflow_template_application_model/\",\n      \"platform\": \"tensorflow\",\n      \"session_config\": {\n        \"log_device_placement\": true,\n        \"allow_soft_placement\": true,\n        \"allow_growth\": true,\n        \"per_process_gpu_memory_fraction\": 0.5\n      }\n    }\n  ]\n}\n```\n\n### Generated Client\n\nYou can generate test JSON data for the online models.\n\n```bash\ncurl http://localhost:8500/v1/models/default/gen_json\n```\n\nOr generate clients in different languages (Bash, Python, Golang, JavaScript, etc.) 
for your model without writing any code.\n\n```bash\ncurl http://localhost:8500/v1/models/default/gen_client?language=python \u003e client.py\ncurl http://localhost:8500/v1/models/default/gen_client?language=bash \u003e client.sh\ncurl http://localhost:8500/v1/models/default/gen_client?language=golang \u003e client.go\ncurl http://localhost:8500/v1/models/default/gen_client?language=javascript \u003e client.js\n```\n\nThe generated code should look like the following and can be tested immediately.\n\n```python\n#!/usr/bin/env python\n\nimport requests\n\ndef main():\n  endpoint = \"http://127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"data\": {\"keys\": [[1], [1]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]} }\n  result = requests.post(endpoint, json=json_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n```python\n#!/usr/bin/env python\n\nimport requests\n\ndef main():\n  endpoint = \"http://127.0.0.1:8500\"\n\n  input_data = {\"keys\": [[1.0], [1.0]], \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]}\n  result = requests.post(endpoint, json=input_data)\n  print(result.text)\n\nif __name__ == \"__main__\":\n  main()\n```\n\n### Image Model\n\nFor image models, we can send raw image files in the request instead of constructing array data.\n\nNow start serving an image model such as [deep_image_model](https://github.com/tobegit3hub/deep_image_model).\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/deep_image_model/\"\n```\n\nThen send a request with a raw image file that has the same shape as your model's input.\n\n```bash\ncurl -X POST -F 'image=@./images/mew.jpg' -F \"model_version=1\" 127.0.0.1:8500\n```\n\n### TensorFlow Estimator Model\n\nIf we use the TensorFlow Estimator API to export the model, the model signature should look like this.\n\n```\ninputs {\n  key: \"inputs\"\n  value {\n    name: 
\"input_example_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"classes\"\n  value {\n    name: \"linear/binary_logistic_head/_classification_output_alternatives/classes_tensor:0\"\n    dtype: DT_STRING\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: -1\n      }\n    }\n  }\n}\noutputs {\n  key: \"scores\"\n  value {\n    name: \"linear/binary_logistic_head/predictions/probabilities:0\"\n    dtype: DT_FLOAT\n    tensor_shape {\n      dim {\n        size: -1\n      }\n      dim {\n        size: 2\n      }\n    }\n  }\n}\nmethod_name: \"tensorflow/serving/classify\"\n```\n\nWe need to construct the string tensor for inference and use base64 to encode the string for HTTP. Here is the example Python code.\n\n```python\ndef _float_feature(value):\n  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))\n\ndef _bytes_feature(value):\n  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))\n\ndef main():\n  # Raw input data\n  feature_dict = {\"a\": _bytes_feature(\"10\"), \"b\": _float_feature(10)}\n\n  # Create Example as base64 string\n  example_proto = tf.train.Example(features=tf.train.Features(feature=feature_dict))\n  tensor_proto = tf.contrib.util.make_tensor_proto(example_proto.SerializeToString(), dtype=tf.string)\n  tensor_string = tensor_proto.string_val.pop()\n  base64_tensor_string = base64.urlsafe_b64encode(tensor_string)\n\n  # Request server\n  endpoint = \"http://127.0.0.1:8500\"\n  json_data = {\"model_name\": \"default\", \"base64_decode\": True, \"data\": {\"inputs\": [base64_tensor_string]}}\n  result = requests.post(endpoint, json=json_data)\n  print(result.json())\n```\n\n### Custom Op\n\nIf your models rely on new TensorFlow [custom op](https://www.tensorflow.org/extend/adding_an_op), you can run the server while loading the so files.\n\n```bash\nsimple_tensorflow_serving 
--model_base_path=\"./model/\" --custom_op_paths=\"./foo_op/\"\n```\n\nPlease check out the complete example in [./examples/custom_op/](./examples/custom_op/).\n\n### Authentication\n\nFor enterprise use, we can enable basic auth for all the APIs so that any anonymous request is denied.\n\nNow start the server with the configured username and password.\n\n```bash\n./server.py --model_base_path=\"./models/tensorflow_template_application_model/\" --enable_auth=True --auth_username=\"admin\" --auth_password=\"admin\"\n```\n\nIf you are using the web dashboard, just enter your credentials. If you are using clients, pass the username and password with the request.\n\n```bash\ncurl -u admin:admin -H \"Content-Type: application/json\" -X POST -d '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}' http://127.0.0.1:8500\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"data\": {\n      \"keys\": [[11.0], [2.0]],\n      \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]\n  }\n}\nauth = requests.auth.HTTPBasicAuth(\"admin\", \"admin\")\nresult = requests.post(endpoint, json=input_data, auth=auth)\n```\n\n### TLS/SSL\n\nIt supports TLS/SSL, and you can generate a self-signed certificate and key for testing.\n\n```bash\nopenssl req -x509 -newkey rsa:4096 -nodes -out /tmp/secret.pem -keyout /tmp/secret.key -days 365\n```\n\nThen run the server with the certificate files.\n\n```bash\nsimple_tensorflow_serving --enable_ssl=True --secret_pem=/tmp/secret.pem --secret_key=/tmp/secret.key --model_base_path=\"./models/tensorflow_template_application_model\"\n```\n\n## Supported Models\n\nFor MXNet models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/mxnet_mlp/mx_mlp\" --model_platform=\"mxnet\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  
\"model_version\": 1,\n  \"data\": {\n      \"data\": [[12.0, 2.0]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor ONNX models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/onnx_mnist_model/onnx_model.proto\" --model_platform=\"onnx\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor H2o models, you can load with commands and configuration like these.\n\n```bash\n# Start H2o server with \"java -jar h2o.jar\"\n\nsimple_tensorflow_serving --model_base_path=\"./models/h2o_prostate_model/GLM_model_python_1525255083960_17\" --model_platform=\"h2o\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor Scikit-learn models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/scikitlearn_iris/model.joblib\" --model_platform=\"scikitlearn\"\n\nsimple_tensorflow_serving --model_base_path=\"./models/scikitlearn_iris/model.pkl\" --model_platform=\"scikitlearn\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor XGBoost models, you can load with commands and configuration like these.\n\n```bash\nsimple_tensorflow_serving --model_base_path=\"./models/xgboost_iris/model.bst\" --model_platform=\"xgboost\"\n\nsimple_tensorflow_serving 
--model_base_path=\"./models/xgboost_iris/model.joblib\" --model_platform=\"xgboost\"\n\nsimple_tensorflow_serving --model_base_path=\"./models/xgboost_iris/model.pkl\" --model_platform=\"xgboost\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\nFor PMML models, you can load with commands and configuration like these. This relies on [Openscoring](https://github.com/openscoring/openscoring) and [Openscoring-Python](https://github.com/openscoring/openscoring-python) to load the models.\n\n```bash\njava -jar ./third_party/openscoring/openscoring-server-executable-1.4-SNAPSHOT.jar\n\nsimple_tensorflow_serving --model_base_path=\"./models/pmml_iris/DecisionTreeIris.pmml\" --model_platform=\"pmml\"\n```\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\ninput_data = {\n  \"model_name\": \"default\",\n  \"model_version\": 1,\n  \"data\": {\n      \"data\": [[...]]\n  }\n}\nresult = requests.post(endpoint, json=input_data)\nprint(result.text)\n```\n\n\n## Supported Client\n\nHere is the example client in [Bash](./bash_client/).\n\n```bash\ncurl -H \"Content-Type: application/json\" -X POST -d '{\"data\": {\"keys\": [[1.0], [2.0]], \"features\": [[10, 10, 10, 8, 6, 1, 8, 9, 1], [6, 2, 1, 1, 1, 1, 7, 1, 1]]}}' http://127.0.0.1:8500\n```\n\nHere is the example client in [Python](./python_client/).\n\n```python\nendpoint = \"http://127.0.0.1:8500\"\npayload = {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n\nresult = requests.post(endpoint, json=payload)\n```\n\nHere is the example client in [C++](./cpp_client/).\n\nHere is the example client in [Java](./java_client/).\n\nHere is the example client in [Scala](./scala_client/).\n\nHere is the example client in [Go](./go_client/).\n\n```go\nendpoint := 
\"http://127.0.0.1:8500\"\ndataByte := []byte(`{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}`)\nvar dataInterface map[string]interface{}\njson.Unmarshal(dataByte, \u0026dataInterface)\ndataJson, _ := json.Marshal(dataInterface)\n\nresp, err := http.Post(endpoint, \"application/json\", bytes.NewBuffer(dataJson))\n```\n\nHere is the example client in [Ruby](./ruby_client/).\n\n```ruby\nendpoint = \"http://127.0.0.1:8500\"\nuri = URI.parse(endpoint)\nheader = {\"Content-Type\" =\u003e \"application/json\"}\ninput_data = {\"data\" =\u003e {\"keys\"=\u003e [[11.0], [2.0]], \"features\"=\u003e [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\nhttp = Net::HTTP.new(uri.host, uri.port)\nrequest = Net::HTTP::Post.new(uri.request_uri, header)\nrequest.body = input_data.to_json\n\nresponse = http.request(request)\n```\n\nHere is the example client in [JavaScript](./javascript_client/).\n\n```javascript\nvar options = {\n    uri: \"http://127.0.0.1:8500\",\n    method: \"POST\",\n    json: {\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\n};\n\nrequest(options, function (error, response, body) {});\n```\n\nHere is the example client in [PHP](./php_client/).\n\n```php\n$endpoint = \"127.0.0.1:8500\";\n$inputData = array(\n    \"keys\" =\u003e [[11.0], [2.0]],\n    \"features\" =\u003e [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]],\n);\n$jsonData = array(\n    \"data\" =\u003e $inputData,\n);\n$ch = curl_init($endpoint);\ncurl_setopt_array($ch, array(\n    CURLOPT_POST =\u003e TRUE,\n    CURLOPT_RETURNTRANSFER =\u003e TRUE,\n    CURLOPT_HTTPHEADER =\u003e array(\n        \"Content-Type: application/json\"\n    ),\n    CURLOPT_POSTFIELDS =\u003e json_encode($jsonData)\n));\n\n$response = curl_exec($ch);\n```\n\nHere is the example client in 
[Erlang](./erlang_client/).\n\n```erlang\nssl:start(),\napplication:start(inets),\nhttpc:request(post,\n  {\"http://127.0.0.1:8500\", [],\n  \"application/json\",\n  \"{\\\"data\\\": {\\\"keys\\\": [[11.0], [2.0]], \\\"features\\\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}\"\n  }, [], []).\n```\n\nHere is the example client in [Lua](./lua_client/).\n\n```lua\nlocal endpoint = \"http://127.0.0.1:8500\"\nkeys_array = {}\nkeys_array[1] = {1.0}\nkeys_array[2] = {2.0}\nfeatures_array = {}\nfeatures_array[1] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nfeatures_array[2] = {1, 1, 1, 1, 1, 1, 1, 1, 1}\nlocal input_data = {\n    [\"keys\"] = keys_array,\n    [\"features\"] = features_array,\n}\nlocal json_data = {\n    [\"data\"] = input_data\n}\nrequest_body = json:encode (json_data)\nlocal response_body = {}\n\nlocal res, code, response_headers = http.request{\n    url = endpoint,\n    method = \"POST\", \n    headers = \n      {\n          [\"Content-Type\"] = \"application/json\";\n          [\"Content-Length\"] = #request_body;\n      },\n      source = ltn12.source.string(request_body),\n      sink = ltn12.sink.table(response_body),\n}\n```\n\nHere is the example client in [Rust](./swift_client/).\n\nHere is the example client in [Swift](./swift_client/).\n\nHere is the example client in [Perl](./perl_client/).\n\n```perl\nmy $endpoint = \"http://127.0.0.1:8500\";\nmy $json = '{\"data\": {\"keys\": [[11.0], [2.0]], \"features\": [[1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1]]}}';\nmy $req = HTTP::Request-\u003enew( 'POST', $endpoint );\n$req-\u003eheader( 'Content-Type' =\u003e 'application/json' );\n$req-\u003econtent( $json );\n$ua = LWP::UserAgent-\u003enew;\n\n$response = $ua-\u003erequest($req);\n```\n\nHere is the example client in [Lisp](./swift_client/).\n\nHere is the example client in [Haskell](./swift_client/).\n\nHere is the example client in [Clojure](./clojure_client/).\n\nHere is the example client in [R](./r_client/).\n\n```r\nendpoint 
\u003c- \"http://127.0.0.1:8500\"\nbody \u003c- list(data = list(a = 1), keys = 1)\njson_data \u003c- list(\n  data = list(\n    keys = list(list(1.0), list(2.0)), features = list(list(1, 1, 1, 1, 1, 1, 1, 1, 1), list(1, 1, 1, 1, 1, 1, 1, 1, 1))\n  )\n)\n\nr \u003c- POST(endpoint, body = json_data, encode = \"json\")\nstop_for_status(r)\ncontent(r, \"parsed\", \"text/html\")\n```\n\nHere is the example with Postman.\n\n![](./images/postman.png)\n\n\n## Performance\n\nYou can run SimpleTensorFlowServing with any WSGI server for better performance. We have benchmarked it against `TensorFlow Serving`. Find more details in [benchmark](./benchmark/).\n\nSTFS (Simple TensorFlow Serving) and TFS (TensorFlow Serving) have similar performance across different models. The vertical axis is inference latency (microseconds); lower is better.\n\n![](./images/benchmark_latency.jpeg)\n\nThen we test with `ab` using concurrent clients on CPU and GPU. `TensorFlow Serving` works better, especially with GPUs.\n\n![](./images/benchmark_concurrency.jpeg)\n\nFor the [simplest model](./benchmark/simplest_model/), each request costs only ~1.9 microseconds and one instance of Simple TensorFlow Serving can achieve 5000+ QPS. With a larger batch size, it can run inference on more than 1M instances per second.\n\n![](./images/benchmark_batch_size.jpeg)\n\n## How It Works\n\n1. `simple_tensorflow_serving` starts the HTTP server as a `flask` application.\n2. Load the TensorFlow models with the `tf.saved_model.loader` Python API.\n3. Construct the feed_dict data from the JSON body of the request.\n   ```\n   // Method: POST, Content-Type: application/json\n   {\n     \"model_version\": 1, // Optional\n     \"data\": {\n       \"keys\": [[1], [2]],\n       \"features\": [[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]]\n     }\n   }\n   ```\n4. Use the TensorFlow Python API to `sess.run()` with feed_dict data.\n5. 
To support multiple versions, it starts an independent thread to load models.\n6. For generated clients, it reads the user's model and renders code with [Jinja](http://jinja.pocoo.org/) templates.\n\n![](./images/architecture.jpeg)\n\n## Contribution\n\nFeel free to open an issue or send a pull request for this project. You are warmly welcome to add more clients in your languages to access TensorFlow models.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobegit3hub%2Fsimple_tensorflow_serving","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftobegit3hub%2Fsimple_tensorflow_serving","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobegit3hub%2Fsimple_tensorflow_serving/lists"}