{"id":13857070,"url":"https://github.com/iitzco/tfserve","last_synced_at":"2025-04-10T20:10:50.998Z","repository":{"id":57474931,"uuid":"147363460","full_name":"iitzco/tfserve","owner":"iitzco","description":"Serve TF models simple and easy as an HTTP API","archived":false,"fork":false,"pushed_at":"2018-10-29T16:04:23.000Z","size":3155,"stargazers_count":35,"open_issues_count":2,"forks_count":10,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-24T17:52:47.380Z","etag":null,"topics":["artificial-intelligence","deep-learning","http-api","http-server","neural-network","tensorflow","tensorflow-models","tensorflow-serving"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iitzco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-09-04T14:57:57.000Z","updated_at":"2022-12-02T07:44:11.000Z","dependencies_parsed_at":"2022-09-10T04:05:04.888Z","dependency_job_id":null,"html_url":"https://github.com/iitzco/tfserve","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitzco%2Ftfserve","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitzco%2Ftfserve/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitzco%2Ftfserve/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iitzco%2Ftfserve/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iitzco","download_url":"https://codeload.github.com/iitzco/tfserve/tar.gz/refs/heads/
master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248288357,"owners_count":21078903,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","http-api","http-server","neural-network","tensorflow","tensorflow-models","tensorflow-serving"],"created_at":"2024-08-05T03:01:24.948Z","updated_at":"2025-04-10T20:10:50.956Z","avatar_url":"https://github.com/iitzco.png","language":"Python","readme":"# TFServe\r\n\r\n[![Downloads](https://pepy.tech/badge/tfserve)](https://pepy.tech/project/tfserve)  [![PyPI version](https://badge.fury.io/py/tfserve.svg)](https://badge.fury.io/py/tfserve)\r\n\r\nTFServe is a framework for serving TensorFlow models simply and easily as an HTTP API server. It's built on top of [Werkzeug](http://werkzeug.pocoo.org/).\r\n\r\n## How to install\r\n\r\n```bash\r\n$ pip install tfserve\r\n```\r\n\r\nAfter installing `tfserve`, install either `tensorflow` or `tensorflow-gpu` (the latter if you have a GPU available).\r\n\r\n```bash\r\n$ pip install tensorflow\r\n```\r\nor\r\n```bash\r\n$ pip install tensorflow-gpu\r\n```\r\n\r\n## How to use\r\n\r\n### Python API\r\n\r\nYou will need 5 parts:\r\n\r\n1. **Model**: it can be a `.pb` file or a model directory containing ckpt files.\r\n2. **Input tensor names**: names of the input tensors of the graph.\r\n3. **Output tensor names**: names of the output tensors of the graph.\r\n4. **`encode`**: Python function that receives the request body data and outputs a `dict` mapping input tensor names to input numpy values.\r\n5. 
**`decode`**: Python function that receives a `dict` mapping output tensor names to output numpy values and returns the HTTP response.\r\n\r\nFollow the example to learn how to combine these parts...\r\n\r\n#### Example\r\n\r\nDeploy an image classification service that receives a binary jpg image and returns the class of the object found in the image alongside its probability.\r\n\r\n```python\r\nimport tempfile\r\n\r\nimport numpy as np\r\nfrom PIL import Image\r\n\r\n# 1. Model: trained mobilenet on ImageNet that can be downloaded from\r\n#           https://storage.googleapis.com/mobilenet_v2/checkpoints/mobilenet_v2_1.4_224.tgz\r\nMODEL_PATH = \"mobilenet_v2_1.4_224/mobilenet_v2_1.4_224_frozen.pb\"\r\n\r\n# 2. Input tensor names:\r\nINPUT_TENSORS = [\"import/input:0\"]\r\n\r\n# 3. Output tensor names:\r\nOUTPUT_TENSORS = [\"import/MobilenetV2/Predictions/Softmax:0\"]\r\n\r\n# 4. encode function: Receives raw jpg image as request data. Returns dict\r\n#                     mapping import/input:0 to numpy value.\r\ndef encode(request_data):\r\n    with tempfile.NamedTemporaryFile(mode=\"wb\", suffix=\".jpg\") as f:\r\n        f.write(request_data)\r\n        f.flush()  # ensure the image is on disk before reopening it by name\r\n        # Model receives 224x224 normalized RGB image.\r\n        img = Image.open(f.name).resize((224, 224))\r\n        img = np.asarray(img) / 255.\r\n\r\n    return {INPUT_TENSORS[0]: img}\r\n\r\n# 5. decode function: Receives `dict` mapping import/MobilenetV2/Predictions/Softmax:0 to\r\n#                     numpy value and builds a dict for the json response.\r\ndef decode(outputs):\r\n    p = outputs[OUTPUT_TENSORS[0]]  # 1001 vector with probabilities for each class.\r\n    index = np.argmax(p)\r\n    # index_to_class_map maps each class index to its ImageNet label (not shown here).\r\n    return {\"class\": index_to_class_map[index-1], \"prob\": float(p[index])}\r\n```\r\n\r\nThat's it! 
Now create a `TFServeApp` object and run it!\r\n\r\n```python\r\nfrom tfserve import TFServeApp\r\n\r\napp = TFServeApp(MODEL_PATH, INPUT_TENSORS, OUTPUT_TENSORS, encode, decode)\r\napp.run('127.0.0.1', 5000)  # Host and port where the server will be running\r\n```\r\n\r\nSee `client.py` for a full example.\r\n\r\n#### How to consume the server\r\n\r\n![img](imgs/screen.gif)\r\n\r\n\u003e The server supports only the `POST` method to `/`, with the input information as part of the request body.\r\n\r\nThe request body is processed by the `encode` function to produce the `feed_dict` object that is passed to the graph. The graph output is processed by the `decode` function, and the server returns whatever `decode` returns.\r\n\r\n### CLI\r\n\r\n`tfserve` also provides a CLI program with built-in encode/decode handlers:\r\n\r\n```bash\r\ntfserve -m PATH [-i INPUTS] [-o OUTPUTS] [-h HANDLER] [-b] [-H HOST] [-p PORT]\r\n\r\n  -m PATH, --model PATH\r\n                        path to pb file or directory containing checkpoint\r\n  -i INPUTS, --inputs INPUTS\r\n                        a comma separated list of input tensors\r\n  -o OUTPUTS, --outputs OUTPUTS\r\n                        a comma separated list of output tensors\r\n  -h HANDLER, --handler HANDLER\r\n                        encode/decode handler (default is 'json')\r\n  -b, --batch           process multiple inputs (default is to process\r\n                        one input per request)\r\n  -H HOST, --host HOST  host interface to bind to (0.0.0.0)\r\n  -p PORT, --port PORT  port to listen on (5000)\r\n```\r\n\r\n#### Example\r\n\r\n```bash\r\n$ tfserve -m models/graph.pb -i x:0 -o out:0 -h json -H localhost\r\n```\r\n\r\nRun `tfserve` with the `models/graph.pb` model, which takes as input a tensor named `x:0` (dimension: [?,5]) and outputs a tensor named `out:0`. 
The server will run on http://localhost:5000/ and will receive `POST` requests to `/`.\r\n\r\nBy using the **json handler**, you can provide the input data as a `json` object in the request body:\r\n\r\n```json\r\n{\r\n  \"x:0\": [1,1,3,4,5]\r\n}\r\n```\r\n\r\nYou will receive a `json` output object such as:\r\n\r\n```json\r\n{\r\n  \"out:0\": 0.48\r\n}\r\n```\r\n\r\n#### More information about CLI\r\n\r\nRun:\r\n\r\n```bash\r\n$ tfserve --help\r\n```\r\n\r\n## Help\r\n\r\n* **What if I don't know the tensor names?**\r\n\r\n\u003e You can use the `tfserve.helper.estimate_io_tensors(model_path)` function to get a list of possible input/output tensor names. Also, you can use the CLI by running: `tfserve -m [model_path] --help-model`\r\n\r\n* **What if I want to run multiple inferences at the same time?**\r\n\r\n\u003e You can use `batch=True` when building `tfserve.TFServeApp`. You will then need to handle the batch dimension yourself in the `encode` and `decode` functions.\r\n\u003e Also, if using the CLI, just add the `--batch` flag.\r\n\r\n\r\n## Limitation\r\n\r\n\u003e It only works with one-to-one models, that is, models that only need to run the graph once to produce an inference.\r\n\u003e Other inference architectures will be supported soon. Help is appreciated!\r\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiitzco%2Ftfserve","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiitzco%2Ftfserve","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiitzco%2Ftfserve/lists"}