{"id":15129798,"url":"https://github.com/bmw-innovationlab/bmw-tensorflow-inference-api-cpu","last_synced_at":"2025-10-23T06:30:29.940Z","repository":{"id":36522519,"uuid":"227433938","full_name":"BMW-InnovationLab/BMW-TensorFlow-Inference-API-CPU","owner":"BMW-InnovationLab","description":"This is a repository for an object detection inference API using the Tensorflow framework.","archived":false,"fork":false,"pushed_at":"2022-06-28T13:37:59.000Z","size":10447,"stargazers_count":185,"open_issues_count":1,"forks_count":47,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-01-15T20:15:19.109Z","etag":null,"topics":["api","bounding-boxes","computer-vision","computervision","cpu","deep-learning","deeplearning","detection-inference-api","docker","docker-ce","docker-container","docker-image","inference","inference-engine","inference-server","object-detection","predictions","rest-api","tensorflow","tensorflow-framework"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BMW-InnovationLab.png","metadata":{"files":{"readme":"README-docker_swarm.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-12-11T18:31:52.000Z","updated_at":"2024-10-29T22:29:37.000Z","dependencies_parsed_at":"2022-08-08T15:17:12.383Z","dependency_job_id":null,"html_url":"https://github.com/BMW-InnovationLab/BMW-TensorFlow-Inference-API-CPU","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BMW-InnovationLab%2FBMW-TensorFlow-Inference-API-CPU","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/BMW-InnovationLab%2FBMW-TensorFlow-Inference-API-CPU/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BMW-InnovationLab%2FBMW-TensorFlow-Inference-API-CPU/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BMW-InnovationLab%2FBMW-TensorFlow-Inference-API-CPU/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BMW-InnovationLab","download_url":"https://codeload.github.com/BMW-InnovationLab/BMW-TensorFlow-Inference-API-CPU/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237784876,"owners_count":19365948,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","bounding-boxes","computer-vision","computervision","cpu","deep-learning","deeplearning","detection-inference-api","docker","docker-ce","docker-container","docker-image","inference","inference-engine","inference-server","object-detection","predictions","rest-api","tensorflow","tensorflow-framework"],"created_at":"2024-09-26T02:20:31.915Z","updated_at":"2025-10-23T06:30:24.140Z","avatar_url":"https://github.com/BMW-InnovationLab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tensorflow CPU Inference API For Windows and Linux with docker swarm\nPlease use **docker swarm** only if you need to:\n\n* Provide redundancy in terms of API containers: In case a container went down, the incoming requests will be redirected to another running instance.\n\n* Coordinate between the containers: Swarm will orchestrate 
between the APIs and choose one of them to listen to the incoming request.\n\n* Scale up the inference service to get faster predictions, especially when there is traffic on the service.\n\n## Run the docker container\n\nDocker swarm can scale the API up to multiple replicas and can be used on one or multiple hosts (Linux users only). In both cases, a docker swarm setup is required on all hosts.\n\n#### Docker swarm setup\n\n1- Initialize Swarm:\n\n```sh\ndocker swarm init\n```\n\n2- On the manager host, open the cpu-inference.yaml file and specify the number of replicas needed. If you are using multiple hosts (see the With multiple hosts section), the replicas will be distributed across all hosts.\n\n```yaml\nversion: \"3\"\n\nservices:\n  api:\n    ports:\n      - \"4343:4343\"\n    image: tensorflow_inference_api_cpu\n    volumes:\n      - \"/mnt/models:/models\"\n    deploy:\n      replicas: 1\n      update_config:\n        parallelism: 2\n        delay: 10s\n      restart_policy:\n        condition: on-failure\n```\n\n**Notes about cpu-inference.yaml:**\n\n* The part of the volumes field to the left of \":\" should be an absolute path. It can be changed by the user and represents the models directory on your operating system.\n* The part of the volumes field to the right, \":/models\", should never be changed.\n\n#### With one host\n\nDeploy the API:\n\n```sh\ndocker stack deploy -c cpu-inference.yaml tensorflow-cpu\n```\n\n![onehost](./docs/tcpu.png)\n\n#### With multiple hosts (Linux users only)\n\n1- **Make sure hosts are reachable on the same network**. 
\n\n2- Choose a host to be the manager and run the following command on it to generate a token so the other hosts can join:\n\n```sh\ndocker swarm join-token worker\n```\n\nA command will appear in your terminal. Copy it and run it on the other hosts, as shown in the image below.\n\n3- Deploy your application using:\n\n```sh\ndocker stack deploy -c cpu-inference.yaml tensorflow-cpu\n```\n\n![multhost](./docs/tcpu2.png)\n\n#### Useful Commands\n\n1- To scale the service up to 4 replicas, for example, use this command:\n\n```sh\ndocker service scale tensorflow-cpu_api=4\n```\n\n2- To check the available workers:\n\n```sh\ndocker node ls\n```\n\n3- To check on which node the container is running:\n\n```sh\ndocker service ps tensorflow-cpu_api\n```\n\n4- To check the number of replicas:\n\n```sh\ndocker service ls\n```\n\n## Benchmarking\n\nHere are two graphs showing the prediction time for different numbers of simultaneous requests.\n\n\n![CPU 20 req](./docs/TCPU20req.png)\n\n\n![CPU 40 req](./docs/TCPU40req.png)\n\n\nBoth graphs show the same result regardless of how many requests are received at the same time. When we increase the number of workers (hosts), we are able to speed up inference by at least a factor of 2. 
For example, the last column shows that we were able to process 40 requests in:\n\n- 17.5 seconds with 20 replicas on 1 machine\n- 8.8 seconds with 20 replicas on each of the 2 machines\n\nMoreover, if one of the machines goes down, the others remain ready to receive requests.\n\nFinally, since we are predicting on CPU, adding more replicas does not necessarily mean faster prediction: 4 containers were faster than 20.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbmw-innovationlab%2Fbmw-tensorflow-inference-api-cpu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbmw-innovationlab%2Fbmw-tensorflow-inference-api-cpu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbmw-innovationlab%2Fbmw-tensorflow-inference-api-cpu/lists"}