{"id":16142962,"url":"https://github.com/zerohertz/yolo-serving-cookbook","last_synced_at":"2025-03-18T17:31:09.844Z","repository":{"id":203691229,"uuid":"708682315","full_name":"Zerohertz/yolo-serving-cookbook","owner":"Zerohertz","description":"📸 YOLO Serving Cookbook based on Triton Inference Server 📸","archived":true,"fork":false,"pushed_at":"2024-05-15T13:12:02.000Z","size":1562,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-12T05:02:34.012Z","etag":null,"topics":["docker","docker-compose","fastapi","gradio","k8s","kubernetes","mlops","model-serving","onnx","pytorch","triton-inference-server","yolo","yolov5"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Zerohertz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-23T07:09:59.000Z","updated_at":"2025-01-20T17:20:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"18774095-3eec-4bac-a08f-754488264b93","html_url":"https://github.com/Zerohertz/yolo-serving-cookbook","commit_stats":null,"previous_names":["zerohertz/yolo-serving-cookbook"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zerohertz%2Fyolo-serving-cookbook","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zerohertz%2Fyolo-serving-cookbook/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zerohertz%2Fyolo-serving-cookbook
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zerohertz%2Fyolo-serving-cookbook/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Zerohertz","download_url":"https://codeload.github.com/Zerohertz/yolo-serving-cookbook/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244269460,"owners_count":20426227,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","docker-compose","fastapi","gradio","k8s","kubernetes","mlops","model-serving","onnx","pytorch","triton-inference-server","yolo","yolov5"],"created_at":"2024-10-10T00:08:09.557Z","updated_at":"2025-03-18T17:31:09.411Z","avatar_url":"https://github.com/Zerohertz.png","language":"Python","readme":"\u003ch1 align = \"center\"\u003e\n    📸 YOLO Serving Cookbook 📸\n\u003c/h1\u003e\n\n\u003cp align = \"center\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Python-3766AB?style=flat-square\u0026logo=Python\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/OpenCV-5C3EE8?style=flat-square\u0026logo=OpenCV\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/FastAPI-009688?style=flat-square\u0026logo=FastAPI\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/Gradio-EE8332?style=flat-square\u0026logo=Openlayers\u0026logoColor=white\"/\u003e\n\u003c/p\u003e\n\u003cp align = \"center\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/ONNX-005CED?style=flat-square\u0026logo=ONNX\u0026logoColor=white\"/\u003e \u003cimg 
src=\"https://img.shields.io/badge/Triton%20Inference%20Server-76B900?style=flat-square\u0026logo=nvidia\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/Docker-2496ED?style=flat-square\u0026logo=Docker\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/Kubernetes-326CE5?style=flat-square\u0026logo=Kubernetes\u0026logoColor=white\"/\u003e \u003cimg src=\"https://img.shields.io/badge/Traefik Proxy-24A1C1?style=flat-square\u0026logo=Traefik Proxy\u0026logoColor=white\"/\u003e\n\u003c/p\u003e\n\n## [1. Docker](https://github.com/Zerohertz/YOLO-Serving/tree/1.Docker)\n\n\u003cdetails\u003e\n\u003csummary\u003e\nArchitecture\n\u003c/summary\u003e\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/16f71b10-e68a-4016-a87f-2a6fbb9946a9\" alt=\"Docker\" width=\"500\"/\u003e\n\u003c/div\u003e\n\u003c/details\u003e\n\n## [2. Docker Compose](https://github.com/Zerohertz/YOLO-Serving/tree/2.Docker-Compose)\n\n\u003cdetails\u003e\n\u003csummary\u003e\nArchitecture\n\u003c/summary\u003e\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/e243f0c8-4ace-4a86-96e4-067066047dab\" alt=\"Docker-Compose\" width=\"700\"/\u003e\n\u003c/div\u003e\n\u003c/details\u003e\n\n## 3. 
Kubernetes\n\n\u003cdetails\u003e\n\u003csummary\u003e\nArchitecture (without Ensemble)\n\u003c/summary\u003e\n\n\u003ctable align=\"center\"\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003eNumber of Replicas = 1\u003c/td\u003e\n\u003ctd align=\"center\"\u003eNumber of Replicas = 5\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/e619da5f-015d-4c4d-bb4e-a717c7e5395c\" alt=\"Kubernetes-Rep=1\"/\u003e\u003c/td\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/571f781a-5842-45e9-9652-949c65c34efd\" alt=\"Kubernetes-Rep=5\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\nArchitecture (with Ensemble)\n\u003c/summary\u003e\n\n\u003ctable align=\"center\"\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003eNumber of Replicas = 1\u003c/td\u003e\n\u003ctd align=\"center\"\u003eNumber of Replicas = 5\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/0292b7a6-3842-40b1-8b8c-c07ce2b2f0c9\" alt=\"Kubernetes-Ensemble-Rep=1\"/\u003e\u003c/td\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"https://github.com/Zerohertz/Zerohertz/assets/42334717/ddba3515-6382-4b1c-9ab0-3e43dca83921\" alt=\"Kubernetes-Ensemble-Rep=5\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n### Experimental Setup\n\n+ Server\n  + `Sync`: synchronous request handling in FastAPI\n  + `Async`: asynchronous request handling in FastAPI\n  + `Rep`: number of replicas of `fastapi` and `triton-inference-server`\n  + `Ensemble`: pre-processing, post-processing, and visualization run inside `triton-inference-server` using an [ensemble](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#ensemble-models) (`fastapi` runs asynchronously)\n+ Client (100 calls to FastAPI per experiment, 10 experiments)\n  + `Serial`: sequential calls in a `for` loop\n  + 
`Concurrency`: concurrent calls using `ThreadPoolExecutor`\n  + `Random`: calls using `ThreadPoolExecutor`, each sent after a random delay of 0 to 20 seconds\n\n### Results\n\n\u003cdiv align=\"right\"\u003eUnit: \u003ccode\u003e[Sec]\u003c/code\u003e\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n|Server Arch.|Mean(Serial)|End(Serial)|Mean(Concurrency)|End(Concurrency)|Mean(Random)|End(Random)|\n|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n|[Sync\u0026Rep=1](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-1.Sync)|0.69|78.01|41.93|129.61|40.05|128.63|\n|[Sync\u0026Rep=5](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-1.Sync)|0.60|68.99|25.57|61.38|26.88|81.69|\n|[Async\u0026Rep=1](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-2.Async)|0.68|77.02|0.80|82.22|0.78|80.34|\n|[Async\u0026Rep=1-5](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-2.Async)|0.61|69.07|0.60|62.11|-|-|\n|[Async\u0026Rep=5](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-2.Async)|0.62|69.77|1.84|39.77|1.91|41.84|\n|[Ensemble\u0026Rep=1](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-3.Ensemble)|0.70|78.02|0.77|78.50|-|-|\n|[Ensemble\u0026Rep=5](https://github.com/Zerohertz/YOLO-Serving/tree/3.Kubernetes-3.Ensemble)|0.66|74.52|1.90|42.03|-|-|\n\n\u003c/div\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\nFigures\n\u003c/summary\u003e\n\n\u003ctable align=\"center\"\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/EACH-SERIAL.png\" alt=\"EACH-SERIAL\"/\u003e\u003c/td\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/TOTAL-SERIAL.png\" alt=\"TOTAL-SERIAL\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003ctable align=\"center\"\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/EACH-CONCURRENCY.png\" alt=\"EACH-CONCURRENCY\"/\u003e\u003c/td\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/EACH-CONCURRENCY-ASYNC.png\" 
alt=\"EACH-CONCURRENCY-ASYNC\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd colspan=\"2\" align=\"center\"\u003e\u003cimg src=\"figures/TOTAL-CONCURRENCY.png\" alt=\"TOTAL-CONCURRENCY\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003ctable align=\"center\"\u003e\n\u003ctr\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/EACH-RANDOM.png\" alt=\"EACH-RANDOM\"/\u003e\u003c/td\u003e\n\u003ctd align=\"center\"\u003e\u003cimg src=\"figures/TOTAL-RANDOM.png\" alt=\"TOTAL-RANDOM\"/\u003e\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n### Discussion\n\n#### Sync, Async, Ensemble\n\n\u003cdiv align=\"right\"\u003eUnit: \u003ccode\u003e[Sec]\u003c/code\u003e\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n|Server Arch.|Mean(Serial)|End(Serial)|Mean(Concurrency)|End(Concurrency)|Mean(Random)|End(Random)|\n|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n|Sync|0.647|73.499|33.752|95.496|33.460|105.160|\n|Async|0.652|73.395|1.320|60.991|1.345|61.094|\n|Ensemble|0.680|76.270|1.332|60.269|-|-|\n\n\u003c/div\u003e\n\nFor serial calls, there is no meaningful difference between the sync and async approaches.\n\nIn total completion time (`End`), however, the async approach was about 36.13% faster than the sync approach for concurrent calls and about 41.90% faster for random calls.\n\nThe ensemble approach, by contrast, showed no major benefit, though this may be a limitation of this experiment. 
(resources, data scale, ...)\n\n\u003cdetails\u003e\n\u003csummary\u003e\nError under the \u003ccode\u003eRandom\u003c/code\u003e condition with FastAPI endpoints defined with \u003ccode\u003easync def\u003c/code\u003e\n\u003c/summary\u003e\n\n```python\nTraceback (most recent call last):\n  File \"anaconda3\\lib\\site-packages\\requests\\models.py\", line 972, in json\n    return complexjson.loads(self.text, **kwargs)\n  File \"anaconda3\\lib\\site-packages\\simplejson\\__init__.py\", line 514, in loads\n    return _default_decoder.decode(s)\n  File \"anaconda3\\lib\\site-packages\\simplejson\\decoder.py\", line 386, in decode\n    obj, end = self.raw_decode(s)\n  File \"anaconda3\\lib\\site-packages\\simplejson\\decoder.py\", line 416, in raw_decode\n    return self.scan_once(s, idx=_w(s, idx).end())\nsimplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"Downloads\\curl.py\", line 70, in \u003cmodule\u003e\n    main(i)\n  File \"Downloads\\curl.py\", line 53, in main\n    responses = list(\n  File \"anaconda3\\lib\\concurrent\\futures\\_base.py\", line 609, in result_iterator\n    yield fs.pop().result()\n  File \"anaconda3\\lib\\concurrent\\futures\\_base.py\", line 439, in result\n    return self.__get_result()\n  File \"anaconda3\\lib\\concurrent\\futures\\_base.py\", line 391, in __get_result\n    raise self._exception\n  File \"anaconda3\\lib\\concurrent\\futures\\thread.py\", line 58, in run\n    result = self.fn(*self.args, **self.kwargs)\n  File \"Downloads\\curl.py\", line 24, in send_request\n    res = response.json()\n  File \"anaconda3\\lib\\site-packages\\requests\\models.py\", line 976, in json\n    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)\nrequests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)\n```\n\nThis error occurs under the `Random` condition; strangely, it does not occur under the `Concurrency` condition.\n\nI checked the logs of every pod but found nothing related.\n\n
The error was probably caused while calling the `triton-inference-server` service asynchronously.\n\n\u003e ChatGPT's answer\n\nWhen an error occurs only for a particular request pattern while using FastAPI, there can be a variety of causes. Here are some possibilities:\n\n1. Resource limits: the connection count or other resource limits of the server or of the database in use may be reached. When 100 requests are sent at once, there may be no problem if each request is processed quickly and its resources are released quickly. However, when 100 requests are sent continuously over 20 seconds, resources may not be released in the meantime, causing problems.\n2. Rate limiting: some systems use rate limiting to prevent handling too many requests in a short time. As a result, sending requests steadily over 20 seconds may hit the limit.\n3. Memory issues: the server's memory usage can grow as requests accumulate. If memory runs short, errors can occur while handling requests.\n4. Duplicate request handling: when requests are sent continuously over 20 seconds, the result or state of an earlier request can affect the handling of the next one. Duplicate requests or state changes may be the cause.\n5. Logging or monitoring tools: if logs are written or monitoring tools are used while handling requests, a higher request rate can add extra load.\n\n+ Solutions\n  + Logging: check the logs of the error to identify the specific cause.\n  + Profiling: monitor the server's CPU, memory, and network usage to identify bottlenecks or resource shortages.\n  + Testing: test various scenarios for sending requests to the server to identify the cause.\n\nThese checks can help identify the cause and take appropriate action.\n\n\u003c/details\u003e\n\n\n#### Replicas\n\n\u003cdiv align=\"right\"\u003eUnit: \u003ccode\u003e[Sec]\u003c/code\u003e\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n|Server Arch.|Mean(Serial)|End(Serial)|Mean(Concurrency)|End(Concurrency)|Mean(Random)|End(Random)|\n|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n|Rep=1|0.691|77.682|14.501|96.777|20.415|104.487|\n|Rep=5|0.629|71.094|9.767|47.726|14.391|61.767|\n\n\u003c/div\u003e\n\nIncreasing the number of replicas was confirmed to speed up API responses. 
([Pods can be configured to talk to the Service, and communication to the Service is automatically load-balanced out to some pod that is a member of the Service.](https://kubernetes.io/ko/docs/tutorials/services/connect-applications-service/#%EC%84%9C%EB%B9%84%EC%8A%A4-%EC%83%9D%EC%84%B1%ED%95%98%EA%B8%B0))\n\nThe improvement is especially large for concurrent calls.\n\n\u003cdetails\u003e\n\u003csummary\u003e\n\u003ccode\u003eWORKER TIMEOUT\u003c/code\u003e\n\u003c/summary\u003e\n\nAn error that did not occur with 1 `fastapi` replica and 5 `triton-inference-server` replicas appeared, as shown below, with 5 `fastapi` replicas and 5 `triton-inference-server` replicas.\n\nThis was resolved by adding `\"--timeout\", \"120\"` to the `Dockerfile`.\n\n```bash\n[1] [CRITICAL] WORKER TIMEOUT (pid:8)\n[1] [WARNING] Worker with pid 8 was terminated due to signal 6\n[379] [INFO] Booting worker with pid: 379\n[379] [INFO] Started server process [379]\n[379] [INFO] Waiting for application startup.\n[379] [INFO] Application startup complete.\n```\n\n\u003c/details\u003e\n\n#### Autoscaling\n\nWith `HPA`, when 100 requests arrive at once, they all reach the single `fastapi` pod before any new replicas are created, so autoscaling has no effect.\n\nAutoscaling therefore needs a new `metrics` source other than `Resource` to work smoothly.\n\n\u003cdetails\u003e\n\u003csummary\u003e\nExample: \u003ccode\u003ehpa.yaml\u003c/code\u003e\n\u003c/summary\u003e\n\n```yaml\napiVersion: autoscaling/v2beta2\nkind: HorizontalPodAutoscaler\nmetadata:\n  name: triton-inference-server-hpa\nspec:\n  scaleTargetRef:\n    apiVersion: apps/v1\n    kind: Deployment\n    name: triton-inference-server\n  minReplicas: 1\n  maxReplicas: 5\n  metrics:\n    - type: Resource\n      resource:\n        name: cpu\n        target:\n          type: Utilization\n          averageUtilization: 80\n    - type: Resource\n      resource:\n        name: memory\n        target:\n          type: Utilization\n          averageUtilization: 80\n---\napiVersion: autoscaling/v2beta2\nkind: HorizontalPodAutoscaler\nmetadata:\n  name: fastapi-hpa\nspec:\n  scaleTargetRef:\n    apiVersion: apps/v1\n    kind: Deployment\n    name: fastapi\n  minReplicas: 1\n  maxReplicas: 
5\n  metrics:\n    - type: Resource\n      resource:\n        name: cpu\n        target:\n          type: Utilization\n          averageUtilization: 80\n    - type: Resource\n      resource:\n        name: memory\n        target:\n          type: Utilization\n          averageUtilization: 80\n```\n\n\u003c/details\u003e\n\n### [3.4. Gradio](https://github.com/Zerohertz/YOLO-Serving-Cookbook/tree/3.Kubernetes-4.Gradio)\n\n\n\u003cdetails\u003e\n\u003csummary\u003e\nArchitecture\n\u003c/summary\u003e\n\u003cdiv align=\"center\"\u003e\n\n![](https://github.com/Zerohertz/YOLO-Serving-Cookbook/assets/42334717/fa647b85-9716-4fd8-933a-bb92ebbda62f)\n\n\u003c/div\u003e\n\u003c/details\u003e\n\n\n\u003cdiv align=\"center\"\u003e\n\n![Gradio](https://github.com/Zerohertz/Zerohertz/assets/42334717/816ec0eb-7ba4-49d4-8302-6a720aba91d4)\n\n\u003c/div\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzerohertz%2Fyolo-serving-cookbook","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzerohertz%2Fyolo-serving-cookbook","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzerohertz%2Fyolo-serving-cookbook/lists"}