{"id":14971631,"url":"https://github.com/notai-tech/fastdeploy","last_synced_at":"2025-04-13T04:16:09.102Z","repository":{"id":40616053,"uuid":"254351073","full_name":"notAI-tech/fastDeploy","owner":"notAI-tech","description":"Deploy DL/ ML inference pipelines with minimal extra code.","archived":false,"fork":false,"pushed_at":"2024-11-20T06:25:30.000Z","size":16425,"stargazers_count":97,"open_issues_count":0,"forks_count":17,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-13T04:15:55.319Z","etag":null,"topics":["deep-learning","docker","falcon","gevent","gunicorn","http-server","inference-server","model-deployment","model-serving","python","pytorch","serving","streaming-audio","tensorflow-serving","tf-serving","torchserve","triton","triton-inference-server","triton-server","websocket"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/notAI-tech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-09T11:20:16.000Z","updated_at":"2025-03-06T22:44:06.000Z","dependencies_parsed_at":"2023-01-23T20:01:19.928Z","dependency_job_id":"1ab26e27-21b8-45b1-a8fe-b29da0a745bc","html_url":"https://github.com/notAI-tech/fastDeploy","commit_stats":{"total_commits":430,"total_committers":5,"mean_commits":86.0,"dds":"0.32558139534883723","last_synced_commit":"93adace2297ccbc331fe82c2e94819919fe1f4aa"},"previous_names":[],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/notAI-tech%2FfastDeploy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/notAI-tech%2FfastDeploy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/notAI-tech%2FfastDeploy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/notAI-tech%2FfastDeploy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/notAI-tech","download_url":"https://codeload.github.com/notAI-tech/fastDeploy/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248661717,"owners_count":21141451,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","docker","falcon","gevent","gunicorn","http-server","inference-server","model-deployment","model-serving","python","pytorch","serving","streaming-audio","tensorflow-serving","tf-serving","torchserve","triton","triton-inference-server","triton-server","websocket"],"created_at":"2024-09-24T13:45:34.824Z","updated_at":"2025-04-13T04:16:09.080Z","avatar_url":"https://github.com/notAI-tech.png","language":"Python","readme":"## fastDeploy\n#### easy and performant 

- Deploy any Python inference pipeline with minimal extra code
- Auto-batching of concurrent inputs is enabled out of the box
- No changes to inference code (unlike tf-serving etc.); the entire pipeline runs as-is
- Prometheus metrics (OpenMetrics) are exposed for monitoring
- Auto-generates clean Dockerfiles and exposes health-check APIs friendly to Kubernetes and autoscaling
- Sequentially chained inference pipelines are supported out of the box
- Can be queried from any language via easy-to-use REST APIs
- Easy-to-understand, simple codebase (a plain consumer-producer architecture)


#### Installation:
```bash
pip install --upgrade fastdeploy fdclient
# fdclient is optional, only needed if you want to use the Python client
```

#### [CLI explained](https://github.com/notAI-tech/fastDeploy/blob/master/cli.md)

#### Start fastDeploy server on a recipe:
```bash
# Invoke fastdeploy
python -m fastdeploy --help
# or
fastdeploy --help

# Start the prediction "loop" for recipe "echo"
fastdeploy --loop --recipe recipes/echo

# Start the REST APIs for recipe "echo"
fastdeploy --rest --recipe recipes/echo
```

#### Send a request and get predictions:

- [Python client usage](https://github.com/notAI-tech/fastDeploy/blob/master/clients/python/README.md) (see also the minimal client sketch at the end of this README)

- [curl usage]()

- [Nodejs client usage]()

#### Auto-generate Dockerfile and build Docker image:
```bash
# Writes the Dockerfile for recipe "echo"
# and builds the Docker image if Docker is installed
# base image defaults to python:3.8-slim
fastdeploy --build --recipe recipes/echo

# Run the Docker image
docker run -it -p 8080:8080 fastdeploy_echo
```

#### Serving your model (recipe):

- [Writing your model/pipeline's recipe](https://github.com/notAI-tech/fastDeploy/blob/master/recipe.md) (a minimal echo-style sketch appears at the end of this README)


### Where to use fastDeploy?

- To deploy any model that is not ultra-lightweight, i.e. most DL models with >50ms inference time per example
- If the model/pipeline benefits from batch inference, fastDeploy is a perfect fit
- If you receive individual inputs (e.g., a user's search query to be vectorized, or an image to be classified): requests arriving at close intervals are batched together and sent to the model as a single batch
- Perfect for internal micro-services that separate your model and pre-/post-processing from business logic
- Since the prediction loop and the REST endpoints are separate processes connected via a SQLite-backed queue, they can be scaled independently


### Where not to use fastDeploy?
- Models that are not CPU/GPU heavy and are better off running in parallel rather than in batches
- If your predictor calls an external API or uploads to S3, etc. in a blocking way
- I/O-heavy, non-batching use cases (e.g., querying Elasticsearch or a database per input)
- For these cases it is better to serve directly from REST API code (instead of the consumer-producer mechanism) so that high concurrency can be achieved
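
#### Example: a minimal echo-style recipe

As a rough sketch of the recipe contract described in [recipe.md](https://github.com/notAI-tech/fastDeploy/blob/master/recipe.md): a recipe's `predictor.py` receives a batch of inputs and returns one output per input. The exact file layout and function signature are defined in the recipe docs; treat the snippet below as an illustration of the batch-in/batch-out contract, not the authoritative spec.

```python
# predictor.py -- minimal echo-style sketch; exact contract per recipe.md
def predictor(inputs=[], batch_size=1):
    # Concurrent requests arriving at close intervals are auto-batched
    # by fastDeploy into a single `inputs` list before reaching here.
    # Return exactly one output per input, in the same order.
    return inputs
```

A real recipe would load its model once at import time and run batched inference inside `predictor`, so that the auto-batching actually pays off.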
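#### Example: querying a running server from Python

And a minimal sketch of querying a running fastDeploy server with the optional `fdclient` package. The `FDClient` class and `infer` call are assumptions based on the Python client README linked above; check that README for the exact API. The port matches the Docker example.

```python
# client.py -- minimal sketch using the optional Python client (fdclient);
# FDClient and infer() are assumed from the client README, not verified here.
from fdclient import FDClient

client = FDClient("http://localhost:8080")

# Inputs are sent as a list; the server may batch them together with other
# concurrent requests before running the recipe's predictor.
print(client.infer(["hello", "world"]))
```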