{"id":19747988,"url":"https://github.com/tensorchord/modelz-py","last_synced_at":"2025-10-18T21:20:36.228Z","repository":{"id":142843263,"uuid":"602482456","full_name":"tensorchord/modelz-py","owner":"tensorchord","description":"Python SDK and CLI for modelz.ai, which is a developer-first platform for prototyping and deploying machine learning models. ","archived":false,"fork":false,"pushed_at":"2023-10-12T08:22:00.000Z","size":204,"stargazers_count":7,"open_issues_count":5,"forks_count":6,"subscribers_count":6,"default_branch":"main","last_synced_at":"2024-02-25T12:34:21.361Z","etag":null,"topics":["inference","llm","machine-learning","serverless","serverless-inference"],"latest_commit_sha":null,"homepage":"https://tensorchord.github.io/modelz-py/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorchord.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-02-16T09:59:05.000Z","updated_at":"2023-09-03T17:19:17.000Z","dependencies_parsed_at":null,"dependency_job_id":"e5a26503-f02a-444b-b349-dc58eb2daae3","html_url":"https://github.com/tensorchord/modelz-py","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorchord%2Fmodelz-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorchord%2Fmodelz-py/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorchord%2Fmodelz-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/t
ensorchord%2Fmodelz-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorchord","download_url":"https://codeload.github.com/tensorchord/modelz-py/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224203383,"owners_count":17272939,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["inference","llm","machine-learning","serverless","serverless-inference"],"created_at":"2024-11-12T02:19:42.145Z","updated_at":"2025-10-18T21:20:36.164Z","avatar_url":"https://github.com/tensorchord.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Modelz Python SDK\n\n[Docs](https://tensorchord.github.io/modelz-py/) | [Templates](https://cloud.modelz.ai/templates) | [ModelZ](https://modelz.ai) | [ModelZ Docs](https://docs.modelz.ai/)\n\n[ModelZ](https://modelz.ai) is an MLOps platform that lets you deploy serverless instances of machine learning models, such as Stable Diffusion, Dolly, and ImageBind, from packaged Docker images.\n\nA `Deployment` is an instance of an ML service deployed on `ModelZ`. You can create one and then send `inference` requests to it.\n\n`Templates` are preset Docker images for `deployments` of widely used ML models. Official templates are built and maintained by `ModelZ` developers, and you can also define your own `template` and `deployment`.\n\nThe Python SDK is designed for CRUD operations on your `deployments` and for sending `inference` requests to them. 
It is an alternative to the ModelZ WebUI that fits more naturally into CI/CD pipelines and model development workflows.\n\n## Install\n\n```shell\npip install modelz-py\n```\n\n## CLI usage\n\nThe CLI currently supports these functions:\n\n- create/update/list/delete deployments\n- make inference to deployments\n- get metric information of any deployment\n\nThese functions will be supported in the future:\n\n- build image and push to registry\n\n**See [CLI Docs](https://tensorchord.github.io/modelz-py/cli.html) for all usages.**\n\n## Example\n\n### Create and infer to ModelZ deployment at terminal\n\n#### Step 1: Create deployment\nFirst, create a deployment on the ModelZ platform. This example uses the `Stable Diffusion` image.\nFor more predefined images, see our [templates](https://cloud.modelz.ai/templates).\n\nAfter registering, you can get your ModelZ `API Key` and `User ID` from the [settings page](https://cloud.modelz.ai/settings).\n\nModelZ supports these types of images:\n- DockerHub images: these start with `docker.io/...`; you can build one yourself and upload it to DockerHub.\n- Google Cloud Registry images: these start with `xxx-docker.pkg.dev/...`; they are maintained by ModelZ developers, and you can find them in our [Templates](https://cloud.modelz.ai/templates).\n\n```shell\nexport MODELZ_API_KEY=mzi-1234567890987654321\nexport MODELZ_USER=00000000-1111-1111-1111-000000000000\nmodelz deployment create \\\n--image us-central1-docker.pkg.dev/nth-guide-378813/modelzai/mosec-stable-diffusion:23.04.1 \\\n--server-resource nvidia-tesla-t4-4c-16g \\\n--framework mosec \\\n--name stable-diffusion-mosec\n```\nThe result might look like this:\n```json\n{\n  \"spec\": {\n    \"deployment_resource\": {},\n    \"deployment_source\": {\n      \"docker\": {\n        \"image\": \"us-central1-docker.pkg.dev/nth-guide-378813/modelzai/mosec-stable-diffusion:23.04.1\"\n      }\n    },\n    \"framework\": \"mosec\",\n    \"http_probe_path\": \"/\",\n    \"id\": 
\"0a93636b-5ed3-4abd-8fac-8a7c5a4026c9\",\n    \"image_config\": {\n      \"enable_cache_optimize\": false\n    },\n    \"max_replicas\": 1,\n    \"min_replicas\": 0,\n    \"name\": \"stable-diffusion-mosec\",\n    \"server_resource\": \"nvidia-tesla-t4-4c-16g\",\n    \"startup_duration\": 300,\n    \"target_load\": 10,\n    \"zero_duration\": 300\n  },\n  \"status\": {\n    \"available_replicas\": 0,\n    \"innocation_count\": 0,\n    \"replicas\": 0\n  }\n}\n```\n\n#### Step 2: Get Inference Endpoint\nAfter a while, you can get the deployment's endpoint from the `list` command:\n```shell\nmodelz deployment list -k mzi-1234567890987654321 -u 00000000-1111-1111-1111-000000000000\n```\nor from the `get` command, passing the deployment ID returned by `create` with `-d`:\n```shell\nmodelz deployment get -k mzi-1234567890987654321 -u 00000000-1111-1111-1111-000000000000 -d 0a93636b-5ed3-4abd-8fac-8a7c5a4026c9\n```\nThe result of `get` might look like this:\n```json\n{\n  \"spec\": {\n    \"deployment_resource\": {},\n    \"deployment_source\": {\n      \"docker\": {\n        \"image\": \"us-central1-docker.pkg.dev/nth-guide-378813/modelzai/mosec-stable-diffusion:23.04.1\"\n      }\n    },\n    \"framework\": \"mosec\",\n    \"id\": \"0a93636b-5ed3-4abd-8fac-8a7c5a4026c9\",\n    \"image_config\": {\n      \"enable_cache_optimize\": false\n    },\n    \"max_replicas\": 1,\n    \"min_replicas\": 0,\n    \"name\": \"stable-diffusion-mosec\",\n    \"server_resource\": \"nvidia-tesla-t4-4c-16g\",\n    \"startup_duration\": 300,\n    \"target_load\": 10,\n    \"zero_duration\": 300\n  },\n  \"status\": {\n    \"available_replicas\": 0,\n    \"created_at\": \"2023-10-12T06:17:15Z\",\n    \"endpoint\": \"http://stable-diffusion-mosec-vc166fuhjuzkupai.modelz.tech\",\n    \"innocation_count\": 0,\n    \"phase\": \"NoReplicas\",\n    \"replicas\": 0\n  }\n}\n```\n\n#### Step 3: Make Inference\nThen you can send inference requests to the deployment.\n```shell\nexport 
MODELZ_API_KEY=mzi-1234567890987654321\nmodelz inference \\\n--endpoint http://stable-diffusion-mosec-vc166fuhjuzkupai.modelz.tech \\\n--serde msgpack --write-file cat.jpg cute cat\n```\n\n#### Step 4: Delete deployment\nWhen you no longer need a deployment, don't forget to delete it.\nThe selected deployment is deleted immediately.\n**This operation cannot be undone!**\n\n```shell\nexport MODELZ_API_KEY=mzi-1234567890987654321\nexport MODELZ_USER=00000000-1111-1111-1111-000000000000\nmodelz deployment delete -d b807e092-f748-4d71-8a1d-e57be617c532\n```\n\n### Create and infer to ModelZ deployment by code\n\n```python\nimport time\nfrom modelz import DeploymentClient, ModelzClient\nfrom modelz.openapi.sdk.models import (\n    DeploymentCreateRequest,\n    DeploymentDockerSource,\n    DeploymentSource,\n    DeploymentSpec,\n    DeploymentUpdateRequest,\n    FrameworkType,\n    ServerResource,\n    DeploymentUpdateRequestEnvVars,\n)\nfrom modelz.console import jsonFormattedPrint\n\n# Get your ModelZ User ID and API Key from https://cloud.modelz.ai/settings after registering.\nmodelz_user_id = \"00000000-1111-1111-1111-000000000000\"\nmodelz_api_key = \"mzi-1234567890987654321\"\n\n# Create client to operate deployments\nclient = DeploymentClient(login_name=modelz_user_id, key=modelz_api_key)\n\n# Step 1: Create deployment\nspec = DeploymentSpec(\n    deployment_source=DeploymentSource(\n        docker=DeploymentDockerSource(\n            image=\"us-central1-docker.pkg.dev/nth-guide-378813/modelzai/mosec-stable-diffusion:23.04.1\"\n        )\n    ),\n    server_resource=ServerResource.NVIDIA_TESLA_T4_4C_16G,\n    framework=FrameworkType.MOSEC,\n    name=\"stable-diffusion\",\n    min_replicas=0,\n    max_replicas=1,\n    startup_duration=300,\n    zero_duration=300,\n    target_load=10,\n)\nresp = client.create(DeploymentCreateRequest(spec))\nprint(jsonFormattedPrint(resp))\n# Get id of 
deployment\ndeployment_id = resp.parsed.spec.id\n\n# Step 2: Get the deployment's endpoint for inference\nresp = client.get(deployment_id)\nprint(jsonFormattedPrint(resp))\nendpoint = resp.parsed.status.endpoint\n\n# Wait for the ingress to be created\ntime.sleep(10)\n\n# Step 3: Make Inference\ninfer_client = ModelzClient(key=modelz_api_key, endpoint=endpoint, timeout=300)\nresp = infer_client.inference(params=\"cute cat\", serde=\"msgpack\")\nresp.save_to_file(\"image.jpg\")\n\n# Step 3.1: Update deployment\nreq = DeploymentUpdateRequest(\n    env_vars=DeploymentUpdateRequestEnvVars.from_dict({\"debug\": \"true\"})\n)\nresp = client.update(deployment_id, req)\nprint(jsonFormattedPrint(resp))\n\n# Step 4: Delete deployment\nclient.delete(deployment_id)\n```\n\n### Gradio Client on ModelZ Endpoints\n\nWe provide a lightweight Python library that makes it easy to use any Gradio app served on ModelZ as an API. The functionality of `GradioClient` is identical to that of `Client` in the `gradio_client` library provided by Gradio. 
The only difference is that when initializing the client, you pass your ModelZ serving endpoint URL instead of a Hugging Face space.\n\nExample Usage:\n\n```python\nfrom modelz import GradioClient as Client\n\n# Parameter here is the endpoint of your ModelZ deployment\n# The format is like https://${DEPLOYMENT_KEY}.modelz.io/\ncli = Client(\"https://translator-th85ze61tj4n3klc.modelz.io/\")\n\ncli.view_api()\n# \u003e\u003e Client.predict() Usage Info\n# ---------------------------\n# Named API endpoints: 1\n#\n#  - predict(text, api_name=\"/predict\") -\u003e output\n#     Parameters:\n#      - [Textbox] text: str\n#     Returns:\n#      - [Textbox] output: str\n\ncli.predict(\"hallo\", api_name=\"/predict\")\n# \u003e\u003e \"Bonjour.\"\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorchord%2Fmodelz-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftensorchord%2Fmodelz-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorchord%2Fmodelz-py/lists"}