{"id":18802264,"url":"https://github.com/oracle-quickstart/oci-mlflow","last_synced_at":"2025-06-29T01:35:20.732Z","repository":{"id":106380564,"uuid":"417580627","full_name":"oracle-quickstart/oci-mlflow","owner":"oracle-quickstart","description":null,"archived":false,"fork":false,"pushed_at":"2023-01-19T19:47:48.000Z","size":787,"stargazers_count":5,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-29T20:16:01.775Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"upl-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oracle-quickstart.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-10-15T17:20:03.000Z","updated_at":"2024-11-11T22:52:48.000Z","dependencies_parsed_at":null,"dependency_job_id":"eab8060d-a492-4fc4-afbf-c89275747ec5","html_url":"https://github.com/oracle-quickstart/oci-mlflow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":"oracle-quickstart/oci-quickstart-template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-quickstart%2Foci-mlflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-quickstart%2Foci-mlflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-quickstart%2Foci-mlflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oracle-quickstart%2Foci-mlflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ora
cle-quickstart","download_url":"https://codeload.github.com/oracle-quickstart/oci-mlflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239735263,"owners_count":19688262,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T22:27:13.891Z","updated_at":"2025-02-19T21:12:55.846Z","avatar_url":"https://github.com/oracle-quickstart.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# oci-mlflow\n\nMLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.\n\nMLflow is library-agnostic. You can use it with any machine learning library, and in any programming language, since all functions are accessible through a REST API and CLI. For convenience, the project also includes a Python API, R API, and Java API.\n\nIn the following sections, we will show how to deploy MLflow on OCI and use the components in your machine learning applications with Docker containers for tracking, training, and serving. In a typical machine learning workflow, you can track experiment runs and models with MLflow. 
\n\nYou can also integrate MLflow with the OCI Data Science service and OCI AI Services, for example to track artifacts, parameters, metrics, and models.\n\n\n## Prerequisites\n\nPermission to `manage` the following types of resources in your Oracle Cloud Infrastructure tenancy: `vcns`, `internet-gateways`, `route-tables`, `security-lists`, `subnets`, `buckets`, and `mysql-instances`.\n\nQuota to create the following resources: 1 VCN, 1 subnet, 1 Internet Gateway, 1 route rule, 1 MySQL Database Service, 1 Object Storage bucket, and 2 compute instances.\n\nIf you don't have the required permissions and quota, contact your tenancy administrator. See [Policy Reference](https://docs.cloud.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm), [Service Limits](https://docs.cloud.oracle.com/iaas/Content/General/Concepts/resourcequotas.htm), and [Compartment Quotas](https://docs.cloud.oracle.com/iaas/Content/General/Concepts/resourcequotas.htm).\n\n\n## Deploy Using the Terraform CLI\n\nFirst, complete the pre-deployment setup. That's all detailed [here](https://github.com/oracle/oci-quickstart-prerequisites).\n\nSecond, create a `provider.auto.tfvars` file (`cp provider.auto.tfvars.template provider.auto.tfvars`) and set all the parameters in the file. For the S3 Compatibility API with Object Storage, refer to the [endpoint](https://docs.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm) and [Customer Secret key](https://docs.oracle.com/en-us/iaas/Content/Identity/Tasks/managingcredentials.htm#create-secret-key) documentation. You will need the Customer Secret key later.\n\nYou might need to update the `userdata\\docker\\requirements-training.txt` file with the dependencies required by your Python machine learning applications. You can also install extra Python packages in the Docker containers later. 
\n\nMake sure you have the Terraform CLI v0.14 or later installed and accessible from your terminal.\n\n### Build\n\nThe first time, you must initialize the Terraform modules used by the template with the `terraform init` command:\n\n```bash\n$ terraform init\n\nInitializing the backend...\n\nInitializing provider plugins...\n- Finding latest version of hashicorp/archive...\n- Installing hashicorp/archive v2.1.0...\n- Installed hashicorp/archive v2.1.0 (signed by HashiCorp)\n\nTerraform has created a lock file .terraform.lock.hcl to record the provider\nselections it made above. Include this file in your version control repository\nso that Terraform can guarantee to make the same selections by default when\nyou run \"terraform init\" in the future.\n\nTerraform has been successfully initialized!\n\nYou may now begin working with Terraform. Try running \"terraform plan\" to see\nany changes that are required for your infrastructure. All Terraform commands\nshould now work.\n\nIf you ever set or change modules or backend configuration for Terraform,\nrerun this command to reinitialize your working directory. If you forget, other\ncommands will detect it and remind you to do so if necessary.\n```\n\nOnce Terraform is initialized, run the following commands to preview and create the resources:\n\n```bash\n$ terraform plan\n$ terraform apply\n```\n\nYou can find the IPs of the compute instances (tracking, training, and serving) in the Terraform output values:\n\n```bash\n...\ncompute_linux_instances = {\n  \"serving\" = {\n    \"id\" = \"ocid1.instance.oc1...\"\n    \"ip\" = \"...\"\n  }\n  \"tracking\" = {\n    \"id\" = \"ocid1.instance.oc1...\"\n    \"ip\" = \"...\"\n  }\n}\n...\n```\n\nYou then need to SSH into each compute instance and follow the instructions in `~/commands.txt` to start the MLflow tracking server, set up HTTP Basic authentication, and serve a model. You can then use the MLflow UI (`http://\u003ctracking.ip\u003e:3000`). 
\n\nThe MLflow tracking server has two components for storage: a backend store and an artifact store. We use a MySQL Database Service instance as the backend store and an Object Storage bucket as the artifact store. The MySQL DB system endpoint uses a private IP address and is not directly accessible from the internet. \n\n\n## Verify the Deployment\n\nWe will showcase how you can use MLflow end-to-end with the MLflow sample application [sklearn_elasticnet_wine](https://github.com/mlflow/mlflow/tree/master/examples/sklearn_elasticnet_wine).\n\n### Training the Models\n\nCreate an OCI Data Science [notebook session](https://docs.oracle.com/en-us/iaas/data-science/using/manage-notebook-sessions.htm) to access a JupyterLab interface using a customizable compute, storage, and network configuration. Add the `MLFLOW_TRACKING_URI`, `MLFLOW_TRACKING_USERNAME`, `MLFLOW_TRACKING_PASSWORD`, `MLFLOW_S3_ENDPOINT_URL`, `AWS_ACCESS_KEY_ID` (Customer Secret access key), and `AWS_SECRET_ACCESS_KEY` (Customer Secret secret key) custom environment variables to your notebook session. Use the machine learning libraries in the JupyterLab interface to complete all steps in [train.ipynb](https://github.com/mlflow/mlflow/blob/master/examples/sklearn_elasticnet_wine/train.ipynb). \n\nYou need to install the `mlflow` and `boto3` libraries if they are not preinstalled in the conda environment of your notebook session.\n\n### Comparing the Models\n\nUse the MLflow UI to compare the models that you have produced. 
On this page, you can see a list of experiment runs with metrics you can use to compare the models.\n\nSelect a model, go to the model run page, and copy the logged_model path (`runs:/\u003crun_uuid\u003e/model`).\n\nThe model run page also shows code snippets that demonstrate how to make predictions using the logged model.\n\n### Serving the Model\n\nFollow the instructions in `~/commands.txt` on the serving compute instance to deploy a local REST server that can serve predictions using the model-uri.\n\nFor models created by the MLflow sample `sklearn_elasticnet_wine`, you can make requests to `POST` `/invocations` in pandas split or record-oriented formats. \n\nOnce you have deployed the server, you can pass it some sample data and see the predictions.\n\n```bash\ncurl -X POST -H \"Content-Type:application/json; format=pandas-split\" --data '{\"columns\":[\"fixed acidity\",\"volatile acidity\",\"citric acid\",\"residual sugar\",\"chlorides\",\"free sulfur dioxide\",\"total sulfur dioxide\",\"density\",\"pH\",\"sulphates\",\"alcohol\"],\"data\":[[6.2, 0.66, 0.48, 1.2, 0.029, 29, 75, 0.98, 3.33, 0.39, 12.8]]}' http://\u003cserving.ip\u003e:1234/invocations\n```\n\nYou should see the predicted wine quality in the response.\n\nIf you want to use a different port, you will need to add firewall rules on the serving compute instance.\n\n## Destroy the Deployment \n\nWhen you no longer need the MLflow environment, you can run this command to destroy the resources:\n\n```bash\nterraform destroy\n```\n\n## Deploy MLflow tracking server on OKE\n\nYou can also deploy the MLflow tracking server on an existing OKE cluster and MySQL Database Service by following these steps: \n\nSet up an ingress controller on the Kubernetes cluster. That's all detailed [here](https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengsettingupingresscontroller.htm).\n\nSet up local VCN peering between VCNs so the MLflow tracking server running on OKE can access MySQL Database Service. 
That's all detailed [here](https://docs.oracle.com/en-us/iaas/Content/Network/Tasks/localVCNpeering.htm).\n\nAccess the Kubernetes cluster in Cloud Shell. That's all detailed [here](https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengdownloadkubeconfigfile.htm#cloudshelldownload).\n\nCreate the mlflow namespace\n```bash\nkubectl create namespace mlflow\n```\n\nCreate the htpasswd file\n```bash\nhtpasswd -c auth \u003cfirst_username\u003e\nhtpasswd auth \u003cother_username\u003e\n```\n\nCreate a Kubernetes secret for basic authentication\n```bash\nkubectl create secret generic mlflow-tracking-basic-auth --from-file=auth --namespace mlflow\n```\n\nCreate a Kubernetes secret for pulling the Docker image from OCIR\n```bash\nkubectl create secret docker-registry mlflowocirsecret --docker-server=\u003cregion-key\u003e.ocir.io --docker-username='\u003ctenancy-namespace\u003e/\u003coci-username\u003e' --docker-password='\u003coci-auth-token\u003e' --docker-email='\u003cemail-address\u003e'\n```\n\nUpdate the `oke\\tracking\\mlflow-tracking-secret-template.yaml` and `oke\\tracking\\mlflow-tracking-template.yaml` files. You can use the Docker image created by the Terraform deployment above. 
You can also build the Docker image using the `userdata\\docker\\Dockerfile-tracking` and `userdata\\docker\\miniconda_install.sh` files and then push it to OCI Registry, as detailed [here](https://docs.oracle.com/en-us/iaas/Content/Registry/Tasks/registrypushingimagesusingthedockercli.htm).\n\nDeploy the MLflow tracking server with the updated yaml files\n```bash\nkubectl apply -f mlflow-tracking-secret.yaml\nkubectl apply -f mlflow-tracking-template.yaml\n```\n\nYou can now access the MLflow tracking server on OKE at `http://\u003cingress-nginx-controller-EXTERNAL-IP\u003e/tracking/` \n\n## Architecture Diagram\n\n![OCI Diagram](./images/oci-mlflow.png)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foracle-quickstart%2Foci-mlflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foracle-quickstart%2Foci-mlflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foracle-quickstart%2Foci-mlflow/lists"}