{"id":19772030,"url":"https://github.com/danieldacosta/mlflow-server","last_synced_at":"2026-04-17T10:04:04.503Z","repository":{"id":112641686,"uuid":"387862260","full_name":"DanielDaCosta/mlflow-server","owner":"DanielDaCosta","description":"MLflow server on EC2 with Docker, using s3 and RDS to store artifacts and files. ","archived":false,"fork":false,"pushed_at":"2021-12-05T19:58:29.000Z","size":229,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-28T11:30:45.102Z","etag":null,"topics":["aws","docker","gitlab-ci","mlfow"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DanielDaCosta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-20T17:10:50.000Z","updated_at":"2022-02-02T12:30:41.000Z","dependencies_parsed_at":null,"dependency_job_id":"f345ae12-91af-4576-859f-d08dcd16b45c","html_url":"https://github.com/DanielDaCosta/mlflow-server","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DanielDaCosta/mlflow-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DanielDaCosta%2Fmlflow-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DanielDaCosta%2Fmlflow-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DanielDaCosta%2Fmlflow-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DanielDaC
osta%2Fmlflow-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DanielDaCosta","download_url":"https://codeload.github.com/DanielDaCosta/mlflow-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DanielDaCosta%2Fmlflow-server/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261058805,"owners_count":23103924,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","docker","gitlab-ci","mlfow"],"created_at":"2024-11-12T05:05:08.943Z","updated_at":"2025-10-12T10:12:38.167Z","avatar_url":"https://github.com/DanielDaCosta.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MLflow server on EC2 using Postgres (RDS) and S3 as backend\n\nMlflow is an open-source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.\n\n## Architecture\n\n![MlflowDiagram](images/MLFlow_Infra.png)\n\n## IAM Permissions\nThe EC2 instance requires the following policies:\n- *Read-Write* from the s3 bucket.\n- *AmazonEC2ContainerRegistryReadOnly*\n- *Read Parameters* from SSM:\n```json \n{\n    \"Version\": \"2012-10-17\",\n    \"Statement\": [\n        {\n            \"Sid\": \"VisualEditor0\",\n            \"Effect\": \"Allow\",\n            \"Action\": [\n                \"ssm:GetParameters\",\n                \"ssm:GetParameter\"\n            ],\n            \"Resource\": \"arn:aws:ssm:us-east-1:{ACCOUNT_ID}:parameter/mlflow/*\"\n        }\n    ]\n}\n```\n- KMS decrypt 
for encrypted SSM parameters\n```json\n{\n    \"Version\": \"2012-10-17\",\n    \"Statement\": [\n        {\n            \"Sid\": \"VisualEditor0\",\n            \"Effect\": \"Allow\",\n            \"Action\": \"kms:Decrypt\",\n            \"Resource\": \"arn:aws:kms:us-east-1:{ACCOUNT_ID}:key/{KEY-ID}\"\n        }\n    ]\n}\n```\n\n## Code Structure\n\nThe code was run on the Amazon Linux 2 AMI.\n\n### Database/:\n\nCreates the database that MLflow uses as its backend store:\n\n```sql\nCREATE DATABASE mlflow;\n```\n\n### Container/:\n\nContains the Dockerfile for building the image.\n\nThe image is built and pushed to ECR through GitLab CI. Check the gitlab-ci.yml for the full code:\n```bash\ndocker build -t $DOCKER_REGISTRY/$APP_NAME:latest container/\naws ecr describe-repositories --repository-names $APP_NAME || aws ecr create-repository --repository-name $APP_NAME\naws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $DOCKER_REGISTRY\ndocker push $DOCKER_REGISTRY/$APP_NAME:latest\n```\n\n### Shell Script/:\n- All secret credentials are stored in AWS SSM Parameter Store under the path /mlflow/:\n\n```bash\nACCOUNT_ID=$(aws sts get-caller-identity --output text --query 'Account')\nDB_PASS=$(aws ssm get-parameters --region us-east-1 --names /mlflow/DB_PASS --with-decryption --query \"Parameters[0].Value\" --output text)\nDB_HOST=$(aws ssm get-parameters --region us-east-1 --names /mlflow/DB_HOST --query \"Parameters[0].Value\" --output text)\nDB_USER=$(aws ssm get-parameters --region us-east-1 --names /mlflow/DB_USER --query \"Parameters[0].Value\" --output text)\n```\n\n- Installing Docker. 
Adding ec2-user to the docker group lets you run Docker commands without sudo:\n\n```bash\nsudo yum update -y\nsudo amazon-linux-extras install docker\nsudo service docker start\nsudo usermod -a -G docker ec2-user\n```\n\n- **Configure Docker to start on boot** (https://docs.docker.com/engine/install/linux-postinstall/#configure-docker-to-start-on-boot):\n\n```bash\nsudo systemctl enable docker.service\n```\n\n- **Setting amazon-ecr-credential-helper**: With the credential helper, developers and build scripts using the Docker CLI no longer have to call the ECR API to retrieve a token or run `docker login` with it before pushing or pulling container images. (https://aws.amazon.com/blogs/containers/amazon-ecr-credential-helper-now-supports-amazon-ecr-public/)\n```bash\n# credHelpers keys must be registry hostnames, not image references\nmkdir -p /home/ec2-user/.docker\n\nsudo cat \u003c\u003c-END \u003e\u003e /home/ec2-user/.docker/config.json\n{\n    \"credHelpers\": {\n        \"public.ecr.aws\": \"ecr-login\",\n        \"${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com\": \"ecr-login\"\n    }\n}\nEND\n\n# Give ec2-user ownership of the Docker config directory\nchown -R ec2-user:ec2-user /home/ec2-user/.docker\n```\n\n- Pulling the image from ECR and running the container:\n\n```bash\nsudo -u ec2-user docker pull $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/mlflow:latest\n# Run the container. It is configured to always restart, and port 5000 is published to the host.\ndocker run --env BUCKET=s3://mlflow-artifacts/ --env USERNAME=$DB_USER --env PASSWORD=$DB_PASS \\\n--env HOST=$DB_HOST --env PORT=5432 --env DATABASE=mlflow \\\n-p 5000:5000 -d --restart always --name mlflow-server $ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/mlflow:latest\n```\n\n## Usage\nAfter pushing the image to ECR, run the command `bash run_container.sh`; it installs Docker, pulls the image from ECR, and starts the MLflow server container.\n\n\n## MLflow - User Guide\n1. 
Set the tracking URI:\n```python\nML_FLOW_TRACKING_URI = \"http://localhost:5000\"\nmlflow.set_tracking_uri(ML_FLOW_TRACKING_URI)\n```\n2. Create (or select) an experiment:\n```python\nmlflow.set_experiment(\"{project-name}-{user}\")\n```\n\n3. Start and end runs with MLflow's run-management API:\n\n```python\nmlflow.start_run(run_name=\"{name-of-your-run}\")\n# ... train and log here ...\nmlflow.end_run()\n```\n\n4. Logging model parameters:\n```python\n# XGBoost Params\nparam_dist = {\n    'objective':'binary:logistic',\n    'n_estimators': 1000,\n    'scale_pos_weight' : count_notchurn/count_churn,\n    'max_depth' : 4,\n    'min_child_weight': 6,\n    'learning_rate': 0.005,\n    'subsample': 0.8,\n    'colsample_bytree': 0.7,\n    'gamma': 0,\n    'seed': 42\n}\n\nmlflow.log_params(param_dist)\n```\n\n5. Logging metrics:\n- Single metrics\n```python\nmlflow.log_metric('Recall', recall)\nmlflow.log_metric('Precision', precision)\nmlflow.log_metric('Balanced Accuracy', balanced_accuracy)\nmlflow.log_metric('F1', f1)\n```\n- Metrics per epoch:\n```python\n# For an XGBoost model\nresults = model.evals_result() # Get metric lists\neval_metric_result = 'logloss'\n\nfor i, metric in enumerate(results['validation_1'][eval_metric_result]):\n    mlflow.log_metric('Validation LogLoss', metric, step=i)\n```\n\n![MetricPerEpoch](images/metric_per_epoch.png)\n\n6. Save images: You can save, for example, SHAP summary_plot images:\n\n```python\nimport shap\nfrom matplotlib import pyplot\n\nshap.summary_plot(shap_values, X_test, max_display=50, show=False)\nfig_shap = 'SHAP_Xgboost.png'\npyplot.savefig(fig_shap, bbox_inches='tight')\n\nmlflow.log_artifact(fig_shap)\n```\n\n7. Save model: You can also store the entire model:\n```python\nmlflow.xgboost.log_model(model, \"XGboost\")\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanieldacosta%2Fmlflow-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanieldacosta%2Fmlflow-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanieldacosta%2Fmlflow-server/lists"}