{"id":16567919,"url":"https://github.com/drorata/mlem-review","last_synced_at":"2026-04-23T03:33:08.010Z","repository":{"id":79632262,"uuid":"498316648","full_name":"drorata/mlem-review","owner":"drorata","description":"Exploring the new tool MLEM by Iterative","archived":false,"fork":false,"pushed_at":"2022-06-17T12:30:18.000Z","size":15,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-03-31T06:35:43.888Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/drorata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-05-31T11:58:26.000Z","updated_at":"2022-06-17T14:55:39.000Z","dependencies_parsed_at":"2023-05-14T01:00:23.964Z","dependency_job_id":null,"html_url":"https://github.com/drorata/mlem-review","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/drorata/mlem-review","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drorata%2Fmlem-review","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drorata%2Fmlem-review/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drorata%2Fmlem-review/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drorata%2Fmlem-review/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/drorata","download_url":"https://codeload.github.com/drorata/mlem-revi
ew/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drorata%2Fmlem-review/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32165061,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-23T02:19:40.750Z","status":"ssl_error","status_checked_at":"2026-04-23T02:17:55.737Z","response_time":53,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T21:07:47.453Z","updated_at":"2026-04-23T03:33:07.993Z","avatar_url":"https://github.com/drorata.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Playing around with MLEM\n\nMLEM promises that you:\n\n\u003e Use the same human-readable format for any ML framework\n\nThis is a bold promise and in this repo I will explore it a little (and maybe some other features as well).\nNote that the machine learning part of the content is only secondary.\nIn the foreground we put the process and the tools.\n\n## Fetching and preparing the data 👷🏽‍♀️\n\nTo keep it simple on the ML front, we use the [Iris data set](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html).\nThe data is obtained in [`get_data.py`](./get_data.py); see the comments there for more details.\n\nThis script is used in the first stage of the DVC pipeline 
which is coded in [`dvc.yaml`](./dvc.yaml).\n\n## Training the model and persisting it using MLEM 🚀\n\nIn [`train_and_persist.py`](./train_and_persist.py) we, well, train and persist the model.\nAgain, in `dvc.yaml` this script is used as the second stage.\nHere it is important to pay more attention to the `mlem.api.save()` statement:\n\n```python\nsave(\n    rf, \"rf\", sample_data=X, description=\"Random Forest Classifier\",\n)\n```\n\n`rf` is the fitted model, and it is given a name: the _string_ `\"rf\"`.\nIn addition, a description is provided (see issue [#279](https://github.com/iterative/mlem/issues/279) for a related topic).\nFurthermore, by providing a value to the parameter `sample_data`, MLEM will include the schema of the data in the model's metadata.\nCheck out [`.mlem/model/rf.mlem`](./.mlem/model/rf.mlem).\n\n## What's next? Or how to get predictions using an API? ⚡️\n\nBy running `dvc repro` in this project, the following things will happen:\n\n- The Iris data set will be fetched and split into train and test sets.\n- A model will be trained.\n- The model will be persisted by MLEM; its metadata ([`.mlem/model/rf.mlem`](./.mlem/model/rf.mlem)) will be tracked by Git and the model itself ([`.mlem/model/rf`](./.mlem/model/rf)) will be tracked by DVC.\n\nNow comes the fun part.\nBy running:\n\n```bash\nmlem build rf docker --conf server.type=fastapi --conf image.name=rf-image-test\n```\n\nMLEM will build a Docker image that can be used to get predictions from the trained model using an API.\nOnce the image is built, a container can be run:\n\n```bash\ndocker run --rm -it -p 8080:8080 rf-image-test\n```\n\nOnce it is up and running, the documentation of the endpoints of the new API can be found here: http://0.0.0.0:8080/docs.\n\nTo make it easier, [`Taskfile.yml`](./Taskfile.yml) can help in building and serving the image.\nFor more details on how to use a `Taskfile`, check out [`task`](https://taskfile.dev/).\n\nFinally, once MLEM is serving the model, we can get 
predictions for our test set using [`evaluate.py`](./evaluate.py).\nTo that end, we simply send a list of dictionaries to the `/predict` endpoint and get, in return, a list of predictions.\nIsn't it really wonderful?\n\n\u003ccenter\u003e\n\u003cimg src=\"https://cdn.pixabay.com/photo/2020/06/04/08/50/awesome-5257905_1280.png\" alt=\"Isn't it awesome?\" width=25% height=25%\u003e\n\u003c/center\u003e\n\n## Summary\n\nSo, in this repository you can find an end-to-end example of how to bring your ML model to life as an API that can return predictions.\nThis removes a huge hurdle that data science teams face.\nAfter completing the hard work related to data fetching, cleaning, feature engineering, model training/evaluation/tuning and so on, the team is ready to deliver great value.\nAlas, support from DevOps and data engineers is now needed to bring the model to production.\nUsing MLEM, the team is much closer to being independent and having a direct, quick impact.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrorata%2Fmlem-review","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdrorata%2Fmlem-review","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrorata%2Fmlem-review/lists"}