{"id":20161928,"url":"https://github.com/hzdr/dvc_tutorial_series","last_synced_at":"2025-03-03T02:44:05.177Z","repository":{"id":74710212,"uuid":"373877515","full_name":"hzdr/dvc_tutorial_series","owner":"hzdr","description":"Small collection of dvc tutorials","archived":false,"fork":false,"pushed_at":"2021-06-08T14:37:09.000Z","size":20,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-13T14:18:05.042Z","etag":null,"topics":["dvc","dvc-pipeline","mlops"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hzdr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-04T15:02:38.000Z","updated_at":"2024-10-16T11:12:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"274e1432-a1fa-45d0-b486-8c2d67cbb12b","html_url":"https://github.com/hzdr/dvc_tutorial_series","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hzdr%2Fdvc_tutorial_series","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hzdr%2Fdvc_tutorial_series/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hzdr%2Fdvc_tutorial_series/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hzdr%2Fdvc_tutorial_series/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hzdr","download_url":"https://cod
eload.github.com/hzdr/dvc_tutorial_series/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241600484,"owners_count":19988713,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dvc","dvc-pipeline","mlops"],"created_at":"2024-11-14T00:21:55.196Z","updated_at":"2025-03-03T02:44:05.160Z","avatar_url":"https://github.com/hzdr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"This is the code part of the article series I published on Medium [here](https://nsultova.medium.com/exploring-dvc-for-machine-learning-pipelines-in-research-part-1-3ebc2ca35a18).\n\n\nOver the past months, part of my job has been looking at different tools to manage machine learning workflows for our team at [HelmholtzAI](https://www.helmholtz.ai/). 
\n\nA lot of material accumulated along the way, so I decided to share some of the process and what I’ve learned.\n\nThis repository contains tutorials and code centered around [DVC](https://dvc.org), which became one of our favourite candidates.\n\n\n### Code structure\n\n```\n.\n├── README.md\n├── content\n│   ├── PART_00.md\n│   ├── EXAMPLE.md\n│   └── RESSOURCES.md\n└── src\n    ├── assets\n    ├── config.py\n    ├── create_dataset.py\n    ├── create_features.py\n    ├── environment.yml\n    ├── evaluate_model.py\n    ├── params.yaml\n    ├── train_model.py\n    └── wine-quality.csv\n```\n\n- `/content`: contains the articles and additional information\n- `/assets`: directory where (intermediate) results are written\n- `environment.yml`: can be used to create a conda environment or just to look up which dependencies are needed\n- `config.py`: handles paths and other variables, which makes later extension less cumbersome\n- `params.yaml`: used by DVC; custom parameters can be set here\n\n...the rest should be self-explanatory.\n\n### Setup\n\nMake sure you have a recent Python version installed. (I run Python 3.9.1 within a conda environment on macOS Big Sur 11.4 as of this writing.)\n\nI'd highly recommend using some flavour of virtual environment (conda, venv, ...) to follow along with this tutorial. (Unless you're a *-BSD or NixOS user, in which case I assume you know your way around these issues anyway ^^ ).\n\nClone the repo, make sure the dependencies are installed and you're good to go!\n\nLook into `/content/PART_01`, section **DVC Tutorial** for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhzdr%2Fdvc_tutorial_series","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhzdr%2Fdvc_tutorial_series","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhzdr%2Fdvc_tutorial_series/lists"}