{"id":20275068,"url":"https://github.com/danhenriquex/machine_learning_project","last_synced_at":"2025-03-04T01:28:08.869Z","repository":{"id":254120796,"uuid":"845554840","full_name":"danhenriquex/Machine_Learning_Project","owner":"danhenriquex","description":"Machine Learning Project Setup with DVC, Hydra, GCP and Docker","archived":false,"fork":false,"pushed_at":"2024-09-08T17:32:38.000Z","size":2093,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-14T06:26:43.505Z","etag":null,"topics":["docker","dvc","gcp","hydra","machine-learning","mlops"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/danhenriquex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-21T13:27:23.000Z","updated_at":"2024-09-08T17:32:41.000Z","dependencies_parsed_at":"2024-12-25T22:32:55.040Z","dependency_job_id":null,"html_url":"https://github.com/danhenriquex/Machine_Learning_Project","commit_stats":null,"previous_names":["danhenriquex/e2e_ml_project","danhenriquex/machine_learning_project"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhenriquex%2FMachine_Learning_Project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhenriquex%2FMachine_Learning_Project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhenriquex%2FMachine_Learning_Project/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhenriquex%2FMachine_Learning_Project/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/danhenriquex","download_url":"https://codeload.github.com/danhenriquex/Machine_Learning_Project/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241766558,"owners_count":20016782,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","dvc","gcp","hydra","machine-learning","mlops"],"created_at":"2024-11-14T13:07:47.228Z","updated_at":"2025-03-04T01:28:08.849Z","avatar_url":"https://github.com/danhenriquex.png","language":"Makefile","readme":"\u003ch1 align=\"center\"\u003e🤖 Machine Learning Project\u003c/h1\u003e\n\u003cp align=\"center\" id=\"objetivo\"\u003eLearning MLOps. \n\u003c/p\u003e \n\n\u003cp align=\"center\"\u003e\n \u003ca href=\"#overview\"\u003eOverview\u003c/a\u003e •\n \u003ca href=\"#features\"\u003eTechnologies and Tools Used\u003c/a\u003e •\n \u003ca href=\"#started\"\u003eGetting Started\u003c/a\u003e • \n \u003ca href=\"#author\"\u003eAuthor\u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch4 align=\"center\"\u003e \n\t🚧  MLOps Project 🚀 Finished  🚧 \n\u003c/h4\u003e\n\n### Overview\n\n\u003cdiv style='margin: 20px' id=\"overview\"\u003e\nThis project demonstrates the setup of a Data Version Control (DVC) system using Google Cloud Storage (GCS) as the remote storage for data versioning. 
It leverages Docker for containerization, Hydra for configuration management, and Poetry for dependency management.\n\u003c/div\u003e\n\n### Technologies and Tools Used\n\n\u003cdiv id=\"features\"\u003e\n\n- **Docker**: Used to containerize the application, making it portable and easier to deploy in different environments.\n- **GCP (Google Cloud Platform)**: Google Cloud Storage is used to store raw data and manage versioning through DVC.\n- **Hydra**: Manages the configuration schema for the project, helping with flexible and hierarchical configuration setups.\n- **DVC**: Used for versioning datasets and model files. It helps in tracking changes and managing large files efficiently.\n- **Poetry**: Handles dependency management, ensuring all required packages are installed in a virtual environment.\n\n\u003c/div\u003e\n\n\u003cdiv id=\"started\"\u003e\n\n### Getting Started\n\nTo get started with this project, follow these steps:\n\n1. **Clone the Repository:**\n\n   ```bash\n   git clone \u003crepository_url\u003e\n   cd \u003crepository_directory\u003e\n   ```\n   \n2. **Create the Environment:**\n\n   ```bash\n   # Install and update dependencies\n   make lock-dependencies\n   \n   # Build the Docker container\n   make build\n   ```\n3. 
**Update Dataset:**\n   \n   ```bash\n   # Update the dataset in GCP and push the changes to the GitHub repository\n   make version-data\n   ```\n\n\u003c/div\u003e\n\n\n### Author\n\n---\n\n\u003c!-- \u003cscript type=\"text/javascript\" src=\"https://platform.linkedin.com/badges/js/profile.js\" async defer\u003e\u003c/script\u003e --\u003e\n\n\u003cdiv align=\"left\" id=\"author\"\u003e\n\n\u003ca href=\"https://github.com/danhenriquex\"\u003e\n  \u003cimg src=\"https://github.com/danhenriquex.png\" width=\"100\" height=\"100\" style=\"border-radius: 50%\"/\u003e\n\u003c/a\u003e\n\n\u003c!-- \u003cdiv class=\"LI-profile-badge\"  data-version=\"v1\" data-size=\"medium\" data-locale=\"pt_BR\" data-type=\"vertical\" data-theme=\"dark\" data-vanity=\"danilo-henrique-santana\"\u003e\u003ca class=\"LI-simple-link\" href='https://br.linkedin.com/in/danilo-henrique-santana?trk=profile-badge'\u003eDanilo Henrique\u003c/a\u003e\u003c/div\u003e --\u003e\n\u003c/div\u003e\n\n\u003cdiv style=\"margin-top: 20px\" \u003e\n  \u003ca href=\"https://www.linkedin.com/in/danilo-henrique-480032167/\"\u003e\n    \u003cimg  src=\"https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge\u0026logo=linkedin\u0026logoColor=white\"/\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanhenriquex%2Fmachine_learning_project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanhenriquex%2Fmachine_learning_project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanhenriquex%2Fmachine_learning_project/lists"}