https://github.com/danhenriquex/machine_learning_project
Machine Learning Project Setup with DVC, Hydra, GCP and Docker
docker dvc gcp hydra machine-learning mlops
- Host: GitHub
- URL: https://github.com/danhenriquex/machine_learning_project
- Owner: danhenriquex
- Created: 2024-08-21T13:27:23.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-09-08T17:32:38.000Z (2 months ago)
- Last Synced: 2024-09-09T19:12:07.883Z (2 months ago)
- Topics: docker, dvc, gcp, hydra, machine-learning, mlops
- Language: Makefile
- Homepage:
- Size: 2 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
🤖 Machine Learning Project
Learning MLOps.
Overview • Technologies and Tools Used • Getting Started • Author
🚧 MLOps Project 🚀 Finished 🚧

### Overview
This project demonstrates the setup of a Data Version Control (DVC) system using Google Cloud Storage (GCS) as the remote storage for data versioning. It leverages Docker for containerization, Hydra for configuration management, and Poetry for dependency management.

### Technologies and Tools Used
- **Docker**: Used to containerize the application, making it portable and easier to deploy in different environments.
- **GCP (Google Cloud Platform)**: Google Cloud Storage is used to store raw data and manage versioning through DVC.
- **Hydra**: Manages the configuration schema for the project, helping with flexible and hierarchical configuration setups.
- **DVC**: Used for versioning datasets and model files. It helps in tracking changes and managing large files efficiently.
- **Poetry**: Handles dependency management, ensuring all required packages are installed in a virtual environment.

### Getting Started
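As an illustration of how DVC and Google Cloud Storage fit together, the sketch below lists the kind of commands a data-versioning workflow with a GCS remote usually involves. It is not taken from this repository: the bucket URL, the remote name `gcs-remote`, and the dataset path `data/raw` are hypothetical placeholders. By default the script only prints the commands; setting `DRY_RUN=0` would execute them, which assumes `dvc` and `git` are installed and GCP credentials are configured.

```bash
#!/usr/bin/env bash
# Sketch only: the values below are hypothetical placeholders, not project settings.
set -euo pipefail

GCS_REMOTE_URL="${GCS_REMOTE_URL:-gs://example-bucket/dvc-store}"  # hypothetical bucket
DATA_PATH="${DATA_PATH:-data/raw}"                                  # hypothetical dataset path

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

version_data() {
  # Register the GCS bucket as the default DVC remote (overwriting any existing entry).
  run dvc remote add --default --force gcs-remote "$GCS_REMOTE_URL"
  # Track the dataset with DVC; this writes a small .dvc pointer file next to it.
  run dvc add "$DATA_PATH"
  # Commit the pointer file and DVC config to git; the data itself stays out of git.
  run git add "${DATA_PATH}.dvc" "$(dirname "$DATA_PATH")/.gitignore" .dvc/config
  run git commit -m "Update dataset"
  # Upload the data to the GCS remote, then publish the git history.
  run dvc push
  run git push
}

version_data
```

The design point is that git only ever stores the lightweight `.dvc` pointer file, while the dataset contents live in the bucket, so each git commit pins an exact dataset version.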
To get started with this project, follow these steps:
1. **Clone the Repository:**
```bash
git clone https://github.com/danhenriquex/machine_learning_project
cd machine_learning_project
```
2. **Create environment:**
```bash
# To install and update dependencies
make lock-dependencies
# Build the docker container
make build
```
3. **Update Dataset:**
```bash
# Update the dataset on GCP and push the changes to the GitHub repository
make version-data
```

### Author
---