https://github.com/gangula-karthik/mlops-assignment
An end-to-end MLOps journey from car prices to wheat predictions, all powered by FastAPI, MLflow, and a dash of cloud magic!
dagshub google-cloud-run mlflow mlops nextjs pycaret vercel
- Host: GitHub
- URL: https://github.com/gangula-karthik/mlops-assignment
- Owner: gangula-karthik
- License: mit
- Created: 2025-02-04T06:19:54.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-12-14T16:34:15.000Z (6 months ago)
- Last Synced: 2026-01-03T20:42:28.950Z (5 months ago)
- Topics: dagshub, google-cloud-run, mlflow, mlops, nextjs, pycaret, vercel
- Language: Jupyter Notebook
- Homepage: https://ml-ops-assignment.vercel.app
- Size: 16.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# IT3385 - MLOps Assignment
This assignment was completed by Gangula Karthik, Choy Wei Jun and Gabriel Loh. It covers three end-to-end machine learning tasks, applying practical MLOps concepts alongside the modelling process. To try out the website, visit https://ml-ops-assignment.vercel.app/

## Important Links
- YouTube video link: https://youtu.be/3A7-HXlz9pw
- MLflow remote tracking server: https://dagshub.com/gangula-karthik/MLOps-Assignment.mlflow/
- DVC tracking (configured via the CLI to store data in an S3 bucket): https://dagshub.com/gangula-karthik/MLOps-Assignment?filter=dvc
- Frontend: https://ml-ops-assignment.vercel.app/
- Backend Documentation: https://mlops-assignment-734580083911.us-central1.run.app/docs
## Technologies and Libraries Used
The folder structure was set up using cookiecutter:
```
cookiecutter https://github.com/mihail911/e2eml-cookiecutter
```
```
📦 Project Root
├── 📂 assets/ # Stores static files like images, PDFs, and other resources
├── 📂 backend/ # Contains all backend-related code, including APIs, models, and logic
│ ├── backend_car_price.py
│ ├── housepricingmodel_karthik_api.py
│ ├── wheatseeds_gabriel_api.py
│ ├── main.py # Entry point for backend services
│ ├── best_cb_model.pkl # Serialized ML model for car price prediction
│ ├── wheat_classifier_pipeline.pkl # Serialized ML model for wheat classification
│ ├── logs.log # Log file for backend operations
│ ├── requirements.txt # Backend dependencies
│ ├── __init__.py
│ ├── *.pyc # Compiled Python files
│ └── ...
│
├── 📂 conf/ # Configuration files for the project (e.g., environment settings)
│ ├── config.yaml # Example: Configuration for ML models, paths, or API keys
│ ├── logging.conf # Example: Logging configuration
│ └── ...
│
├── 📂 frontend/ # Stores frontend code (if applicable, e.g., React, HTML, CSS)
│ ├── src/ # Source code for frontend
│ │ ├── app/ # All the routes and frontend code
│ │ └── components/ # All the components for the individual sections
│ ├── public/ # Static assets for frontend
│ ├── package.json # Frontend dependencies (if using Node.js)
│
├── 📂 notebooks/ # Jupyter notebooks for data exploration, visualization, and analysis
│
├── .gitignore # Specifies files and folders to ignore in version control
├── Dockerfile # Defines how to containerize the project
├── LICENSE # License file for the project
├── mlflow.db # SQLite database for MLflow experiment tracking
├── mlruns.db # Stores metadata for MLflow runs
├── poetry.lock # Poetry lockfile for dependency management
├── pyproject.toml # Poetry project file defining dependencies and project settings
└── README.md # Main project documentation
```
The DVC remote was configured as follows:
```
# S3 endpoint URL: https://dagshub.com/gangula-karthik/MLOps-Assignment.s3
dvc remote add origin s3://dvc
dvc remote modify origin endpointurl https://dagshub.com/gangula-karthik/MLOps-Assignment.s3
# Credentials go in the local (untracked) config; substitute your own values
dvc remote modify origin --local access_key_id <your-access-key-id>
dvc remote modify origin --local secret_access_key <your-secret-access-key>
```
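The same remote configuration can be scripted so that credentials are read from environment variables instead of being typed on the command line. The following is only a sketch and is not part of the repository: the function names and the environment variable names `DAGSHUB_ACCESS_KEY_ID` and `DAGSHUB_SECRET_ACCESS_KEY` are assumptions made for illustration.

```python
import os
import subprocess

ENDPOINT_URL = "https://dagshub.com/gangula-karthik/MLOps-Assignment.s3"


def dvc_remote_commands(access_key_id: str, secret_access_key: str) -> list:
    """Build the `dvc remote` commands from the setup above as argument lists."""
    return [
        ["dvc", "remote", "add", "origin", "s3://dvc"],
        ["dvc", "remote", "modify", "origin", "endpointurl", ENDPOINT_URL],
        # --local writes to .dvc/config.local, which DVC keeps out of Git.
        ["dvc", "remote", "modify", "origin", "--local",
         "access_key_id", access_key_id],
        ["dvc", "remote", "modify", "origin", "--local",
         "secret_access_key", secret_access_key],
    ]


def configure_remote() -> None:
    """Run the commands, reading credentials from (hypothetical) env variables."""
    key_id = os.environ["DAGSHUB_ACCESS_KEY_ID"]
    secret = os.environ["DAGSHUB_SECRET_ACCESS_KEY"]
    for command in dvc_remote_commands(key_id, secret):
        # check=True stops at the first failing dvc command.
        subprocess.run(command, check=True)
```

Calling `configure_remote()` from the repository root, with both variables exported, would apply the four commands in order.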
Setting up the frontend:
```
cd frontend/
npm i
npm run dev
```
Setting up the backend:
```
poetry install
cd backend
# `poetry run` already executes inside the project's virtual environment
poetry run python main.py
```
**Backend**
- Python
- PyCaret: An automated machine learning library used to streamline the model development process for all three machine learning tasks.
- FastAPI: A modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints.
- Pydantic: A data validation and settings management library using Python type annotations, facilitating data parsing and validation for easier error handling and enforcement of type constraints.
- MLflow: Utilized for tracking experiments, including parameter logging, metrics, and model versioning, to enhance model management and reproducibility.
- DVC: Data Version Control was used to track and manage datasets and their versions, ensuring consistency across all stages of the project.
- Hydra: Employed for managing and configuring the application's environment, making it easier to switch between different setups without changing the codebase.
- Poetry: Used for dependency management and packaging of the project, ensuring that all necessary libraries are installed and maintained correctly.
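
To illustrate how Pydantic enforces type constraints before a request reaches a model, here is a small sketch. The `WheatSeedFeatures` schema, its field names, and the `parse_features` helper are assumptions made for illustration; they are not the schema defined in `wheatseeds_gabriel_api.py`.

```python
from pydantic import BaseModel, Field, ValidationError


class WheatSeedFeatures(BaseModel):
    """Hypothetical request body; field names are illustrative only."""

    area: float = Field(gt=0, description="Kernel area")
    perimeter: float = Field(gt=0, description="Kernel perimeter")
    kernel_length: float = Field(gt=0)
    kernel_width: float = Field(gt=0)


def parse_features(payload: dict):
    """Return (features, None) on success or (None, list of errors) on failure."""
    try:
        return WheatSeedFeatures(**payload), None
    except ValidationError as exc:
        # Each error entry names the offending field in its "loc" tuple.
        return None, exc.errors()
```

When a model like this is declared as an endpoint parameter in FastAPI, the same validation runs automatically and invalid input is rejected with a 422 response.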
**Frontend**
- Next.js: A React framework for building user interfaces, optimized for a smoother developer experience and high-performance applications.
- HeroUI, shadcn/ui, and Tailwind CSS: Used for UI components and styling.
## Individual Contributions
Karthik:
- Worked on housing price prediction
- Set up the shared environment using Poetry
- Set up DVC + AWS S3 on the data folder
- Set up the remote MLflow tracking server on DagsHub
- Integrated the different backends using FastAPI routers
- Deployed the frontend (on Vercel) and the backend (on Google Cloud Run)

Wei Jun:
- Worked on car sales prediction

Gabriel Loh:
- Worked on wheat type prediction