https://github.com/hiejulia/ml-pipeline

End to end production ML pipeline

# ML pipeline project
- End to end ML pipeline

- Data ingest
- Data validation
- Data processing
- Model training
- Model analysis and validation
- Model deployment
- Data governance and security
- Beam & Airflow
- Kubeflow Pipelines

# Install
- Python 3.6+
- TensorFlow 2
- TFX pipeline `pip install tfx`
- Airflow `pip install apache-airflow`
- Kubeflow
- Apache Beam `pip install apache-beam`

#### Data Ingestion

#### Data Validation
- `pip install tensorflow-data-validation`
- Data anomalies
- Data schema
- Compare statistics of a new dataset against the previous training dataset
- Missing values, correlation
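
The checks listed above can be illustrated without any dependencies. The following is a minimal sketch in plain Python, not the `tensorflow-data-validation` API: the helper names (`infer_schema`, `missing_fraction`, `find_anomalies`), the `max_missing` threshold, and the sample rows are all hypothetical. It infers a schema from training rows, then flags type mismatches, missing values, and unexpected columns in a new dataset.

```python
from collections import Counter


def infer_schema(rows):
    """Infer a column -> type-name mapping from training rows (a list of dicts).

    None is treated as a missing value and ignored when inferring the type.
    """
    seen = {}
    for row in rows:
        for name, value in row.items():
            if value is not None:
                seen.setdefault(name, Counter())[type(value).__name__] += 1
    # Keep the most common type observed for each column.
    return {name: counts.most_common(1)[0][0] for name, counts in seen.items()}


def missing_fraction(rows, column):
    """Fraction of rows where the column is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def find_anomalies(schema, rows, max_missing=0.1):
    """Compare a new dataset against a schema and return readable anomalies."""
    anomalies = []
    for column, expected in schema.items():
        frac = missing_fraction(rows, column)
        if frac > max_missing:
            anomalies.append(f"{column}: {frac:.0%} missing")
        bad = [row[column] for row in rows
               if row.get(column) is not None
               and type(row[column]).__name__ != expected]
        if bad:
            found = type(bad[0]).__name__
            anomalies.append(f"{column}: expected {expected}, got {found}")
    # Columns present in the new data but unknown to the schema.
    unexpected = {key for row in rows for key in row} - set(schema)
    for column in sorted(unexpected):
        anomalies.append(f"{column}: not in schema")
    return anomalies


# Hypothetical sample data: a previous training set and a new batch.
train = [{"age": 31, "city": "Oslo"}, {"age": 45, "city": "Lima"}]
new = [{"age": "31", "city": None}, {"age": 52, "city": None, "zip": "0150"}]

schema = infer_schema(train)
anomalies = find_anomalies(schema, new)
for line in anomalies:
    print(line)
```

In a real pipeline the schema would be stored with the training run so that every new batch is checked against the same reference.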

#### Data Preprocessing
- `pip install tensorflow-transform`
- Apache Beam
- pandas, numpy
- Normalize feature data
- Scale features
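
Normalizing and scaling can be sketched in plain Python to show the key idea: statistics are fitted on the training data only and then reused unchanged at serving time. This is an illustration under assumed names (`fit_zscore`, `apply_zscore`, `fit_minmax`, `apply_minmax`) and made-up values, not the project's code. `tensorflow-transform` provides comparable transforms such as `tft.scale_to_z_score` and `tft.scale_to_0_1`.

```python
import statistics


def fit_zscore(train_values):
    """Compute the mean and population standard deviation on training data."""
    return statistics.mean(train_values), statistics.pstdev(train_values)


def apply_zscore(values, mean, std):
    """Normalize values with previously fitted statistics."""
    if std == 0:
        # A constant feature carries no signal; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def fit_minmax(train_values):
    """Record the training range used for min-max scaling."""
    return min(train_values), max(train_values)


def apply_minmax(values, lo, hi):
    """Scale values so the training range maps to [0, 1]."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column from the training set.
train = [2.0, 4.0, 6.0, 8.0]

mean, std = fit_zscore(train)
z_train = apply_zscore(train, mean, std)

lo, hi = fit_minmax(train)
scaled_train = apply_minmax(train, lo, hi)

# At serving time the SAME fitted statistics are reused, so a value outside
# the training range can fall outside [0, 1].
scaled_serving = apply_minmax([10.0], lo, hi)
```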

#### Model Training - Analysis - Validation
- `pip install tensorflow-model-analysis`
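
As a rough illustration of model analysis and validation, the sketch below computes accuracy per data slice and applies a simple gate that compares a candidate model with a baseline. It is plain Python rather than the `tensorflow-model-analysis` API, and the function names, thresholds, and example records are assumptions made for the example.

```python
from collections import defaultdict


def accuracy_by_slice(examples, slice_key):
    """Compute accuracy separately for each value of slice_key.

    Each example is a dict with 'label', 'prediction' and feature keys.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in examples:
        slice_value = example[slice_key]
        total[slice_value] += 1
        correct[slice_value] += int(example["label"] == example["prediction"])
    return {s: correct[s] / total[s] for s in total}


def passes_validation(candidate, baseline, min_accuracy=0.7, max_drop=0.05):
    """Accept the candidate only if every slice meets an absolute threshold
    and does not regress by more than max_drop against the baseline."""
    for slice_value, accuracy in candidate.items():
        if accuracy < min_accuracy:
            return False
        if slice_value in baseline and baseline[slice_value] - accuracy > max_drop:
            return False
    return True


# Hypothetical evaluation records, sliced by a 'country' feature.
examples = [
    {"country": "US", "label": 1, "prediction": 1},
    {"country": "US", "label": 0, "prediction": 0},
    {"country": "DE", "label": 1, "prediction": 0},
    {"country": "DE", "label": 1, "prediction": 1},
]

by_slice = accuracy_by_slice(examples, "country")
blessed = passes_validation(by_slice, baseline={"US": 1.0, "DE": 0.5})
```

Slicing matters because an acceptable overall metric can hide a slice that performs poorly, as the `DE` slice does here.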

#### Model Deploy
- API
- TensorFlow Serving
- Deploy to cloud
- Model optimization for deployment
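
Once a model is exported and loaded by TensorFlow Serving, clients can call its REST predict endpoint. The sketch below builds such a request with the standard library only. The host, the model name `my_model`, and the input shape are placeholders, and port 8501 is assumed because it is TensorFlow Serving's default REST port.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build a POST request for TensorFlow Serving's REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request, timeout=5.0):
    """Send the request and return the 'predictions' field of the response.

    Requires a running TensorFlow Serving instance; not called below.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["predictions"]


# Placeholder host, model name and input row.
request = build_predict_request("localhost", "my_model", [[1.0, 2.0, 3.0]])
print(request.full_url)
```

Keeping request construction separate from the network call makes the payload easy to inspect and test without a live server.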

#### Kubernetes for deployment

#### Beam & Airflow
#### Kubeflow

#### Model API
- API endpoint for serving the ML model
- Persistent datastore (Redis) for storing model predictions
- Kafka integration for pushing model prediction results to a monitoring topic
- Build the Docker image
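
The flow described above (predict, persist, publish) could be wired together roughly as follows. This is a dependency-free sketch: `InMemoryStore` and `InMemoryPublisher` are hypothetical stand-ins for a Redis client and a Kafka producer, and the topic name `model-monitoring`, the key format, and `toy_model` are invented for illustration.

```python
import json
import uuid


class InMemoryStore:
    """Stand-in for Redis: a simple key/value store."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)


class InMemoryPublisher:
    """Stand-in for a Kafka producer: records (topic, message) pairs."""

    def __init__(self):
        self.messages = []

    def send(self, topic, value):
        self.messages.append((topic, value))


class PredictionService:
    """Runs the model, persists each prediction and publishes it for monitoring."""

    def __init__(self, model, store, publisher, topic="model-monitoring"):
        self.model = model
        self.store = store
        self.publisher = publisher
        self.topic = topic

    def predict(self, features):
        record = {
            "id": uuid.uuid4().hex,
            "features": features,
            "prediction": self.model(features),
        }
        payload = json.dumps(record)
        # Persist the result so it can be looked up later by id.
        self.store.set(f"prediction:{record['id']}", payload)
        # Push the same payload to the monitoring topic.
        self.publisher.send(self.topic, payload)
        return record


def toy_model(features):
    """Placeholder model: returns the mean of the input features."""
    return sum(features) / len(features)


store = InMemoryStore()
publisher = InMemoryPublisher()
service = PredictionService(toy_model, store, publisher)
record = service.predict([1.0, 2.0, 3.0])
```

Passing the store and publisher into the service keeps the endpoint logic independent of the backing systems, so real clients can replace the in-memory versions without changing `PredictionService`.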