Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/DataTalksClub/machine-learning-zoomcamp
Learn ML engineering for free in 4 months!
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/DataTalksClub/machine-learning-zoomcamp
- Owner: DataTalksClub
- Created: 2020-04-17T04:29:23.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-10-26T09:33:50.000Z (about 2 months ago)
- Last Synced: 2024-10-29T11:22:35.831Z (about 2 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 11.1 MB
- Stars: 9,517
- Watchers: 160
- Forks: 2,243
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
Awesome Lists containing this project
- AiTreasureBox - DataTalksClub/machine-learning-zoomcamp: The code from the Machine Learning Bookcamp book and a free course based on the book (Repos)
README
# Machine Learning Zoomcamp
* Register at [DataTalks.Club](https://DataTalks.Club) and join the `#course-ml-zoomcamp` channel
* [Course telegram channel](https://t.me/mlzoomcamp)
* [Course playlist](https://www.youtube.com/playlist?list=PL3MmuxUbc_hIhxl5Ji8t4O6lPAOpHaCLR)

### Syllabus
- [Introduction to Machine Learning](#1-introduction-to-machine-learning)
- [Machine Learning for Regression](#2-machine-learning-for-regression)
- [Machine Learning for Classification](#3-machine-learning-for-classification)
- [Evaluation Metrics for Classification](#4-evaluation-metrics-for-classification)
- [Deploying Machine Learning Models](#5-deploying-machine-learning-models)
- [Decision Trees and Ensemble Learning](#6-decision-trees-and-ensemble-learning)
- [Neural Networks and Deep Learning](#8-neural-networks-and-deep-learning)
- [Serverless Deep Learning](#9-serverless-deep-learning)
- [Kubernetes and TensorFlow Serving](#10-kubernetes-and-tensorflow-serving)

## Taking the course
### 2024 cohort
We start the course again in September 2024
* 16 September, 17:00 Berlin time
* [Sign up here](https://airtable.com/shryxwLd0COOEaqXo)
* Register at [DataTalks.Club](https://DataTalks.Club/slack.html) and join the `#course-ml-zoomcamp` channel
* Join the [course telegram channel](https://t.me/mlzoomcamp)
* Subscribe to the [public google calendar](https://calendar.google.com/calendar/?cid=cGtjZ2tkbGc1OG9yb2lxa2Vwc2g4YXMzMmNAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ) (subscribing works from desktop only)
* [Tweet about it](https://ctt.ac/XZ6b9)
* If you have questions, check [FAQ](https://docs.google.com/document/d/1LpPanc33QJJ6BSsyxVg-pWNMplal84TdZtq10naIhD8/edit)
* All the materials specific to the 2024 cohort will be in [the 2024 cohort folder](./cohorts/2024/)

### Self-paced mode
You can take the course at your own pace. All the materials are freely available, and you can start learning at any time.
To get the most out of this course, we recommend the following:
* Register at [DataTalks.Club](https://DataTalks.Club) and join the `#course-ml-zoomcamp` channel
* For each module, watch the videos and work through the code
* If you have any questions, ask them in the `#course-ml-zoomcamp` channel in Slack
* Do the homework. Solutions are available, but we advise attempting the homework yourself first and checking the solutions only afterwards
* Do at least one project; two is better. This is the only way to make sure you're really learning. If you need feedback, use the `#course-ml-zoomcamp` channel

Of course, you can also take each module independently.
### Prerequisites
* Prior programming experience (at least 1 year)
* Being comfortable with the command line
* No prior exposure to machine learning is required

Nice to have but not mandatory:
* Python (but you can learn it during the course)
* Prior exposure to linear algebra will be helpful (e.g. you studied it in college but forgot)

## Asking questions
The best way to get support is to use [DataTalks.Club's Slack](https://datatalks.club/slack.html). Join the [`#course-ml-zoomcamp`](https://app.slack.com/client/T01ATQK62F8/C0288NJ5XSA) channel.
To make discussions in Slack more organized:
* Follow [these recommendations](asking-questions.md) when asking for help
* Read the [DataTalks.Club community guidelines](https://datatalks.club/slack/guidelines.html)

> We encourage [Learning in Public](learning-in-public.md)
## 1. [Introduction to Machine Learning](01-intro/)
- 1.1 [Introduction to Machine Learning](01-intro/01-what-is-ml.md)
- 1.2 [ML vs Rule-Based Systems](01-intro/02-ml-vs-rules.md)
- 1.3 [Supervised Machine Learning](01-intro/03-supervised-ml.md)
- 1.4 [CRISP-DM](01-intro/04-crisp-dm.md)
- 1.5 [Model Selection Process](01-intro/05-model-selection.md)
- 1.6 [Setting up the Environment](01-intro/06-environment.md)
- 1.7 [Introduction to NumPy](01-intro/07-numpy.md)
- 1.8 [Linear Algebra Refresher](01-intro/08-linear-algebra.md)
- 1.9 [Introduction to Pandas](01-intro/09-pandas.md)
- 1.10 [Summary](01-intro/10-summary.md)
- 1.11 [Homework](01-intro/homework.md)

## 2. [Machine Learning for Regression](02-regression/)
- 2.1 [Car price prediction project](02-regression/01-car-price-intro.md)
- 2.2 [Data preparation](02-regression/02-data-preparation.md)
- 2.3 [Exploratory data analysis](02-regression/03-eda.md)
- 2.4 [Setting up the validation framework](02-regression/04-validation-framework.md)
- 2.5 [Linear regression](02-regression/05-linear-regression-simple.md)
- 2.6 [Linear regression: vector form](02-regression/06-linear-regression-vector.md)
- 2.7 [Training linear regression: Normal equation](02-regression/07-linear-regression-training.md)
- 2.8 [Baseline model for car price prediction project](02-regression/08-baseline-model.md)
- 2.9 [Root mean squared error](02-regression/09-rmse.md)
- 2.10 [Using RMSE on validation data](02-regression/10-car-price-validation.md)
- 2.11 [Feature engineering](02-regression/11-feature-engineering.md)
- 2.12 [Categorical variables](02-regression/12-categorical-variables.md)
- 2.13 [Regularization](02-regression/13-regularization.md)
- 2.14 [Tuning the model](02-regression/14-tuning-model.md)
- 2.15 [Using the model](02-regression/15-using-model.md)
- 2.16 [Car price prediction project summary](02-regression/16-summary.md)
- 2.17 [Explore more](02-regression/17-explore-more.md)
- 2.18 [Homework](02-regression/homework.md)

## 3. [Machine Learning for Classification](03-classification/)
- 3.1 [Churn prediction project](03-classification/01-churn-project.md)
- 3.2 [Data preparation](03-classification/02-data-preparation.md)
- 3.3 [Setting up the validation framework](03-classification/03-validation.md)
- 3.4 [EDA](03-classification/04-eda.md)
- 3.5 [Feature importance: Churn rate and risk ratio](03-classification/05-risk.md)
- 3.6 [Feature importance: Mutual information](03-classification/06-mutual-info.md)
- 3.7 [Feature importance: Correlation](03-classification/07-correlation.md)
- 3.8 [One-hot encoding](03-classification/08-ohe.md)
- 3.9 [Logistic regression](03-classification/09-logistic-regression.md)
- 3.10 [Training logistic regression with Scikit-Learn](03-classification/10-training-log-reg.md)
- 3.11 [Model interpretation](03-classification/11-log-reg-interpretation.md)
- 3.12 [Using the model](03-classification/12-using-log-reg.md)
- 3.13 [Summary](03-classification/13-summary.md)
- 3.14 [Explore more](03-classification/14-explore-more.md)
- 3.15 [Homework](03-classification/homework.md)

## 4. [Evaluation Metrics for Classification](04-evaluation/)
- 4.1 [Evaluation metrics: session overview](04-evaluation/01-overview.md)
- 4.2 [Accuracy and dummy model](04-evaluation/02-accuracy.md)
- 4.3 [Confusion table](04-evaluation/03-confusion-table.md)
- 4.4 [Precision and Recall](04-evaluation/04-precision-recall.md)
- 4.5 [ROC Curves](04-evaluation/05-roc.md)
- 4.6 [ROC AUC](04-evaluation/06-auc.md)
- 4.7 [Cross-Validation](04-evaluation/07-cross-validation.md)
- 4.8 [Summary](04-evaluation/08-summary.md)
- 4.9 [Explore more](04-evaluation/09-explore-more.md)
- 4.10 [Homework](04-evaluation/homework.md)

## 5. [Deploying Machine Learning Models](05-deployment/)
- 5.1 [Intro / Session overview](05-deployment/01-intro.md)
- 5.2 [Saving and loading the model](05-deployment/02-pickle.md)
- 5.3 [Web services: introduction to Flask](05-deployment/03-flask-intro.md)
- 5.4 [Serving the churn model with Flask](05-deployment/04-flask-deployment.md)
- 5.5 [Python virtual environment: Pipenv](05-deployment/05-pipenv.md)
- 5.6 [Environment management: Docker](05-deployment/06-docker.md)
- 5.7 [Deployment to the cloud: AWS Elastic Beanstalk (optional)](05-deployment/07-aws-eb.md)
- 5.8 [Summary](05-deployment/08-summary.md)
- 5.9 [Explore more](05-deployment/09-explore-more.md)
- 5.10 [Homework](05-deployment/homework.md)

## 6. [Decision Trees and Ensemble Learning](06-trees/)
- 6.1 [Credit risk scoring project](06-trees/01-credit-risk.md)
- 6.2 [Data cleaning and preparation](06-trees/02-data-prep.md)
- 6.3 [Decision trees](06-trees/03-decision-trees.md)
- 6.4 [Decision tree learning algorithm](06-trees/04-decision-tree-learning.md)
- 6.5 [Decision trees parameter tuning](06-trees/05-decision-tree-tuning.md)
- 6.6 [Ensemble learning and random forest](06-trees/06-random-forest.md)
- 6.7 [Gradient boosting and XGBoost](06-trees/07-boosting.md)
- 6.8 [XGBoost parameter tuning](06-trees/08-xgb-tuning.md)
- 6.9 [Selecting the best model](06-trees/09-final-model.md)
- 6.10 [Summary](06-trees/10-summary.md)
- 6.11 [Explore more](06-trees/11-explore-more.md)
- 6.12 [Homework](06-trees/homework.md)

## [Midterm Project](projects/#midterm-project)
Putting everything we've learned so far in practice!
## 8. [Neural Networks and Deep Learning](08-deep-learning/)
- 8.1 [Fashion classification](08-deep-learning/01-fashion-classification.md)
- 8.1b [Setting up the Environment on Saturn Cloud](08-deep-learning/01b-saturn-cloud.md)
- 8.2 [TensorFlow and Keras](08-deep-learning/02-tensorflow-keras.md)
- 8.3 [Pre-trained convolutional neural networks](08-deep-learning/03-pretrained-models.md)
- 8.4 [Convolutional neural networks](08-deep-learning/04-conv-neural-nets.md)
- 8.5 [Transfer learning](08-deep-learning/05-transfer-learning.md)
- 8.6 [Adjusting the learning rate](08-deep-learning/06-learning-rate.md)
- 8.7 [Checkpointing](08-deep-learning/07-checkpointing.md)
- 8.8 [Adding more layers](08-deep-learning/08-more-layers.md)
- 8.9 [Regularization and dropout](08-deep-learning/09-dropout.md)
- 8.10 [Data augmentation](08-deep-learning/10-augmentation.md)
- 8.11 [Training a larger model](08-deep-learning/11-large-model.md)
- 8.12 [Using the model](08-deep-learning/12-using-model.md)
- 8.13 [Summary](08-deep-learning/13-summary.md)
- 8.14 [Explore more](08-deep-learning/14-explore-more.md)
- 8.15 [Homework](08-deep-learning/homework.md)

For the deep learning part, we need to use a GPU. ML Zoomcamp students can use
[Saturn Cloud](https://bit.ly/saturn-mlzoomcamp) and get an extra 150 GPU hours there.
Message support and say "I'm enrolled in ML Zoomcamp" to get an upgrade.

## 9. [Serverless Deep Learning](09-serverless/)
- 9.1 [Introduction to Serverless](09-serverless/01-intro.md)
- 9.2 [AWS Lambda](09-serverless/02-aws-lambda.md)
- 9.3 [TensorFlow Lite](09-serverless/03-tensorflow-lite.md)
- 9.4 [Preparing the code for Lambda](09-serverless/04-preparing-code.md)
- 9.5 [Preparing a Docker image](09-serverless/05-docker-image.md)
- 9.6 [Creating the lambda function](09-serverless/06-creating-lambda.md)
- 9.7 [API Gateway: exposing the lambda function](09-serverless/07-api-gateway.md)
- 9.8 [Summary](09-serverless/08-summary.md)
- 9.9 [Explore more](09-serverless/09-explore-more.md)
- 9.10 [Homework](09-serverless/homework.md)

## 10. [Kubernetes and TensorFlow Serving](10-kubernetes/)
- 10.1 [Overview](10-kubernetes/01-overview.md)
- 10.2 [TensorFlow Serving](10-kubernetes/02-tensorflow-serving.md)
- 10.3 [Creating a pre-processing service](10-kubernetes/03-preprocessing.md)
- 10.4 [Running everything locally with Docker-compose](10-kubernetes/04-docker-compose.md)
- 10.5 [Introduction to Kubernetes](10-kubernetes/05-kubernetes-intro.md)
- 10.6 [Deploying a simple service to Kubernetes](10-kubernetes/06-kubernetes-simple-service.md)
- 10.7 [Deploying TensorFlow models to Kubernetes](10-kubernetes/07-kubernetes-tf-serving.md)
- 10.8 [Deploying to EKS](10-kubernetes/08-eks.md)
- 10.9 [Summary](10-kubernetes/09-summary.md)
- 10.10 [Explore more](10-kubernetes/10-explore-more.md)
- 10.11 [Homework](10-kubernetes/homework.md)

## 11. [KServe](11-kserve/) (optional)
- 11.1 [Overview](11-kserve/01-overview.md)
- 11.2 [Running KServe locally](11-kserve/02-kserve-local.md)
- 11.3 [Deploying a Scikit-Learn model with KServe](11-kserve/03-kserve-sklearn.md)
- 11.4 [Deploying custom Scikit-Learn images with KServe](11-kserve/04-kserve-custom-image.md)
- 11.5 [Serving TensorFlow models with KServe](11-kserve/05-tensorflow-kserve.md)
- 11.6 [KServe transformers](11-kserve/06-kserve-transformers.md)
- 11.7 [Deploying with KServe and EKS](11-kserve/07-kserve-eks.md)
- 11.8 [Summary](11-kserve/08-summary.md)
- 11.9 [Explore more](11-kserve/09-explore-more.md)

## [Capstone Project 1](projects/#capstone-1)
Putting everything we've learned so far in practice one more time!
## [Article](article/)
Writing an article about something not covered in the course.
## [Capstone Project 2](projects/#capstone-2) (optional)
For those who love projects!
## Supporters and partners
Thanks to the course sponsors for making it possible to run this course.
Do you want to support our course and our community? Please reach out to [[email protected]]([email protected])