https://github.com/districtdatalabs/navyfcu-ml
Notebooks and data for Machine Learning course.
- Host: GitHub
- URL: https://github.com/districtdatalabs/navyfcu-ml
- Owner: DistrictDataLabs
- License: MIT
- Created: 2018-09-07T13:40:33.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-10-07T17:09:14.000Z (over 7 years ago)
- Last Synced: 2025-01-09T08:27:47.180Z (about 1 year ago)
- Language: HTML
- Size: 15.5 MB
- Stars: 4
- Watchers: 3
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Generalized Machine Learning
This repository contains notebooks, data, and slides for a survey course on generalized machine learning and distributed computing, held from September 14, 2018 to September 28, 2018. During this three-day course, we will cover the following topics:
Day 1:
- ML Review: Generalized ML and Spatial Learning, Bias/Variance Tradeoff, Model Selection Triple
- Regularized Regression: LASSO vs Ridge; ElasticNet and more
- Clustering: Partitive vs Agglomerative Clustering; clustering evaluation methods, visualization
- Classification I: Instance and Inductive Models (kNN, Decision Trees, Ensembles of Trees)
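To give a flavor of the LASSO vs Ridge topic, here is a minimal sketch of how the two penalties shrink coefficients differently. It is not taken from the course notebooks. It assumes the special case of an orthonormal design matrix, where both estimators have a closed form in terms of the ordinary least squares coefficient; the function names, the coefficient values, and the penalty strength `lam` are made up for illustration.

```python
def ridge_shrink(b_ols, lam):
    """Ridge estimate under an orthonormal design: rescale toward zero.

    Assumes the objective 0.5 * ||y - Xb||^2 + 0.5 * lam * ||b||^2.
    """
    return b_ols / (1.0 + lam)


def lasso_shrink(b_ols, lam):
    """Lasso estimate under an orthonormal design: soft-thresholding.

    Assumes the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1.
    """
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    return 0.0  # small coefficients are set exactly to zero


# Hypothetical OLS coefficients and penalty strength (illustration only).
ols_coefficients = [3.0, -1.5, 0.4, -0.2]
lam = 0.5

ridge = [ridge_shrink(b, lam) for b in ols_coefficients]
lasso = [lasso_shrink(b, lam) for b in ols_coefficients]

print("ridge:", ridge)
print("lasso:", lasso)
```

In this sketch Ridge keeps every coefficient nonzero but smaller, while LASSO zeroes out the two coefficients whose magnitude is below `lam`, which is why LASSO is often used for feature selection.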
Day 2:
- Classification II: Parametric Models: SVMs, Bayesian Models, Logistic Regression
- Dimensionality Reduction and Manifolds: PCA, SVD, t-SNE, Isomap
- Neural Networks I: Multi-Layer Perceptrons
- Neural Networks II: Deep Learning and TensorFlow
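As a small illustration of the PCA topic, the sketch below finds the first principal component of a tiny dataset using power iteration on the sample covariance matrix. It is not taken from the course notebooks, and library implementations typically use an SVD instead; the function name, the iteration count, and the data points are assumptions chosen for illustration.

```python
import math


def first_principal_component(points, iterations=100):
    """Approximate the leading eigenvector of the sample covariance matrix."""
    n = len(points)
    dim = len(points[0])

    # Center the data so the covariance is computed around the mean.
    means = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - means[j] for j in range(dim)] for p in points]

    # Sample covariance matrix (dim x dim).
    cov = [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]

    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical 2-D points lying close to the line y = 2x (illustration only).
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
component = first_principal_component(data)
print("first principal component:", component)
```

The returned unit vector points roughly along the direction (1, 2), the axis of greatest variance in this made-up data. Projecting each centered point onto it would reduce the data from two dimensions to one.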
Day 3:
- Introduction to Spark: RDDs and Architecture
- Programming Spark: interactive analysis and distributed jobs
- Using Spark for data analysis: Spark SQL and Spark DataFrames
- Spark for distributed ML: Spark MLlib