https://github.com/anilkumarteegala/wqu-ds-unit-2
This repo contains all the material related to WorldQuant University's Data Science Summer 2020 Session, Unit 2: Machine Learning and Statistical Analysis.
data-science machine-learning statistical-analysis wqu
- Host: GitHub
- URL: https://github.com/anilkumarteegala/wqu-ds-unit-2
- Owner: AnilKumarTeegala
- Created: 2020-09-08T02:34:45.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-09-08T03:38:06.000Z (over 5 years ago)
- Last Synced: 2025-01-12T23:25:59.180Z (over 1 year ago)
- Topics: data-science, machine-learning, statistical-analysis, wqu
- Homepage:
- Size: 5.86 KB
- Stars: 31
- Watchers: 3
- Forks: 33
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WQU-DS-Unit-2
This repo contains all the material related to WorldQuant University's Data Science Summer 2020 Session, Unit 2: Machine Learning and Statistical Analysis.
## Syllabus
1. Introduction to machine learning and Scikit-learn API
2. Regression, classification, & model selection (miniproject: ml)
3. Feature engineering
4. NLP and dimension reduction (miniproject: nlp)
5. KNeighbors, clustering and ensemble models
6. Support vector machines
7. Time series analysis and anomaly detection
8. Clustering
## Chapter Content Videos
### Ch1. Introduction to machine learning and Scikit-learn API
- [Introduction to Machine Learning](https://youtu.be/9J6FNvil6Gw)
- [1.2.1 Intro to Scikit-learn](https://youtu.be/ecryKgFv5MA)
- [1.2.2 Predictors](https://youtu.be/AHALKrsFdVw)
- [1.2.3 Transformers and Pipeline](https://youtu.be/zVU4131QPsw)
- [1.2.4 Feature Unions](https://youtu.be/_gOyCmdumps)
- [1.2.5 Custom Transformers](https://youtu.be/I0YbI9x51kU)
- [1.2.6 Custom Predictors](https://youtu.be/bxf2JHcH4UI)
- [1.2.7 Exercise Distance Transformer](https://youtu.be/0zYnOu1axGQ)
- [1.2.8 Exercise Majority Classifier](https://youtu.be/SGLjDeJqDb0)
- [1.3.1 Persisting Your Model](https://youtu.be/V9Pm_EZvABA)
- [1.3.2 Common Mistakes](https://youtu.be/YZQNIUIAi3A)
### Ch2. Regression, classification, & model selection
- [2.1.1 Regression Metrics](https://youtu.be/wlMrMvphMuw)
- [2.2.1 Linear Regression Intro](https://youtu.be/L1h6cIG8XcA)
- [2.2.2 Gradient Descent and Huber Loss](https://youtu.be/j0d4RogPiXM)
- [2.2.3 Multivariate Regression](https://youtu.be/9YZbXipwAQg)
- [2.2.4 Feature Importance](https://youtu.be/x5XB1ynjCGI)
- [2.3.1 Classification Metrics](https://youtu.be/SMmbzWn8yGI)
- [2.3.2 Probabilistic Models and Metrics](https://youtu.be/4MgWh8oD-hQ)
- [2.3.3 Logistic Regression](https://youtu.be/oqj-0_4WKq4)
- [2.3.4 Multiclassification](https://youtu.be/gOVP8c1Cmmg)
- [2.4.1 Model Selection](https://youtu.be/lEBStVJXXpM)
- [2.4.2 Intro to Decision Trees](https://youtu.be/D6tNPIXCI1o)
- [2.4.3 Underfitting and Overfitting](https://youtu.be/ZhhvoUAhA80)
- [2.4.4 GridSearchCV](https://youtu.be/cWqE82yQi1Y)
- [2.4.5 Comparing Two Models](https://youtu.be/rZ8XGPm-i1o)
- [2.5.1 Imputation](https://youtu.be/vJ3WBCW2sas)
- [2.5.2 Categorical Data](https://youtu.be/qOAoV8HK8e0)
- [2.6.1 GridSearchCV and Pipelines 1](https://youtu.be/c8ZJMXM6vvo)
- [2.6.2 GridSearchCV and Pipelines 2](https://youtu.be/mqnG1yANXvo)
- [2.6.3 RandomizedSearchCV](https://youtu.be/BXzf0gJuV4w)
### Ch3 Feature Engineering & KNN
- [3.1.1 Feature Engineering and Extraction](https://youtu.be/kHDRaKe2B5A)
- [3.1.2 Feature Transformation](https://youtu.be/YAJsJNS3DAA)
- [3.1.3 Curse of Dimensionality](https://youtu.be/HvG45qVJM84)
- [3.1.4 Regularization](https://youtu.be/ArJqhJ415d4)
- [3.1.5 Multicollinearity and PCA](https://youtu.be/8eBKta-D334)
- [3.1.6 Ensemble Models](https://youtu.be/oUOi5T1b_iU)
- [3.2.1 Bias and Variance](https://youtu.be/-ZLg6Zp9HHg)
- [3.2.2 Learning Curves](https://youtu.be/KB4fPj68Rbo)
- [3.3.1 Intro KNN](https://youtu.be/EV6xlHdTaEY)
- [3.3.2 KNN Bias and Variance](https://youtu.be/AXbhiyWJZdw)
- [3.3.3 KNN Time Complexity](https://youtu.be/XleyueB7jXU)
- [3.3.4 KD Trees and Weights](https://youtu.be/UV1WeqUFPE8)
- [3.4.1 Intro to NLP](https://youtu.be/w3HtykbMcqk)
- [3.4.2 Spacy](https://youtu.be/7nojkNN0EME)
- [3.4.3 Obtaining a Corpus](https://youtu.be/4mYBqqbW408)
- [3.4.4 Bag of Words Model](https://youtu.be/7bKNknkmIJI)
- [3.4.5 Hashing Vectorizer](https://youtu.be/_wPHgQhiMVY)
- [3.4.6 TF-IDF](https://youtu.be/kv9cE-uOnis)
- [3.4.7 Improving Signal](https://youtu.be/UR4JcVOBiAI)
- [3.4.8 N-grams and Similarity](https://youtu.be/tuwCNPEFRDs)
- [3.4.9 Word Usage Classifier](https://youtu.be/FNU4dNeKrHo)
- [3.4.10 Exercise I](https://youtu.be/kAZuP3efPbg)
- [3.4.11 Exercise II](https://youtu.be/MkpFJkSrC2o)
- [3.4.12 Exercise III and IV](https://youtu.be/Uw1_iZgihC0)
### Ch4 Decision Trees & Gradient Boosting
- [4.1.1 Intro to Decision Trees](https://youtu.be/4cIr8W9tXD8)
- [4.1.2 Tree Error Metrics](https://youtu.be/Pv8bnN3E4xA)
- [4.1.3 Trees for Regression](https://youtu.be/LvUazyMSRFM)
- [4.1.4 Training Trees and Hyperparameters](https://youtu.be/nC-pHG_hgdY)
- [4.1.5 Geometric Interpretation and Time Complexity](https://youtu.be/l23BnV-4Xwc)
- [4.1.6 Time Complexity Continued](https://youtu.be/oD_INvReM_I)
- [4.1.7 Random Forests](https://youtu.be/JUzss0-pvz8)
- [4.1.8 Extreme Random Forests](https://youtu.be/GLjopN8Lw94)
- [4.1.9 Gradient Boosting Trees I](https://youtu.be/KJtV7fTrFH4)
- [4.1.10 Gradient Boosting Trees II](https://youtu.be/E2R4D2Gc4x4)
- [4.1.11 Feature Importance](https://youtu.be/WDLvgw-Znmg)
- [4.1.12 Exercises](https://youtu.be/7q3pk2VcDFQ)
### Ch5 SVM & Clustering
- [5.1.1 Intro to SVM](https://youtu.be/GxSSq1B-BHg)
- [5.1.2 Largest Margin Classifier](https://youtu.be/EK-69gK8y9E)
- [5.1.3 Soft Margin Classifier](https://youtu.be/QT-Flust1Hc)
- [5.1.4 SVM Kernels](https://youtu.be/VQWkkPV1Q_Q)
- [5.1.5 SVM vs Logistic Regression](https://youtu.be/4G7uEZCAr2U)
- [5.1.6 SVM Regression](https://youtu.be/xp8TJgzgWp0)
- [5.1.7 SVM Lagrangian Dual](https://youtu.be/defq3yJ8cyw)
- [5.1.8 Kernel Trick](https://youtu.be/YcRcxm4LSqg)
- [5.1.9 SVM Time Complexity and Multiclass](https://youtu.be/FR5D4A3Nn0c)
- [5.1.10 SVM Tuning Kernels Exercise Part I](https://youtu.be/MggSYvtMcLQ)
- [5.1.11 SVM Tuning Kernels Exercise Part II](https://youtu.be/tvWb6XObsFg)
- [5.1.12 SVM Kernel Approximations](https://youtu.be/A3HPwA0IWIM)
- [5.1.13 SVM Online Learning](https://youtu.be/bgzt8UL8x4Y)
- [5.1.14 SVM Online Learning Pipeline](https://youtu.be/E1RoFHjJRDc)
- [5.2.1 Intro to Clustering](https://youtu.be/HktZtB7Te0c)
- [5.2.2 Metrics for Clustering](https://youtu.be/kWsylUt8LxA)
- [5.2.3 KMeans Clustering](https://youtu.be/_OKUAiC9FLY)
- [5.2.4 Elbow Plots](https://youtu.be/J_jH7cXGUSQ)
- [5.2.5 Gaussian Mixture Models](https://youtu.be/WwDiKfHW52U)
- [5.2.6 Choosing Cluster Based on Silhouette](https://youtu.be/Gj4HHh4dDEk)
- [5.2.7 GMM Choosing Number of Components](https://youtu.be/QO3F4lBs4m4)
### Ch6 Time Series Analysis & Dimensionality Reduction
- [6.1.1 Intro to Time Series](https://youtu.be/BvDcWoLnWFk)
- [6.1.2 Crossvalidation in Time Series](https://youtu.be/ZOJAof-3YYA)
- [6.1.3 Stationary Signal](https://youtu.be/WytJdsQBeos)
- [6.1.4 Modeling Drift](https://youtu.be/53FLb9usJBk)
- [6.1.5 Fourier Transforms Part I](https://youtu.be/k-cF7LB7p4w)
- [6.1.6 Fourier Transforms Part II](https://youtu.be/7KSdhV2elq8)
- [6.1.7 Fourier Components in our Model](https://youtu.be/ek3vt4xLL-k)
- [6.1.8 Modeling Noise](https://youtu.be/IBBcCxCPv1w)
- [6.1.9 Moving Statistics](https://youtu.be/8LUt3PBcCFk)
- [6.1.10 Full Model](https://youtu.be/oIIE1Fh8eNc)
- [6.1.11 ARMA and ARIMA](https://youtu.be/UPox0nn4IvE)
- [6.1.12 AR Example](https://youtu.be/GaQ70ZHo69s)
- [6.2.1 Intro to Dimension Reduction](https://youtu.be/b5BihaS90q0)
- [6.2.2 Math of Projections](https://youtu.be/CZjSrCpkxxk)
- [6.2.3 PCA](https://youtu.be/dCYH6MVyfSA)
- [6.2.4 PCA in Scikit Learn](https://youtu.be/lSxcJbaRvkA)
- [6.2.5 PCA Implementation Details](https://youtu.be/QQl4h1gyCwc)
- [6.2.6 Choosing the Number of Components](https://youtu.be/4azFI2QzyXU)
- [6.2.7 Truncated SVD](https://youtu.be/nfwInvCoym0)
- [6.2.8 NMF](https://youtu.be/ELLDyWiiSXU)
- [6.2.9 Using PCA with Supervised ML](https://youtu.be/u8tUlG28CH0)
- [6.2.10 PCA for Visualization](https://youtu.be/MRwKUI5ulio)
- [6.2.11 NMF Exercise Part I](https://youtu.be/oo1sDDHE-DM)
- [6.2.12 NMF Exercise Part II](https://youtu.be/PQt9XPS_bT0)
- [6.2.13 Variants of PCA](https://youtu.be/0KdHFPXsonE)
### Ch7 Anomaly Detection
- [7.1.1 Intro to Anomaly Detection](https://youtu.be/KuFUQ7wWhsY)
- [7.1.2 One class SVM](https://youtu.be/2l0TD7gCzvQ)
- [7.1.3 Isolation Forest](https://youtu.be/xIAwOj_xh9s)
- [7.1.4 Comparison Between One-class SVM and Isolation Forest](https://youtu.be/VIyK4gLB2hg)
- [7.1.5 Intro to Case Study](https://youtu.be/TQk9uSFMo6w)
- [7.1.6 Initial Baseline Model Part I](https://youtu.be/qrJTbKyv1eA)
- [7.1.7 Initial Baseline Model Part II](https://youtu.be/HMRoWtI6eMo)
- [7.1.8 Full Baseline Model](https://youtu.be/gTyO2ldQfWc)
- [7.1.9 Z-score Detection](https://youtu.be/d2YIby-isCE)
- [7.1.10 Rolling Z-score Detection](https://youtu.be/cQ2YIZmeQKg)
- [7.1.11 Using External Features Initial Model](https://youtu.be/_SvveHx_Nfk)
- [7.1.12 Using External Features Tuning the Model](https://youtu.be/2Z5SQsx4LFE)
- [7.1.13 Packaging the Time Series Anomaly Detector](https://youtu.be/hTWHZTTbmJk)
### Ch8 Model Deployment
- [8.1.1 Model Considerations](https://youtu.be/SFFFnqugdOQ)
- [8.1.2 Model Development](https://youtu.be/DrClE-USVDg)
- [8.1.3 Flask App Local Development](https://youtu.be/chRA6Lngw-A)
- [8.1.4 GET requests](https://youtu.be/v1yYLbB_uUE)
- [8.1.5 Making GET Requests with Model](https://youtu.be/q9DgF7mJbrc)
- [8.1.6 Using our Model with Twitter Web API](https://youtu.be/AFwGAeBDeEk)
- [8.1.7 POST Requests and Flask Templates](https://youtu.be/qbi3aT0KEsA)
- [8.1.8 Preparing for Deployment to the Web](https://youtu.be/zqbIVwnA0go)
- [8.1.9 Deploying our App to the Web with Heroku](https://youtu.be/ALlhG0pjx1w)
- [8.2.1 Rethinking Model Tuning](https://youtu.be/TPcytvHTQoA)
- [8.2.2 Intro to Bayes Theorem](https://youtu.be/bzsLD59kF2g)
- [8.2.3 Bayesian Inference](https://youtu.be/i6EyxNl_mdk)
- [8.2.4 Bayesian Optimization](https://youtu.be/GIQL38tkyRs)
- [8.3.1 End of Course Material](https://youtu.be/htdGrsIMZkg)
### Office Hours
- [LIVE Streaming Office Hours](https://www.youtube.com/channel/UCW5qH1I2RMA0CnJ0uo7OTvQ/live)
- [OFFICE HOURS PLAYLIST](https://www.youtube.com/playlist?list=PLeDYvCW3J3jmrHB7ESo8hZ5XqlYdB3SnV)