{"id":14958267,"url":"https://github.com/empathy87/the-elements-of-statistical-learning-python-notebooks","last_synced_at":"2025-04-13T00:43:54.366Z","repository":{"id":37431179,"uuid":"167110829","full_name":"empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks","owner":"empathy87","description":"A series of Python Jupyter notebooks that help you better understand \"The Elements of Statistical Learning\" book","archived":false,"fork":false,"pushed_at":"2021-07-18T17:48:43.000Z","size":60437,"stargazers_count":878,"open_issues_count":3,"forks_count":272,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-04-13T00:43:45.905Z","etag":null,"topics":["data-analysis","data-science","machine-learning","python","sklearn","statistical-learning","tensorflow","tutorials"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/empathy87.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-01-23T03:30:18.000Z","updated_at":"2025-04-08T05:53:59.000Z","dependencies_parsed_at":"2022-08-08T20:15:51.157Z","dependency_job_id":null,"html_url":"https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/empathy87%2FThe-Elements-of-Statistical-Learning-Python-Notebooks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/empathy87%2FThe-Elements-of-Statistical-Learning-Python-Notebooks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/empathy87%2FThe-Elements-of-Statistical-Learning-Python-Notebooks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/empathy87%2FThe-Elements-of-Statistical-Learning-Python-Notebooks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/empathy87","download_url":"https://codeload.github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248650419,"owners_count":21139672,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","machine-learning","python","sklearn","statistical-learning","tensorflow","tutorials"],"created_at":"2024-09-24T13:16:38.782Z","updated_at":"2025-04-13T00:43:54.347Z","avatar_url":"https://github.com/empathy87.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# \"The Elements of Statistical Learning\" Notebooks\nReproducing examples from the \"The Elements of Statistical Learning\" by Trevor Hastie, Robert Tibshirani and Jerome Friedman with Python and its popular libraries: \n**numpy**, **math**, **scipy**, **sklearn**, **pandas**, **tensorflow**, **statsmodels**, **sympy**, **catboost**, **pyearth**, **mlxtend**, **cvxpy**. Almost all plotting is done using **matplotlib**, sometimes using **seaborn**. 
\n\n## Examples\nThe documented Jupyter Notebooks are in the [examples](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/tree/master/examples) folder:\n### [examples/Mixture.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Mixture.ipynb)\n\nClassifying the points from a mixture of \"gaussians\" using linear regression, nearest-neighbor, logistic regression with natural cubic splines basis expansion, neural networks, support vector machines, flexible discriminant analysis over MARS regression, mixture discriminant analysis, k-Means clustering, Gaussian mixture model and random forests.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/mixture.png)\n### [examples/Prostate Cancer.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Prostate%20Cancer.ipynb)\n\nPredicting prostate specific antigen using ordinary least squares, ridge/lasso regularized linear regression, principal components regression, partial least squares and best subset regression. 
Model parameters are selected by K-fold cross-validation.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/cancer.png)\n### [examples/South African Heart Disease.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/South%20African%20Heart%20Disease.ipynb)\nUnderstanding the risk factors using logistic regression, L1 regularized logistic regression, natural cubic splines basis expansion for nonlinearities, thin-plate spline for mutual dependency, local logistic regression, kernel density estimation and gaussian mixture models.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/chd.png)\n### [examples/Vowel.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Vowel.ipynb)\nVowel speech recognition using regression of an indicator matrix, linear/quadratic/regularized/reduced-rank discriminant analysis and logistic regression.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/vowel.png)\n### [examples/Bone Mineral Density.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Bone%20Mineral%20Density.ipynb)\nComparing patterns of bone mineral density relative change for men and women using smoothing splines.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/bone.png)\n### [examples/Air Pollution Data.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Air%20Pollution.ipynb)\nAnalysing Los Angeles pollution data using smoothing splines.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/ozone_vs_pressure_gradient.png)\n### 
[examples/Phoneme Recognition.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Phoneme%20Recognition.ipynb)\nPhonemes speech recognition using reduced flexibility logistic regression.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/phoneme.png)\n### [examples/Galaxy.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Galaxy.ipynb)\nAnalysing radial velocity of galaxy NGC7531 using local regression in multidimensional space.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/galaxy.png)\n### [examples/Ozone.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Ozone.ipynb)\nAnalysing the factors influencing ozone concentration using local regression and trellis plot.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/ozone.png)\n### [examples/Spam.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Spam.ipynb)\nDetecting email spam using logistic regression, generalized additive logistic model, decision tree, multivariate adaptive regression splines, boosting and random forest.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/spam.png)\n### [examples/California Housing.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/California%20Housing.ipynb)\nAnalysing the factors influencing California house prices using boosting over decision trees and partial dependence plots.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/california.png)\n\n### 
[examples/Demographics.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Demographics.ipynb)\nPredicting shopping mall customers' occupation, and hence identifying demographic variables that discriminate between different occupational categories using boosting and market basket analysis.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/demographics.png)\n\n### [examples/ZIP Code.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/ZIP%20Code.ipynb)\nRecognizing small hand-drawn digits using LeCun's Net-1 - Net-5 neural networks. \n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/zip1.png)\n\nAnalysing the variation of the number three in ZIP codes using principal component and archetypal analysis.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/zip2.png)\n\n### [examples/Human Tumor Microarray Data.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Human%20Tumor%20Microarray%20Data.ipynb)\nAnalysing microarray data using K-means clustering and hierarchical clustering. 
\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/tumor.png)\n\n### [examples/Country Dissimilarities.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Country%20Dissimilarities.ipynb)\nAnalysing country dissimilarities using K-medoids clustering and multidimensional scaling.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/country.png)\n\n### [examples/Signature.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Signature.ipynb)\nAnalysing signature shapes using Procrustes transformation.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/signature.png)\n\n### [examples/Waveform.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Waveform.ipynb)\nRecognizing wave classes using linear, quadratic, flexible (over MARS regression), mixture discriminant analysis and decision trees.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/waveform.png)\n\n### [examples/Protein Flow-Cytometry.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Protein%20Flow%20Cytometry.ipynb)\nAnalysing protein flow-cytometry data using graphical-lasso undirected graphical model for continuous variables. 
\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/cytometry.png)\n\n### [examples/SRBCT Microarray.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/SRBCT%20Microarray.ipynb)\nAnalysing microarray data of 2308 genes and selecting the most significant genes for cancer classification using nearest shrunken centroids. \n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/srbct.png)\n\n### [examples/14 Cancer Microarray.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/14%20Cancer.ipynb)\nAnalysing microarray data of 16,063 genes gathered by Ramaswamy et al. (2001) and selecting the most significant genes for cancer classification using nearest shrunken centroids, L2-penalized discriminant analysis, support vector classifier, k-nearest neighbors, L2-penalized multinomial, L1-penalized multinomial and elastic-net penalized multinomial. 
It is a difficult classification problem with p\u003e\u003eN (only 144 training observations).\n\n### [examples/Skin of the Orange.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Skin%20of%20the%20Orange.ipynb)\nSolving a synthetic classification problem using Support Vector Machines and multivariate adaptive regression splines to show the influence of additional noise features.\n\n### [examples/Radiation Sensitivity.ipynb](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/examples/Radiation%20Sensitivity.ipynb)\nAssessing the significance of 12,625 genes from microarray study of radiation sensitivity using Benjamini-Hochberg method and the significance analysis of microarrays (SAM) approach.\n\n![alt](https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks/blob/master/images/radiation.png)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fempathy87%2Fthe-elements-of-statistical-learning-python-notebooks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fempathy87%2Fthe-elements-of-statistical-learning-python-notebooks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fempathy87%2Fthe-elements-of-statistical-learning-python-notebooks/lists"}