https://github.com/tirthajyoti/r-stats-machine-learning

Misc Statistics and Machine Learning codes in R

classification clustering decision-trees hypothesis-testing k-means machine-learning nearest-neighbors neural-network principal-component-analysis r random-forest regression statistics support-vector-machines


## Please feel free to [add me here on LinkedIn](https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7/) if you are interested in data science and would like to connect.

# Statistics and Machine Learning R scripts
Miscellaneous machine learning and statistical analysis code examples in R.

## Packages used/demonstrated
* [caret](http://caret.r-forge.r-project.org)
* [rattle](https://cran.r-project.org/web/packages/rattle/vignettes/rattle.pdf)
* [randomForest](https://cran.r-project.org/web/packages/randomForest/randomForest.pdf)
* [rpart](https://cran.r-project.org/web/packages/rpart/rpart.pdf)
* [dplyr](https://cran.r-project.org/web/packages/dplyr/dplyr.pdf)
* [ggplot2](https://cran.r-project.org/web/packages/ggplot2/ggplot2.pdf)
* [corrplot](https://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html)
* [factoextra](http://www.sthda.com/english/wiki/factoextra-r-package-easy-multivariate-data-analyses-and-elegant-visualization)
* [glmnet](https://cran.r-project.org/web/packages/glmnet/glmnet.pdf)
* [MASS](https://cran.r-project.org/web/packages/MASS/MASS.pdf)
* [mgcv](https://cran.r-project.org/web/packages/mgcv/index.html)
* ... and some more...

## Supervised learning (Regression and Classification)
* Linear regression
* Poisson regression
* Stepwise selection method
* LASSO, Ridge, and Elastic Net regularization methods
* Residual analysis
* Spline regression
* Logistic regression
* Support vector machine
* k-Nearest Neighbor
* Decision Tree
* Random Forest
* Feedforward neural network
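
As a rough illustration of the regression and classification topics listed above, the sketch below fits a linear model, inspects its residuals, and fits a logistic model. It uses only base R and the built-in `mtcars` data set. Both are assumptions made for this example, so the repository's own scripts may use different data and packages such as caret or glmnet.

```r
# Sketch only: built-in mtcars data, not the repository's data sets.
data(mtcars)

# Linear regression: fuel efficiency as a function of weight and horsepower.
lin_fit <- lm(mpg ~ wt + hp, data = mtcars)
print(coef(lin_fit))

# Residual analysis: for a least-squares fit with an intercept,
# the residuals sum to (numerically) zero.
res <- residuals(lin_fit)
print(summary(res))

# Logistic regression: probability of a manual transmission (am = 1)
# as a function of weight.
log_fit <- glm(am ~ wt, data = mtcars, family = binomial)
pred_prob <- predict(log_fit, type = "response")

# Classify with an illustrative 0.5 cutoff and compute in-sample accuracy.
pred_class <- as.integer(pred_prob > 0.5)
accuracy <- mean(pred_class == mtcars$am)
print(accuracy)
```

The accuracy here is measured on the training data, so it is optimistic. A held-out split or cross-validation would give a fairer estimate.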

## Unsupervised learning
* k-means Clustering
* Principal Component Analysis (PCA)
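
The two unsupervised methods above can be sketched in a few lines of base R. The example assumes the built-in `iris` data set and a choice of three clusters. Both are illustrative choices for this sketch and are not taken from the repository.

```r
# Sketch only: built-in iris data, numeric columns standardized first.
data(iris)
X <- scale(iris[, 1:4])

# k-means clustering with several random restarts to avoid a poor local optimum.
set.seed(42)
km <- kmeans(X, centers = 3, nstart = 25)

# Compare cluster assignments with the known species labels.
print(table(cluster = km$cluster, species = iris$Species))

# PCA on the standardized data; sdev^2 gives the variance of each component.
pca <- prcomp(X)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```

Standardizing first matters for both methods, because k-means and PCA are sensitive to the scale of each variable.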

## Statistics/Data wrangling
* Missing data imputation
* Demo of Central Limit Theorem
* Outlier detection using Grubbs' test
* **Cu**mulative **Sum** (CUSUM) for change detection
* Demo of Hypothesis shopping (*why you should be suspicious of p-values*)
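
Three of the demos above lend themselves to a short simulation. The sketch below is written in base R with simulated data. The sample sizes, the 0.05 threshold, and the CUSUM tuning values `k` and `h` are all assumptions chosen for illustration, not values taken from the repository.

```r
set.seed(1)

# Central Limit Theorem: means of samples from a skewed (exponential)
# distribution are approximately normal, with sd close to sigma / sqrt(n).
n <- 50
sample_means <- replicate(5000, mean(rexp(n, rate = 1)))
clt_sd <- sd(sample_means)
print(c(observed = clt_sd, theory = 1 / sqrt(n)))

# Hypothesis shopping: test 200 pure-noise variables against a pure-noise
# outcome. At a 0.05 threshold, roughly 5% look "significant" by chance.
y <- rnorm(100)
p_values <- replicate(200, cor.test(rnorm(100), y)$p.value)
false_hits <- sum(p_values < 0.05)
print(false_hits)

# One-sided CUSUM: S_t = max(0, S_{t-1} + x_t - target - k); alarm when S_t > h.
x <- c(rnorm(100, mean = 0), rnorm(50, mean = 2))  # mean shifts after t = 100
target <- 0
k <- 0.5   # allowance (slack), illustrative value
h <- 5     # decision threshold, illustrative value
s <- numeric(length(x))
for (t in seq_along(x)) {
  prev <- if (t == 1) 0 else s[t - 1]
  s[t] <- max(0, prev + x[t] - target - k)
}
first_alarm <- which(s > h)[1]
print(first_alarm)
```

The hypothesis-shopping part shows why a single small p-value is weak evidence when many tests were run: every variable here is noise, yet some tests are still expected to pass the threshold.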