https://github.com/quocduyenanhnguyen/data-mining-case-study-project

In this project, for supervised learning, I used regression and decision tree techniques to build predictive models, and I tested model accuracy by evaluating MSE and misclassification cost. For unsupervised learning, I performed cluster analysis on the Iris dataset to identify subgroups, and I used association rules to analyze transaction details in the Groceries dataset.

boston-housing-dataset credit-card-dataset groceries-dataset iris-dataset predictive-modeling r rstudio supervised-learning unsupervised-learning

[Note: you can preview the R and PDF files by clicking on them.]

# Software I used:
[RStudio](https://www.rstudio.com/products/rstudio/download/)

# Description:
Supervised learning: I was in charge of writing the code to build a linear regression model, a logistic regression model, and decision tree models (both regression and classification trees) on the Boston Housing and Credit Card datasets. I tested model accuracy by evaluating MSE values (for the numerical response variable) and misclassification costs (for the binary response variable). I then concluded the analysis with several written reports, with help from my partner, to whom I delegated tasks. The output of my code, along with my description of each output, is included in all reports.
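
As a rough illustration of the two accuracy measures mentioned above, here is a minimal sketch in plain Python. It is not the R code from this project. The function names, the 0.5 cutoff, and the asymmetric weights (a missed positive costing 5 times a false alarm) are assumptions chosen for the example, and the inputs are made-up toy values rather than results from the Boston Housing or Credit Card data.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def misclassification_cost(actual, predicted_prob, cutoff=0.5,
                           fn_weight=5.0, fp_weight=1.0):
    """Average weighted cost of wrong predictions for a binary (0/1) response.

    cutoff, fn_weight and fp_weight are hypothetical defaults for this sketch.
    """
    if len(actual) != len(predicted_prob) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(actual, predicted_prob):
        label = 1 if p >= cutoff else 0
        if y == 1 and label == 0:      # false negative
            total += fn_weight
        elif y == 0 and label == 1:    # false positive
            total += fp_weight
    return total / len(actual)


# Toy values only, for demonstration.
print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
print(misclassification_cost([1, 0, 1, 0], [0.9, 0.6, 0.2, 0.1]))
```

With symmetric weights (`fn_weight=1.0`, `fp_weight=1.0`) the second function reduces to the plain misclassification rate.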

Unsupervised learning: I was in charge of writing the code to perform cluster analysis (K-means and hierarchical) on the Iris dataset, both to better understand which observations group into clusters with similar characteristics and to find unknown subgroups. I also wrote code to explore the Groceries dataset and analyze its transaction details using association rules, and I concluded my analysis with a written report, with the help of my partner. The output of my code and my description of that output are also included in the report.
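
To make the association-rule part more concrete, the sketch below computes the three standard measures for a rule such as {milk} -> {bread}: support, confidence, and lift. It is plain Python rather than the R code used in this project. The function names and the four toy baskets are assumptions made up for the example, not entries from the Groceries dataset.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    if not transactions:
        raise ValueError("transactions must be non-empty")
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def rule_metrics(transactions, lhs, rhs):
    """Support, confidence and lift for the rule lhs -> rhs."""
    s_lhs = support(transactions, lhs)
    s_rhs = support(transactions, rhs)
    if s_lhs == 0 or s_rhs == 0:
        raise ValueError("lhs and rhs must each occur in at least one transaction")
    s_both = support(transactions, set(lhs) | set(rhs))
    confidence = s_both / s_lhs    # estimate of P(rhs | lhs)
    lift = confidence / s_rhs      # above 1 suggests a positive association
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical baskets, for demonstration only.
baskets = [
    {"milk", "bread"},
    {"milk", "butter"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]
print(rule_metrics(baskets, {"milk"}, {"bread"}))
```

In these toy baskets the lift of {milk} -> {bread} is below 1, so buying milk does not make bread more likely than its baseline frequency.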