https://github.com/sayakpaul/pima_indian_diabetes


data-analysis data-science data-visualization machine-learning numpy pandas python scikit-learn


# Pima_Indian_Diabetes

The dataset details are available here:
https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.names

The approach is straightforward. I took several well-known classifiers, such as Logistic Regression, Support Vector Machines,
and kNN, and compared their performance on this dataset. Before training, I standardized the features using scikit-learn's
StandardScaler(). I also trained Logistic Regression and SVM using SGD (Stochastic Gradient Descent) as the optimizer,
and applied PCA for dimensionality reduction.

Of all the combinations I tried, Logistic Regression coupled with PCA seemed to perform best.