https://github.com/sayakpaul/pima_indian_diabetes

Host: GitHub
URL: https://github.com/sayakpaul/pima_indian_diabetes
Owner: sayakpaul
Created: 2017-02-27T10:41:32.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2018-01-13T14:16:38.000Z (about 7 years ago)
Last Synced: 2024-12-28T03:39:38.232Z (about 1 month ago)
Topics: data-analysis, data-science, data-visualization, machine-learning, numpy, pandas, python, scikit-learn
Language: Python
Size: 10.7 KB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Pima_Indian_Diabetes

The dataset details are available here:
https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.names

What I did is pretty straightforward. I took some well-known classifiers, such as Logistic Regression, Support Vector Machines,
and kNN, and compared and analyzed their performance on this dataset. Before that, I standardized the dataset using scikit-learn's
`StandardScaler()`. For Logistic Regression and SVM, I also applied SGD (Stochastic Gradient Descent) for
optimized training. Finally, I applied PCA for dimensionality reduction.

Among all of them, Logistic Regression coupled with PCA seemed to perform best.
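
One way to check a claim like this is to score Logistic Regression with and without a PCA step under the same cross-validation. The sketch below is an assumption-laden illustration rather than the repository's code: the data is a synthetic stand-in, and `n_components=5` is an arbitrary choice that would need tuning on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in (assumption) with the Pima data's shape: 768 x 8.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           n_redundant=1, random_state=0)

# Baseline: standardize, then classify.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Variant: standardize, project onto 5 principal components, then classify.
with_pca = make_pipeline(StandardScaler(), PCA(n_components=5),
                         LogisticRegression(max_iter=1000))

baseline_acc = cross_val_score(baseline, X, y, cv=5, scoring="accuracy").mean()
pca_acc = cross_val_score(with_pca, X, y, cv=5, scoring="accuracy").mean()

# Fit once on all data to see how much variance the kept components explain.
with_pca.fit(X, y)
explained = with_pca.named_steps["pca"].explained_variance_ratio_.sum()

print(f"Logistic Regression:       {baseline_acc:.3f}")
print(f"Logistic Regression + PCA: {pca_acc:.3f}")
print(f"Variance explained by 5 components: {explained:.3f}")
```

PCA is applied after standardization here because principal components are sensitive to feature scale.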