Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ekramasif/predict_the_diabetes
Training an Artificial Neural Network with this dataset and Predict the Diabetes.
ann artificial-neural-network dataset deep-learning diabetes-prediction machine-learning neural-network pima-indian-diabetes-dataset tensorflow
- Host: GitHub
- URL: https://github.com/ekramasif/predict_the_diabetes
- Owner: ekramasif
- License: apache-2.0
- Created: 2021-09-25T12:13:00.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-16T17:39:21.000Z (over 2 years ago)
- Last Synced: 2023-03-10T03:02:51.287Z (over 1 year ago)
- Topics: ann, artificial-neural-network, dataset, deep-learning, diabetes-prediction, machine-learning, neural-network, pima-indian-diabetes-dataset, tensorflow
- Language: Jupyter Notebook
- Homepage: https://nbviewer.org/github/ekramasif/Predict_the_Diabetes/blob/main/main.ipynb
- Size: 3.18 MB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Training an Artificial Neural Network on the Pima Indians Diabetes Dataset to Predict Diabetes [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/ekramasif/Predict_the_Diabetes/blob/main/LICENSE)
## About Dataset:
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.
## Training/Validation Split:
I randomly select 80% of the data for training and 20% for validation/testing. The 80/20 split is drawn uniformly from each TARGET type: 40% + 10% of the data comes from TARGET = 0 and the other 40% + 10% from TARGET = 1, chosen randomly within each target.
## Dataset:
The dataset contains the following columns (for clarity, see 'data.tsv'):
1) Preg = Number of times pregnant.
2) GLU = Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3) BP = Diastolic blood pressure (mm Hg)
4) ST = Triceps skin fold thickness (mm)
5) INS = 2-Hour serum insulin (mu U/ml)
6) BMI = Body mass index (weight in kg/(height in m)^2)
7) DPF = Diabetes pedigree function
8) Age = Age in years
9) Outcome = 1 - YES (meaning the patient might have diabetes); 0 - NO (the patient doesn't have diabetes)
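The per-target 80/20 split described under Training/Validation Split could be sketched roughly as follows. This is a minimal sketch and not the notebook's actual code: the function name `stratified_split`, its parameters, and the toy rows are hypothetical. It also assumes that "uniformly from each TARGET type" means each class is shuffled and split 80/20 on its own, and that the label is the last field of each row.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_index=-1, train_fraction=0.8, seed=42):
    """Split rows into train/validation, sampling each label group separately."""
    # Group the rows by their label (the Outcome column in this dataset).
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_index]].append(row)

    rng = random.Random(seed)
    train, validation = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)  # random selection within each target
        cut = int(len(group) * train_fraction)
        train.extend(group[:cut])
        validation.extend(group[cut:])

    # Shuffle again so the classes are interleaved in the final sets.
    rng.shuffle(train)
    rng.shuffle(validation)
    return train, validation


# Hypothetical toy data: (feature, outcome) pairs, ten rows per class.
toy_rows = [(i, 0) for i in range(10)] + [(i, 1) for i in range(10, 20)]
train_rows, validation_rows = stratified_split(toy_rows)
```

With real data, each row would hold the eight feature columns followed by Outcome. If scikit-learn is available, `train_test_split(..., stratify=y)` gives a comparable per-class split.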
## Parameters I use:
- Activation functions: Sigmoid
- Optimizer: Adam
- No. of Layers: 6 hidden layers & 1 output layer
- Units: 800, 500, 250, 150, 75, 40, 1
- Loss: binary_crossentropy
- Metrics: accuracy
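As an illustration, the settings above might translate into a Keras model along these lines. This is a sketch under assumptions, not the code from the notebook: `LAYER_UNITS`, `N_FEATURES`, `count_parameters`, and `build_model` are hypothetical names, the input is assumed to be the eight feature columns, and sigmoid is assumed on every layer, including the output. TensorFlow is imported inside `build_model` so that the parameter-count helper runs without it.

```python
# Layer widths as listed above: six hidden layers followed by one output unit.
LAYER_UNITS = [800, 500, 250, 150, 75, 40, 1]
N_FEATURES = 8  # Preg, GLU, BP, ST, INS, BMI, DPF, Age


def count_parameters(n_inputs, layer_units):
    """Count the weights and biases of a fully connected stack."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units
    return total


def build_model():
    """Build and compile the network with Keras (requires TensorFlow)."""
    import tensorflow as tf  # imported here so the helper above works without it

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(N_FEATURES,)))
    for units in LAYER_UNITS:
        # Assumption: sigmoid on every layer, including the output.
        model.add(tf.keras.layers.Dense(units, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


TOTAL_PARAMETERS = count_parameters(N_FEATURES, LAYER_UNITS)
print(TOTAL_PARAMETERS)  # 585006
```

Under these assumptions the network has 585,006 trainable parameters, most of them in the 800-to-500 layer. Calling `build_model().summary()` should report the same total.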