https://github.com/ekramasif/predict_the_diabetes

Training an Artificial Neural Network with this dataset and Predict the Diabetes.
ann artificial-neural-network dataset deep-learning diabetes-prediction machine-learning neural-network pima-indian-diabetes-dataset tensorflow

Last synced: 10 months ago

Host: GitHub
URL: https://github.com/ekramasif/predict_the_diabetes
Owner: ekramasif
License: apache-2.0
Created: 2021-09-25T12:13:00.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2022-03-16T17:39:21.000Z (over 4 years ago)
Last Synced: 2025-06-02T03:47:15.322Z (about 1 year ago)
Topics: ann, artificial-neural-network, dataset, deep-learning, diabetes-prediction, machine-learning, neural-network, pima-indian-diabetes-dataset, tensorflow
Language: Jupyter Notebook
Homepage: https://nbviewer.org/github/ekramasif/Predict_the_Diabetes/blob/main/main.ipynb
Size: 3.18 MB
Stars: 3
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Training an Artificial Neural Network with this Dataset to Predict Diabetes [![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/ekramasif/Predict_the_Diabetes/blob/main/LICENSE)

## About the Dataset:

This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

## Training/Validation Split:

#### I randomly select 80% of the data for training and 20% for validation/testing. The 80-20 split is drawn uniformly from each TARGET class: 40% of the data for training and 10% for validation come from TARGET = 0, and the other 40% and 10% come from TARGET = 1, with rows chosen randomly within each class.
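
The per-class random split described above can be sketched in plain Python. This is only an illustration, not the code from `main.ipynb`: the function name `stratified_split`, the `label_of` callback, the seed, and the toy rows are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_of, train_fraction=0.8, seed=0):
    """Split rows into (train, valid), sampling randomly within each class.

    Hypothetical helper: `label_of` is a function that returns the target
    value (0 or 1) for a row.
    """
    rng = random.Random(seed)

    # Group the rows by their target value.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    train, valid = [], []
    for label in sorted(by_class):
        members = list(by_class[label])
        rng.shuffle(members)  # random order within this class only
        cut = round(train_fraction * len(members))
        train.extend(members[:cut])
        valid.extend(members[cut:])
    return train, valid


# Toy data (not taken from data.tsv): 100 rows, half with target 0, half with 1.
toy_rows = [(i, i % 2) for i in range(100)]
train_rows, valid_rows = stratified_split(toy_rows, label_of=lambda r: r[1])
print(len(train_rows), len(valid_rows))
```

With balanced toy classes, each class contributes 40% of all rows to training and 10% to validation. With unbalanced classes the same function still takes 80% of each class, so the class ratio is preserved in both parts.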

## Dataset: The dataset contains the following columns (for clarity, see 'data.tsv'):

    1) Preg = Number of times pregnant.

    2) GLU = Plasma glucose concentration at 2 hours in an oral glucose tolerance test

    3) BP = Diastolic blood pressure (mm Hg)

    4) ST = Triceps skin fold thickness (mm)

    5) INS = 2-Hour serum insulin (mu U/ml)

    6) BMI = Body mass index (weight in kg/(height in m)^2)

    7) DPF = Diabetes pedigree function

    8) Age = Age in years

    9) Outcome  = 1 - YES (meaning the patient might have diabetes); 0 - NO (the patient doesn't have diabetes)

    

## Parameters I use

       Activation functions: Sigmoid

       Optimizer: Adam

       No. of Layers: 6 hidden layers & 1 output layer

       Units: 800, 500, 250, 150, 75, 40, 1

       Loss: binary_crossentropy

       Metrics: accuracy
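
The parameters listed above could be expressed in Keras roughly as follows. This is a minimal sketch rather than the notebook's actual code: it assumes the 8 feature columns are the model input, that sigmoid is used on every layer including the output, and that Adam runs with its default settings. The name `build_model` is hypothetical.

```python
import tensorflow as tf


def build_model(n_features: int = 8) -> tf.keras.Model:
    """Build a fully connected network with 6 hidden layers and 1 output unit."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_features,)))

    # Hidden layers with the unit counts from the list above.
    for units in (800, 500, 250, 150, 75, 40):
        model.add(tf.keras.layers.Dense(units, activation="sigmoid"))

    # A single sigmoid output gives a value in (0, 1) for the Outcome column.
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))

    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()
model.summary()
```

Under these assumptions the network has 585,006 trainable parameters, most of them in the 800-to-500 layer.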
