https://github.com/ralstonraphael/water_access_ml_project
This project analyzes life expectancy data from the World Health Organization (WHO), sourced from Kaggle. The dataset spans 183 countries across 6 regions, covering metrics such as life expectancy, mortality rates, healthcare access, and socioeconomic factors.
- Host: GitHub
- URL: https://github.com/ralstonraphael/water_access_ml_project
- Owner: ralstonraphael
- Created: 2025-04-07T23:48:27.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-04-08T00:01:11.000Z (10 months ago)
- Last Synced: 2025-04-10T02:54:35.199Z (10 months ago)
- Topics: data-science, machine-learning, numpy, pandas, python
- Language: Jupyter Notebook
- Homepage:
- Size: 1.98 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Life Expectancy Analysis with WHO Data
## Project Overview
This project explores and analyzes life expectancy data provided by the World Health Organization (sourced from Kaggle). Using Python's data science ecosystem, we clean and transform the data and visualize global health trends. We then apply machine learning techniques to predict life expectancy from socioeconomic and health factors.
---
## Tech Stack
- Python
- pandas, NumPy
- Matplotlib, Seaborn
- Scikit-learn
- TensorFlow/Keras
---
## Key Learnings
- Data preprocessing: handling missing values, normalization, and encoding.
- Feature engineering to enhance model inputs.
- Exploratory Data Analysis (EDA) with Matplotlib and Seaborn.
- Training and evaluating machine learning models using TensorFlow/Keras.
- Interpretation of results to derive health policy insights.
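
The preprocessing steps listed above (missing values, normalization, encoding) can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy values, the column names `Status`, `GDP`, and `Schooling`, and the `preprocess` helper are all assumptions, and median imputation with min-max scaling is just one reasonable choice among several.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the dataset; values and column names are illustrative only.
df = pd.DataFrame({
    "Status": ["Developing", "Developed", "Developing", "Developed"],
    "GDP": [500.0, np.nan, 1500.0, 40000.0],
    "Schooling": [8.0, 16.0, np.nan, 18.0],
})

def preprocess(frame):
    """Hypothetical helper: impute, scale, and one-hot encode a copy of frame."""
    out = frame.copy()
    numeric_cols = out.select_dtypes(include="number").columns

    # 1. Missing values: fill each numeric column with its own median.
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # 2. Normalization: min-max scale each numeric column to [0, 1].
    col_min = out[numeric_cols].min()
    col_range = (out[numeric_cols].max() - col_min).replace(0, 1)  # avoid 0/0
    out[numeric_cols] = (out[numeric_cols] - col_min) / col_range

    # 3. Encoding: one-hot encode the categorical "Status" column.
    return pd.get_dummies(out, columns=["Status"])

clean = preprocess(df)
print(clean)
```

In a real pipeline the imputation and scaling statistics would normally be computed on the training split only and then reused on the test split, to avoid leaking information.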
---
## Key Insights
- **GDP and schooling** are strongly positively correlated with life expectancy.
- **HIV/AIDS prevalence** shows a strong negative correlation with life expectancy.
- Countries with higher **healthcare expenditure** generally enjoy higher life expectancy.
- Model performance indicates socioeconomic indicators can be good predictors of life expectancy, though regional anomalies exist.
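
Correlations like the ones above can be checked with a Pearson correlation against the target column. The sketch below uses made-up numbers (not taken from the dataset) and assumed column names, so it only demonstrates the mechanics, not the project's actual results.

```python
import pandas as pd

# Made-up toy values for illustration; not taken from the WHO dataset.
toy = pd.DataFrame({
    "Life expectancy": [52.0, 60.0, 68.0, 74.0, 81.0],
    "Schooling": [6.0, 9.0, 11.0, 13.0, 16.0],
    "HIV/AIDS": [9.0, 5.0, 2.0, 0.5, 0.1],
})

# Pearson correlation of every feature with the target column.
corr = toy.corr(method="pearson")["Life expectancy"].drop("Life expectancy")

# Sort so the most negative correlations come first.
ranked = corr.sort_values()
print(ranked)
```

Pearson correlation only captures linear association, so a feature with a strongly non-linear relationship to life expectancy may look weaker here than it really is.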
---
## How to Run
```bash
# Install required libraries
pip install pandas numpy matplotlib seaborn scikit-learn tensorflow