https://github.com/easonlai/eda_for_prudential_life_insurance_sample_data

Notebook sample of Exploratory Data Analysis (EDA) for Prudential Life Insurance Sample Data
azure-databricks azuredatabricks data-analysis data-analysis-python data-analytics databricks databricks-notebooks eda exploratory-data-analysis insurance insurance-sample-data jupyter-notebook python python3

Host: GitHub
URL: https://github.com/easonlai/eda_for_prudential_life_insurance_sample_data
Owner: easonlai
Created: 2021-07-30T05:55:26.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2021-07-31T15:44:12.000Z (over 4 years ago)
Last Synced: 2025-01-08T12:14:03.295Z (10 months ago)
Topics: azure-databricks, azuredatabricks, data-analysis, data-analysis-python, data-analytics, databricks, databricks-notebooks, eda, exploratory-data-analysis, insurance, insurance-sample-data, jupyter-notebook, python, python3
Language: Jupyter Notebook
Homepage:
Size: 4.18 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Exploratory Data Analysis (EDA) for Prudential Life Insurance Sample Data

I found a great insurance sample dataset on [Kaggle](https://www.kaggle.com/), from the competition ["Prudential Life Insurance Assessment - Can you make buying life insurance easier?"](https://www.kaggle.com/c/prudential-life-insurance-assessment). This sample data is well suited to practicing data analysis. As usual, I use [Jupyter Notebook](https://jupyter.org/) and an [Azure Databricks](https://docs.microsoft.com/en-us/azure/databricks/scenarios/what-is-azure-databricks) notebook to perform the analysis.

Data Fields Description
* Id, A unique identifier associated with an application.
* Product_Info_1-7, A set of normalized variables relating to the product applied for.
* Ins_Age, Normalized age of applicant.
* Ht, Normalized height of applicant.
* Wt, Normalized weight of applicant.
* BMI, Normalized BMI of applicant.
* Employment_Info_1-6, A set of normalized variables relating to the employment history of the applicant.
* InsuredInfo_1-6, A set of normalized variables providing information about the applicant.
* Insurance_History_1-9, A set of normalized variables relating to the insurance history of the applicant.
* Family_Hist_1-5, A set of normalized variables relating to the family history of the applicant.
* Medical_History_1-41, A set of normalized variables relating to the medical history of the applicant.
* Medical_Keyword_1-48, A set of dummy variables relating to the presence of/absence of a medical keyword being associated with the application.
* Response, This is the target variable, an ordinal variable relating to the final decision associated with an application.
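
The field families listed above can be recovered directly from the column names, which is often a useful first step in EDA. The following is a minimal sketch using only the Python standard library, not the code from the notebooks: the helper names `group_columns_by_family` and `response_distribution` are hypothetical, and `SAMPLE_CSV` is a tiny made-up stand-in for the real file so that the example runs on its own.

```python
import csv
import io
import re
from collections import Counter

# Family prefixes taken from the field list above.
FAMILY_PREFIXES = (
    "Product_Info", "Employment_Info", "InsuredInfo", "Insurance_History",
    "Family_Hist", "Medical_History", "Medical_Keyword",
)


def group_columns_by_family(columns):
    """Group columns named '<prefix>_<number>' under their family prefix."""
    groups = {prefix: [] for prefix in FAMILY_PREFIXES}
    groups["Other"] = []  # Id, Ins_Age, Ht, Wt, BMI, Response, ...
    for col in columns:
        match = re.fullmatch(r"(.+)_(\d+)", col)
        if match and match.group(1) in groups:
            groups[match.group(1)].append(col)
        else:
            groups["Other"].append(col)
    return groups


def response_distribution(rows):
    """Count how often each value of the Response target occurs."""
    return Counter(int(row["Response"]) for row in rows)


# Hypothetical three-row sample; the values are invented for illustration.
SAMPLE_CSV = """Id,Product_Info_1,Ins_Age,Medical_History_1,Medical_Keyword_1,Response
1,1,0.10,4,0,8
2,1,0.50,,1,2
3,2,0.90,7,0,8
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
families = group_columns_by_family(rows[0].keys())
counts = response_distribution(rows)

for family, cols in families.items():
    print(f"{family}: {len(cols)} column(s)")
print("Response counts:", dict(counts))
```

To try this on the real data, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with an open handle on the CSV file in the repository. With pandas, the same grouping function could be applied to `df.columns`.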

File Content Description
* data/prudential_life_insurance_sample_data.csv <-- Sample data from Kaggle
* eda_for_prudential_life_insurance_sample_data.ipynb <-- Notebook sample of EDA
* eda_for_prudential_life_insurance_sample_data_databricks.ipynb <-- Notebook sample for Databricks
* eda_for_prudential_life_insurance_sample_data_databricks.html <-- Notebook HTML export from Databricks
