https://github.com/patrickdocs/insurance-data-pipeline-etl-visualization



# Insurance Data Pipeline - ETL & Visualization

## 📌 Project Overview
This project builds an ETL (Extract, Transform, Load) pipeline that processes insurance data and then visualizes it to surface insights. The dataset covers factors that affect insurance premiums, claims, and policy adjustments.

## 📂 Project Structure
```
Insurance-Data-Pipeline-ETL-Visualization/
│-- insurance.csv # Raw dataset used for processing
│-- Insurance_ETL.ipynb # Jupyter Notebook containing ETL and visualization
│-- README.md # Project documentation
```

## 🔧 Features
- **ETL Pipeline:** Extracts, transforms, and loads insurance data for analysis.
- **Data Cleaning & Validation:** Handles missing values and ensures data integrity.
- **Data Visualization:** Generates insights using Seaborn and Matplotlib.
- **Key Insights:** Analyzes claim severity, regional claims distribution, and premium adjustments.
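
As a rough illustration of the extract, clean, and load steps listed above, here is a minimal sketch. It is not the notebook's actual code. The column names (`policy_id`, `region`, `claim_amount`, `premium`), the inline sample rows, the rule that a missing claim amount means no claim, and the SQLite target are all assumptions made for the example; the real `insurance.csv` may look different.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample standing in for insurance.csv; real columns may differ.
RAW_CSV = """policy_id,region,claim_amount,premium
1,northeast,1200.0,300.0
2,southwest,,280.0
3,northeast,560.5,310.0
3,northeast,560.5,310.0
4,,90.0,250.0
"""


def extract(source) -> pd.DataFrame:
    """Read raw CSV data from a path or a file-like object."""
    return pd.read_csv(source)


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates, handle missing values, and validate amounts."""
    df = df.drop_duplicates().copy()
    df = df.dropna(subset=["region"])  # rows without a region are dropped
    # Assumption for this sketch: a missing claim amount means no claim was filed.
    df["claim_amount"] = df["claim_amount"].fillna(0.0)
    if (df["claim_amount"] < 0).any() or (df["premium"] <= 0).any():
        raise ValueError("negative claim or non-positive premium found")
    return df.reset_index(drop=True)


def load(df: pd.DataFrame, conn: sqlite3.Connection, table: str = "insurance_clean") -> int:
    """Write the cleaned frame to a SQLite table and return the stored row count."""
    df.to_sql(table, conn, if_exists="replace", index=False)
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")  # in-memory target, assumed for illustration
clean = transform(extract(io.StringIO(RAW_CSV)))
rows_loaded = load(clean, conn)
print(f"{rows_loaded} rows loaded")
```

To try it against the real file, pass `"insurance.csv"` to `extract` and adjust the column names in `transform` to match the dataset.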

## 📊 Visualizations
1. **Claims Severity Distribution** - Bar chart showing how claims are distributed across severity levels.
2. **Claims Frequency by Region** - Total number of claims in each region.
3. **Premium Adjustments** - How premium amounts change in response to multiple factors.
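
The first two charts could be produced along the following lines. This is a sketch under stated assumptions, not the notebook's code: the sample rows, the `region` and `claim_amount` column names, and the severity thresholds (500 and 5000) are invented for illustration. It uses plain Matplotlib to keep the example small, although the project also lists Seaborn.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration; the real dataset's columns may differ.
claims = pd.DataFrame({
    "region": ["northeast", "southwest", "northeast", "southeast", "southwest", "northeast"],
    "claim_amount": [1200.0, 150.0, 8200.0, 430.0, 2600.0, 75.0],
})

# Hypothetical severity thresholds: (0, 500] low, (500, 5000] medium, above 5000 high.
labels = ["low", "medium", "high"]
claims["severity"] = pd.cut(
    claims["claim_amount"], bins=[0, 500, 5000, float("inf")], labels=labels
)

# Count claims per severity level (in low/medium/high order) and per region.
severity_counts = claims["severity"].value_counts().reindex(labels)
region_counts = claims["region"].value_counts().sort_index()

fig, (ax_sev, ax_reg) = plt.subplots(1, 2, figsize=(10, 4))

ax_sev.bar(labels, severity_counts.to_numpy())
ax_sev.set_title("Claims Severity Distribution")
ax_sev.set_xlabel("Severity")
ax_sev.set_ylabel("Number of claims")

ax_reg.bar(region_counts.index.tolist(), region_counts.to_numpy())
ax_reg.set_title("Claims Frequency by Region")
ax_reg.set_xlabel("Region")
ax_reg.set_ylabel("Number of claims")

fig.tight_layout()
fig.savefig("claims_overview.png")
```

Inside Jupyter, remove the `matplotlib.use("Agg")` line so the figure renders inline instead of only being written to `claims_overview.png`.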

## 🚀 How to Run
1. Clone the repository:
```bash
git clone https://github.com/PatrickDoCS/Insurance-Data-Pipeline-ETL-Visualization.git
```
2. Install dependencies:
```bash
pip install pandas numpy matplotlib seaborn
```
3. Open and run `Insurance_ETL.ipynb` in Jupyter Notebook. If Jupyter is not installed yet, `pip install notebook` adds it.

## 📌 Dependencies
- Python 3.x
- Pandas
- NumPy
- Matplotlib
- Seaborn
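
Before opening the notebook, a short check like the one below can confirm that the listed packages are installed. It is a convenience sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the `check_dependencies` helper is a hypothetical name and not part of the repository.

```python
import sys
from importlib import metadata

# Distribution names as they are passed to pip in the install step.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn"]


def check_dependencies(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = check_dependencies(REQUIRED)
print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be added with the `pip install` command from the How to Run section.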

## 📬 Contact
For any questions, feel free to reach out:
- **GitHub:** [PatrickDoCS](https://github.com/PatrickDoCS)