https://github.com/michael95-m/packaging-insurance-claim-model
Packaging regression model from scikit-learn
feature-engineering machine-learning python python-package scikit-learn
- Host: GitHub
- URL: https://github.com/michael95-m/packaging-insurance-claim-model
- Owner: Michael95-m
- Created: 2023-01-17T09:13:31.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-23T05:20:35.000Z (over 3 years ago)
- Last Synced: 2025-05-30T13:21:24.375Z (about 1 year ago)
- Topics: feature-engineering, machine-learning, python, python-package, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 1.37 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Packaging insurance claim model
## Steps I took in this repository
- **notebook**
You can find the EDA, feature engineering and modeling notebooks inside the **notebook** folder.
- **source code**
You can find the main source code, which was derived from the notebooks above, in the **insurance_claim_model** folder.
I didn't spend much time on the EDA and feature engineering parts, because I wanted to focus on writing **well-structured code** with good practices such as testing with **tox** and **pytest** and using lint tools.
- **Packaging**
You can see **MANIFEST.in**, **pyproject.toml** and **setup.py**, which are essential for Python packaging.
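To illustrate the kind of test that pytest can collect, here is a minimal sketch. The helper `encode_smoker` and the `smoker` column are hypothetical and only serve as an example; they are not taken from the actual source code in this repository.

```python
# Hypothetical preprocessing helper, for illustration only.
def encode_smoker(value: str) -> int:
    """Map a yes/no string to 1/0, ignoring case and surrounding spaces."""
    normalized = value.strip().lower()
    if normalized not in {"yes", "no"}:
        raise ValueError(f"unexpected smoker value: {value!r}")
    return 1 if normalized == "yes" else 0


# pytest collects functions whose names start with "test_"
# and reports a failure when an assert does not hold.
def test_encode_smoker_known_values():
    assert encode_smoker("yes") == 1
    assert encode_smoker("no") == 0


def test_encode_smoker_ignores_case_and_spaces():
    assert encode_smoker(" Yes ") == 1
    assert encode_smoker("NO") == 0
```

If this were saved in a file such as `tests/test_preprocessing.py` (an assumed path), running `pytest` would discover and run both test functions, and tox could run the same command inside an isolated environment.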
The workflow for this repo was as follows: first, I made the EDA and modeling notebooks. From these notebooks, I created the Python files (.py). Then I built a Python package that can be installed with pip. You can find the package [here](https://pypi.org/project/insurance-claim-model/).
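As a small sketch of how to check that the package was installed, the standard library can look up an installed distribution by name. The distribution name `insurance-claim-model` is taken from the PyPI link above; the helper `installed_version` is a hypothetical name used only for this example, and nothing here assumes anything about the package's own API.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string after `pip install insurance-claim-model`,
# or None if the package is not installed in this environment.
print(installed_version("insurance-claim-model"))
```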
## About Dataset
The dataset used in this repository is from [here](https://www.kaggle.com/datasets/thedevastator/insurance-claim-analysis-demographic-and-health). It covers insurance claim analysis based on demographic and health factors. It was released in 2023.
I built a random forest regression model that predicts the insurance claim from that dataset.
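The following is a minimal sketch of random forest regression with scikit-learn, not the exact model in this repository. It uses synthetic data instead of the Kaggle dataset, and the feature names (`age`, `bmi`, `smoker`), the target formula and the hyperparameters are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumed features, not the real dataset).
rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 5, size=n)
smoker = rng.integers(0, 2, size=n)
X = np.column_stack([age, bmi, smoker])

# Made-up claim amounts: a linear signal plus noise.
y = 200 * age + 300 * bmi + 15000 * smoker + rng.normal(0, 1000, size=n)

# Hold out 20% of the rows to evaluate on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the forest and measure the mean absolute error on the held-out rows.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f"Mean absolute error: {mae:.0f}")
```

Fixing `random_state` keeps the split and the forest reproducible, which makes a model like this easier to cover with automated tests.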