https://github.com/justus-coded/justus_portfolio
This repository serves as my data science portfolio, which contains all my projects
data-preprocessing data-visualization exploratory-data-analysis predictive-analysis
- Host: GitHub
- URL: https://github.com/justus-coded/justus_portfolio
- Owner: Justus-coded
- Created: 2020-08-20T16:26:27.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-22T04:07:14.000Z (over 5 years ago)
- Last Synced: 2025-03-11T08:41:25.903Z (over 1 year ago)
- Topics: data-preprocessing, data-visualization, exploratory-data-analysis, predictive-analysis
- Homepage: https://justus-coded.github.io/Justus_portfolio/
- Size: 9.77 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
> HELLO THERE!
# I am Justus Ilemobayo, a Data Analyst and Data Scientist
## PORTFOLIO
### [Project 1: Employee-Attrition-Problem](https://github.com/Justus-coded/Employee-Attrition-Problem)
**A detailed analysis of employee attrition, with predictive analysis using machine learning algorithms.**
#### Description
The data comes from Company X, which is trying to control attrition. There are two datasets: “Existing employees” and “Employees who have left”.
##### ANALYSIS
1. Exploratory Data Analysis and Data Visualization
2. Data Modelling and Data Preprocessing
3. Predictive Analysis
**Exploratory Data Analysis**
The analysis was done with the Python libraries Matplotlib and Seaborn. Data visualization was also done in Tableau. Check the Dashboard folder for the visualization images produced with Tableau.
From the analysis, we concluded that a low satisfaction level is a key factor in employee attrition. More details are in the PowerPoint slides.
**Data Modelling and Data Preprocessing**
Here we checked for missing values in the dataset. We also created a new column showing whether an employee left or not, and then merged the two DataFrames together. Check the Jupyter notebook (.ipynb) for more details.
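The labelling and merging step described above can be sketched with pandas roughly as follows. This is a minimal illustration and not the notebook's actual code: the two small frames, the column names (`emp_id`, `satisfaction_level`, `left`), and the choice of `pd.concat` to combine the frames are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical stand-ins for the two source datasets.
existing = pd.DataFrame({"emp_id": [1, 2, 3], "satisfaction_level": [0.82, 0.75, 0.91]})
departed = pd.DataFrame({"emp_id": [4, 5], "satisfaction_level": [0.11, 0.38]})

# Count missing values per column in each dataset before combining them.
missing_existing = existing.isna().sum()
missing_departed = departed.isna().sum()

# Add a label column (assumed name): 0 = still employed, 1 = has left.
existing["left"] = 0
departed["left"] = 1

# Stack the two frames into a single dataset for modelling.
employees = pd.concat([existing, departed], ignore_index=True)
```

After this step `employees` holds one row per employee, with `left` available as the target for the predictive analysis.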
**Predictive Analysis**
Using predictive analysis techniques, we predicted, based on the data, whether an employee would leave the company. With the CatBoost classifier, we achieved 99.47% accuracy. Check the Jupyter notebook (.ipynb) for more details.
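To make the accuracy figure concrete, here is a small sketch of how accuracy is computed from true and predicted labels. The helper `accuracy` and the label lists are hypothetical and only illustrate the metric; they are not the project's data or evaluation code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Illustrative labels only (1 = left, 0 = stayed).
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = accuracy(y_true, y_pred)  # 7 of 8 predictions match
```

Accuracy alone can look high when one class dominates the data, so it is usually worth reading alongside per-class metrics such as precision and recall.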
### [Project 2: Insurance-Prediction](https://github.com/Justus-coded/Insurance-Prediction)
**A predictive model to determine whether a building will have an insurance claim during a certain period.**
#### Data Description
Recently, there has been an increase in the number of building collapses in Lagos and other major cities in Nigeria. An insurance company offers a building insurance policy that protects buildings against damage caused by fire, vandalism, flood, or storm.
The target variable, Claim, is:
* 1 if the building has at least one claim over the insured period.
* 0 if the building has no claim over the insured period.
#### Content
1. Exploratory Data Analysis
2. Data Cleaning and Preprocessing
3. Feature Extraction and Generation
4. Predictive Analysis
* **Exploratory Data Analysis**
Here we dig into the data using data visualization techniques to find relationships between the various features. Check the Jupyter notebook on Exploratory Data Analysis. Tableau was also used to create several dashboards.
* **Data Cleaning and Preprocessing**
Fill in all missing values, remove redundant features, and convert the data types of categorical features.
* **Feature Extraction and Generation**
Create new features from relationships between existing ones. For example, create a new column marking a painted building with a garden, or a painted building that is also fenced.
* **Predictive Analysis**
Using various machine learning algorithms such as CatBoost and XGBoost, predict whether or not a building will have an insurance claim during a certain period.
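The cleaning and feature generation steps listed above can be sketched with pandas as follows. This is only an illustration under stated assumptions: the sample rows, the column names (`Building_Painted`, `Garden`, `Building_Dimension`, `Record_Source`, `Painted_With_Garden`), and the fill strategies (median and most frequent value) are hypothetical and may differ from what the notebook does.

```python
import pandas as pd

# Hypothetical sample; the real dataset's columns may differ.
buildings = pd.DataFrame({
    "Building_Painted": ["yes", "no", "yes", None],
    "Garden": ["yes", "yes", "no", "no"],
    "Building_Dimension": [120.0, None, 300.0, 80.0],
    "Record_Source": ["survey", "survey", "survey", "survey"],  # constant, so redundant
})

# Fill missing values: median for numeric, most frequent value for categorical.
buildings["Building_Dimension"] = buildings["Building_Dimension"].fillna(
    buildings["Building_Dimension"].median()
)
buildings["Building_Painted"] = buildings["Building_Painted"].fillna(
    buildings["Building_Painted"].mode()[0]
)

# Drop a column that carries no information (a single constant value).
buildings = buildings.drop(columns=["Record_Source"])

# Convert categorical text columns to the pandas "category" dtype.
for col in ["Building_Painted", "Garden"]:
    buildings[col] = buildings[col].astype("category")

# Generate a new feature: 1 if the building is painted and has a garden.
buildings["Painted_With_Garden"] = (
    (buildings["Building_Painted"] == "yes") & (buildings["Garden"] == "yes")
).astype(int)
```

The same pattern extends to other combined flags, such as a painted and fenced building, by swapping in the relevant columns.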