https://github.com/mainak-97/bank-loan-case-study
Built with advanced Excel, this project uses Exploratory Data Analysis (EDA) to identify patterns and key indicators of loan default in a financial dataset. By analyzing customer attributes and loan characteristics, it aims to improve decision-making in loan approvals, reduce financial risk, and optimize lending strategies.
- Host: GitHub
- URL: https://github.com/mainak-97/bank-loan-case-study
- Owner: Mainak-97
- Created: 2024-08-27T19:45:48.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-28T13:51:48.000Z (5 months ago)
- Last Synced: 2024-08-28T22:04:44.441Z (5 months ago)
- Size: 13 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Bank Loan Case Study
![Logo](https://i.imgur.com/gupMXvf.jpeg)
# Project File Link:
https://drive.google.com/file/d/1NxDMRrEtSD36c0HDhw82U6AcUuqB_wuV/view
# Overview
This project involves the analysis of a dataset from a finance company that specializes in lending various types of loans to urban customers. The main goal is to use Exploratory Data Analysis (EDA) to understand the patterns in customer attributes and loan characteristics that influence the likelihood of loan default. The analysis will help the company make better decisions regarding loan approvals, reducing the risk of financial loss while ensuring that capable applicants are not rejected.
# Business Objectives:
The primary objectives of this project are:
1. **Identify Patterns Indicating Loan Default:**
- Recognize key factors that may lead to difficulties in paying installments.
2. **Improve Decision-Making:**
- Use insights from the data to make informed decisions about loan approval, potentially denying risky loans, reducing loan amounts, or adjusting interest rates for high-risk customers.
# Steps Included:
- *Data Preparation:* Begin by loading the dataset into Excel. Review and clean the data, addressing any missing values or outliers.
- *EDA Process:* Follow the outlined tasks to perform the EDA, documenting findings and insights at each step.
- *Reporting:* Use the results from the EDA to create meaningful visualizations and summaries to present the key findings.
# Analysis:
1. **Missing Data Identification and Handling.**
- **Objective:**
Identify missing data points in the loan application dataset and handle them appropriately to maintain the accuracy of the analysis.
- **Steps:**
Used Excel functions like COUNT, ISBLANK, and IF to identify missing data.
Applied imputation methods such as AVERAGE or MEDIAN to fill in missing values.
- **Visualization:**
A bar chart representing the proportion of missing values for each variable.
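For readers who want to reproduce this step outside Excel, here is a minimal Python sketch of the same idea. It is not the workbook's actual formulas: the column names and sample rows are hypothetical, and `None` stands in for a blank cell.

```python
from statistics import median

# Hypothetical sample rows; None marks a blank cell in the sheet.
rows = [
    {"income": 50000, "loan_amount": 200000},
    {"income": None, "loan_amount": 150000},
    {"income": 70000, "loan_amount": None},
    {"income": 60000, "loan_amount": 250000},
]


def missing_share(rows, column):
    """Fraction of rows where the column is blank (the ISBLANK-style count)."""
    blanks = sum(1 for row in rows if row[column] is None)
    return blanks / len(rows)


def impute_median(rows, column):
    """Return copies of the rows with blanks replaced by the column median."""
    observed = [row[column] for row in rows if row[column] is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


# Per-column proportions are what the bar chart would plot.
shares = {col: missing_share(rows, col) for col in ("income", "loan_amount")}
filled = impute_median(rows, "income")
```

The median is used here because it is less sensitive to extreme values than the mean; swapping in `statistics.mean` would mirror the AVERAGE option instead.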
2. **Identification of Outliers.**
- **Objective:**
Detect possible outliers in the dataset, focusing on numerical variables, and assess their impact on the analysis.
- **Steps:**
Used Excel's QUARTILE function to compute the interquartile range (IQR) and identify outliers.
Applied conditional formatting to highlight potential outliers and determine if they are valid or require further investigation.
- **Visualization:**
Scatter plot to visualize the distribution of numerical variables and highlight outliers.
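The IQR check can be sketched in Python as follows. This assumes the common 1.5 × IQR fence, which the project does not state explicitly, and uses made-up values. The `inclusive` method of `statistics.quantiles` interpolates the same way as Excel's QUARTILE.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] and the fences used."""
    # method="inclusive" matches the interpolation of Excel's QUARTILE.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return flagged, (lower, upper)


# Hypothetical incomes, in thousands.
incomes = [10, 12, 13, 14, 15, 16, 18, 95]
outliers, (lower, upper) = iqr_outliers(incomes)
```

Flagged values are candidates for review rather than automatic removal, in line with the step above about deciding whether they are valid.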
3. **Analyze Data Imbalance**
- **Objective:**
Determine whether the dataset is imbalanced and calculate the imbalance ratio, which is critical for reliable model-building.
- **Steps:**
Used Excel functions like COUNTIF and SUM to calculate the class proportions.
Assessed the frequency of each class to understand the extent of imbalance.
- **Visualization:**
A pie chart visualising the distribution of the target variable and highlighting the imbalance.
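A small Python sketch of the same calculation, with a hypothetical target column where 1 means default and 0 means repaid. The imbalance ratio is taken here as majority count divided by minority count, which is one common convention and an assumption on our part.

```python
from collections import Counter

# Hypothetical target column: 1 = default, 0 = repaid.
target = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

counts = Counter(target)  # the COUNTIF-style tally per class
total = sum(counts.values())
proportions = {label: n / total for label, n in counts.items()}

# Majority-to-minority ratio; 1.0 would mean perfectly balanced classes.
imbalance_ratio = max(counts.values()) / min(counts.values())
```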
4. **Univariate, Segmented Univariate, and Bivariate Analysis.**
- **Objective:**
Conduct several analyses to gain insights into the factors driving loan defaults.
- **Steps:**
Performed univariate analysis to understand the distribution of individual variables.
Conducted segmented univariate analysis to compare distributions across different scenarios.
Explored bivariate relationships between variables and the target variable.
- **Visualization:**
Column charts visualising the distributions and relationships.
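As an illustration of the segmented view, the sketch below computes the default rate per segment in Python. The segment labels (`Cash`, `Revolving`) and the records are hypothetical and only stand in for whatever categorical column is being compared.

```python
from collections import defaultdict

# Hypothetical (segment, defaulted) pairs; defaulted is 1 or 0.
records = [
    ("Cash", 1), ("Cash", 0), ("Cash", 0), ("Cash", 0),
    ("Revolving", 0), ("Revolving", 1),
]


def default_rate_by_segment(records):
    """Share of defaulted loans within each segment."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for segment, defaulted in records:
        totals[segment] += 1
        defaults[segment] += defaulted
    return {segment: defaults[segment] / totals[segment] for segment in totals}


rates = default_rate_by_segment(records)
```

Each value in `rates` corresponds to one bar in a column chart comparing segments.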
5. **Identify Top Correlations for Different Scenarios.**
- **Objective:**
Segment the dataset by different scenarios and identify the top correlations between variables and the target variable.
- **Steps:**
Used Excel's CORREL function to calculate correlation coefficients within each segment.
Ranked the correlations to identify the top indicators of loan default.
- **Visualization:**
Correlation matrices and heatmaps to visualize correlations, with the top correlated variables highlighted.
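The ranking step can be sketched in Python for a single segment. `correl` below computes the Pearson coefficient, which is what Excel's CORREL returns. The column names and values are hypothetical, and ranking by absolute value is an assumption about how "top" correlations are chosen.

```python
from math import sqrt


def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_correlations(columns, target):
    """Sort variables by the strength of their correlation with the target."""
    scores = {name: correl(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data for one segment; target is 1 = default, 0 = repaid.
target = [0, 0, 1, 1]
segment_columns = {
    "income": [4, 3, 2, 1],
    "loan_amount": [1, 2, 1, 2],
    "age": [1, 1, 2, 2],
}
ranked = rank_correlations(segment_columns, target)
```

Running `rank_correlations` once per segment and comparing the top entries gives the per-scenario view described above.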
## Screenshots
![Image1](https://i.imgur.com/fwASt2j.jpeg)
![Image2](https://i.imgur.com/WF65dgw.jpeg)
![Image3](https://i.imgur.com/sfAUu5O.jpeg)
![Image4](https://i.imgur.com/nETTIi8.jpeg)
![Image5](https://i.imgur.com/UEyFj1K.jpeg)
![Image6](https://i.imgur.com/EIWFb3R.jpeg)
![Image7](https://i.imgur.com/juFvaLM.jpeg)
![Image8](https://i.imgur.com/R7A3MW3.jpeg)
- Univariate Analysis
![Image9](https://i.imgur.com/SXL99kD.jpeg)
- Segmented Univariate Analysis
![Image10](https://i.imgur.com/njOeZMH.jpeg)
- Bivariate Analysis
![Image11](https://i.imgur.com/poz263v.jpeg)
![Image12](https://i.imgur.com/tOUDvco.jpeg)
![Image13](https://i.imgur.com/1cTnW8e.jpeg)
# Conclusion
This project provides a detailed approach to understanding the factors that influence loan defaults using EDA techniques. The insights derived from this analysis can significantly improve the decision-making process in loan approvals, helping to mitigate risks while maximizing business opportunities.
## Tech Stack
- **Excel:** For data analysis, visualization, and statistical computations.
- **PowerPoint:** For presenting findings and insights.
- **Loom Video:** Video presentation.
### Links
- *Loom Video Presentation*:
https://www.loom.com/share/00f0b8072f4b492684f82f4cdaca7858?sid=8972302a-0492-4f07-a3d1-b5bc702954ac
### Author
- **Mainak Mukherjee**
- Email: [email protected]
- Linkedin: www.linkedin.com/in/mainak8
### Concepts Used
* Advanced Excel Techniques
* Data Visualization
* Statistical Knowledge
## Project Impact and Learning Experience
- **Project Impact:**
This project provided critical insights into the factors that influence loan defaults, enabling more informed decision-making in the loan approval process. By identifying key indicators of risk, the company can now better assess applicants and tailor loan offerings, leading to reduced financial losses and optimized business operations. The EDA techniques applied in this project help ensure that capable applicants are not unjustly rejected while minimizing the approval of high-risk loans, thereby improving overall financial stability and customer satisfaction.
- **Learning Experience:**
Working on this project offered valuable experience in applying Exploratory Data Analysis (EDA) to real-world financial data. It enhanced my ability to identify and handle missing data, detect outliers, and manage data imbalance, all critical skills in data analytics. Additionally, the project deepened my understanding of risk analytics in banking, particularly how customer and loan attributes can predict loan defaults. The process of segmenting data and identifying correlations provided hands-on experience in using Excel’s analytical tools, which will be beneficial in future data-driven decision-making scenarios.