https://github.com/rbhatia46/lendingclub-loan-analysis
Lending Club Loan Default Analysis using historic loan applications data.
exploratory-data-analysis finance loan-default-prediction matplotlib pandas python seaborn
- Host: GitHub
- URL: https://github.com/rbhatia46/lendingclub-loan-analysis
- Owner: rbhatia46
- Created: 2019-07-28T04:03:56.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-07-28T04:42:17.000Z (over 6 years ago)
- Last Synced: 2025-01-24T18:37:04.889Z (10 months ago)
- Topics: exploratory-data-analysis, finance, loan-default-prediction, matplotlib, pandas, python, seaborn
- Language: Jupyter Notebook
- Homepage:
- Size: 8.4 MB
- Stars: 4
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# LendingClub-Loan-Analysis

Lending to ‘risky’ applicants is the largest source of financial loss (called credit loss) for any bank or lending company. If these risky applicants can be identified, fewer such loans will be issued, which reduces the amount of credit loss. Identifying such applicants through data analysis is the aim of this case study. Lending Club (a peer-to-peer lending company) wants to understand the driving factors behind loan default, and can use this knowledge for its portfolio and risk assessment.

Two types of risk are associated with the lender’s decision:
* If the applicant is likely to repay the loan, then not approving the loan results in a loss of business to the company.
* If the applicant is not likely to repay the loan, i.e. is likely to default, then approving the loan may lead to a financial loss for the company.
## Data Used
The data was acquired from Kaggle. LendingClub itself open-sourced it to invite data scientists to help identify the driving factors behind loan default, using historic loan application data.
* Be sure to check out the Data Dictionary for the meaning of each column in the dataset.
* The data contains information about past loan applicants and whether they ‘defaulted’ or not. The aim is to identify patterns that indicate whether a person is likely to default, which may be used to take actions such as denying the loan, reducing the loan amount, or lending (to risky applicants) at a higher interest rate.
* When a person applies for a loan, the company can take one of two decisions:
    * **Loan accepted** - If the company approves the loan, there are 3 possible scenarios:
        1. Fully paid: The applicant has fully repaid the loan (the principal and the interest).
        2. Current: The applicant is still paying the instalments, i.e. the tenure of the loan is not yet complete. These applicants are not labelled as 'defaulted'.
        3. Charged-off: The applicant has not paid the instalments in due time for a long period, i.e. has defaulted on the loan.
    * **Loan rejected** - The company rejected the loan (because the applicant did not meet its requirements, etc.). Since the loan was rejected, these applicants have no transactional history with the company, so this data is not available to the company (and thus is not in this dataset).
* This is a typical data exploration project: it tries to surface as many findings as possible without any Machine Learning, using only plain EDA and data visualization.
* The analysis and visualizations in the notebook are meant to be self-explanatory, showing the driving factors to weigh for a new loan application so the company can act accordingly.
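
As a rough illustration of the kind of analysis described above, the sketch below labels each loan from its status, sets aside 'Current' loans (whose outcome is not yet known), and compares default rates across a categorical segment. It is not taken from the notebook: the sample rows are made up, and the column names `loan_status` and `grade`, as well as the exact status strings, are assumptions that should be checked against the Data Dictionary.

```python
import pandas as pd

# Hypothetical sample rows for illustration only; the real dataset is far
# larger. Column names and status strings are assumptions, not confirmed
# by this README.
loans = pd.DataFrame(
    {
        "loan_status": [
            "Fully Paid", "Charged Off", "Current", "Fully Paid",
            "Charged Off", "Fully Paid", "Current", "Fully Paid",
        ],
        "grade": ["A", "C", "B", "A", "C", "B", "A", "C"],
    }
)

# 'Current' loans have no final outcome yet, so they are excluded rather
# than being counted as either repaid or defaulted.
closed = loans[loans["loan_status"] != "Current"].copy()

# Binary flag: 1 if the loan was charged off (defaulted), 0 if fully paid.
closed["defaulted"] = (closed["loan_status"] == "Charged Off").astype(int)

# Default rate per segment: the mean of a 0/1 flag is the share of defaults.
default_rate = (
    closed.groupby("grade")["defaulted"]
    .agg(n_loans="count", default_rate="mean")
    .sort_values("default_rate", ascending=False)
)

print(default_rate)
```

Swapping `grade` for any other categorical column gives the same comparison for that factor. Segments with very few loans produce noisy rates, which is why the loan count is kept next to each rate.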