https://github.com/nouranhaitham/ml_loanprediction
A notebook aimed to predict loan eligibility with high accuracy using advanced machine learning models including Logistic Regression and KNN.
https://github.com/nouranhaitham/ml_loanprediction
classification-model decision-making insights knn-classifier loan-prediction-analysis logist logistic-regression machine-learning python regression-models
Last synced: 8 months ago
JSON representation
A notebook aimed to predict loan eligibility with high accuracy using advanced machine learning models including Logistic Regression and KNN.
- Host: GitHub
- URL: https://github.com/nouranhaitham/ml_loanprediction
- Owner: NouranHaitham
- Created: 2024-02-10T13:44:41.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-16T22:04:41.000Z (about 1 year ago)
- Last Synced: 2025-01-15T07:36:50.559Z (9 months ago)
- Topics: classification-model, decision-making, insights, knn-classifier, loan-prediction-analysis, logist, logistic-regression, machine-learning, python, regression-models
- Language: Jupyter Notebook
- Homepage:
- Size: 391 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Loan Prediction Machine Learning Project 📊💡
Welcome to the Loan Prediction Machine Learning Project repository! This project focuses on predicting loan eligibility based on various customer attributes. By leveraging classification algorithms, we aim to develop a robust model that accurately assesses whether a customer qualifies for a loan.
## Overview 🌟
Our project encompasses the following key stages:
1. **Data Exploration:** We conducted an in-depth analysis of the dataset to understand the impact of attributes like marital status, education, and employment status on loan eligibility.
2. **Data Preprocessing:** We handled missing values, outliers, and encoded categorical variables to prepare the data for model training.
3. **Feature Analysis:** We explored trends and correlations in the data, uncovering insights into factors affecting loan approvals.
4. **Model Implementation:** We implemented and compared classification models, including Logistic Regression and K-Nearest Neighbors (KNN), to determine the best-performing algorithm for loan prediction.
5. **Performance Evaluation:** The models were evaluated using accuracy, precision, recall, and F1-score metrics. Logistic Regression emerged as the superior model with an accuracy of 79% compared to KNN's 77%.
6. **Predicting Loan Status:** We used the Logistic Regression model to predict loan status for new customers and analyzed the results to gain further insights.
## Key Features 🚀
- **Logistic Regression:** A powerful classification model that achieved an accuracy of 79%, effectively identifying eligible and ineligible loan applicants.
- **K-Nearest Neighbors (KNN):** An alternative classification model evaluated using GridSearchCV to find the optimal number of neighbors. Although KNN performed well, Logistic Regression proved to be more accurate.- **Data Insights:** Analyzed patterns in the new customer data to identify trends such as the percentage of married individuals in semiurban areas who obtained loans.
- **Visualization:** Utilized visualizations to understand the distribution of loan status across different attributes like marital status and employment.
## Getting Started 🚀
1. **Clone the Repository:**
```bash
git clone https://github.com/yourusername/loan-prediction-project.git
```
2. **Navigate to the Project Directory:**
```bash
cd loan-prediction-project
```
3. **Install Dependencies:**
```bash
pip install -r requirements.txt
```
4. **Run Data Analysis and Model Training Scripts:**
To explore and preprocess the data:
```bash
jupyter notebook data_analysis.ipynb
```
To train and evaluate the models:
```bash
python train_models.py
```## Performance Metrics 📈
- **Logistic Regression:**
- **Accuracy:** 79%
- **Precision:**
- Class 0: 0.95
- Class 1: 0.76
- **Recall:**
- Class 0: 0.40
- Class 1: 0.99
- **F1-score:**
- Class 0: 0.56
- Class 1: 0.86- **K-Nearest Neighbors (KNN):**
- **Accuracy:** 77%
- **Precision:**
- Class 0: 0.81
- Class 1: 0.76
- **Recall:**
- Class 0: 0.42
- Class 1: 0.95
- **F1-score:**
- Class 0: 0.55
- Class 1: 0.84## Insights and Analysis 🔍
- The Logistic Regression model demonstrates superior performance compared to K-Nearest Neighbors (KNN) with a higher accuracy of 79% versus 77%.
- Key insights from the new customer data reveal significant trends, such as the percentage of married individuals in semiurban areas who secured loans.