Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/keshav434/loan-application-data-analysis
Data analysis and visualization project involving bias detection and building predictive models using Python.
data-analytics data-visualization python
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/keshav434/loan-application-data-analysis
- Owner: keshav434
- Created: 2024-07-19T10:29:08.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-08-18T18:41:15.000Z (5 months ago)
- Last Synced: 2024-08-18T19:53:56.635Z (5 months ago)
- Topics: data-analytics, data-visualization, python
- Language: Jupyter Notebook
- Homepage:
- Size: 864 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Project Info:
Rahuri Finance needs an AI model that predicts the "Loan Payment Failure" tendency of a given loan application (a binary classification problem). Additionally, we need to identify any bias towards specific attributes.
Project Steps:
(a) Preprocessing:
Detail the preprocessing steps, including handling missing values, plotting graphs, data discretization, normalization, and data encoding.
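The preprocessing steps listed above could look roughly like the sketch below. It is a minimal illustration, not the notebook's actual code: the README does not list the dataset's columns, so the column names (`income`, `age`, `gender`), the toy values, and the age bin edges are all assumptions. Plotting is left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data; the real dataset's columns are not given in the README.
loans = pd.DataFrame({
    "income": [42000.0, np.nan, 58000.0, 31000.0, 77000.0, 50000.0],
    "age": [25.0, 41.0, np.nan, 33.0, 52.0, 38.0],
    "gender": ["F", "M", "M", None, "F", "M"],
})

clean = loans.copy()

# 1. Missing values: median for numeric columns, most frequent value for categorical ones.
for col in ["income", "age"]:
    clean[col] = clean[col].fillna(clean[col].median())
clean["gender"] = clean["gender"].fillna(clean["gender"].mode()[0])

# 2. Discretization: bucket age into bands (bin edges are an assumption).
clean["age_band"] = pd.cut(
    clean["age"], bins=[0, 30, 45, 120], labels=["young", "middle", "senior"]
)

# 3. Normalization: rescale income to the [0, 1] range.
clean["income"] = MinMaxScaler().fit_transform(clean[["income"]]).ravel()

# 4. Encoding: one-hot encode the categorical columns.
encoded = pd.get_dummies(clean, columns=["gender", "age_band"], dtype=int)

print(encoded.head())
```

In a real run, the scaler and any imputation statistics should be fitted on the training split only, so that no information leaks from the validation data.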
(b) Initial Data Exploration:
Utilize techniques like association rule mining and clustering for initial data exploration to identify potential biases. Report any findings.
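One way to approach this exploration is sketched below. It is only an illustration under stated assumptions: the columns (`gender`, `income`, `debt_ratio`, `failed`) and the toy values are hypothetical, and the helper `rule_metrics` is not a library function. It scores single-antecedent rules by hand with pandas rather than running a full Apriori-style itemset search, then uses scikit-learn's `KMeans` to check whether clusters are skewed towards one group.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data; column names and values are assumptions.
df = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "income": [0.2, 0.3, 0.8, 0.9, 0.2, 0.3, 0.8, 0.9],
    "debt_ratio": [0.8, 0.7, 0.2, 0.1, 0.8, 0.9, 0.1, 0.2],
    "failed": [1, 1, 1, 0, 1, 0, 0, 0],
})


def rule_metrics(data, antecedent_col, antecedent_val, consequent_col, consequent_val):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    a = data[antecedent_col] == antecedent_val
    c = data[consequent_col] == consequent_val
    support = (a & c).mean()
    confidence = (a & c).sum() / a.sum()
    lift = confidence / c.mean()
    return {"support": support, "confidence": confidence, "lift": lift}


# Compare how strongly each group is associated with payment failure.
rule_f = rule_metrics(df, "gender", "F", "failed", 1)
rule_m = rule_metrics(df, "gender", "M", "failed", 1)
print("gender=F -> failed=1:", rule_f)
print("gender=M -> failed=1:", rule_m)

# Cluster on financial features only, then inspect the group mix per cluster.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
df["cluster"] = kmeans.fit_predict(df[["income", "debt_ratio"]])
print(pd.crosstab(df["cluster"], df["gender"], normalize="index"))
```

A lift well above 1 for one group and well below 1 for another, with similar financial profiles in each cluster, is the kind of pattern worth reporting as a potential bias. It is a signal to investigate, not proof of unfair treatment.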
(c) Building the Classification Model:
Explore various classification techniques (e.g., Decision Trees, Neural Networks, Naïve Bayes, SVM, and Ensemble methods).
Evaluate training and validation errors to determine model suitability.
Use cross-validation for model assessment and perform hyperparameter tuning with tools like GridSearchCV.
Report the final cross-validated accuracies and optimal hyperparameters for each classification model.
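The evaluation loop described above can be sketched as follows. This is a minimal example under assumptions, not the project's reported setup: it uses synthetic data from `make_classification` in place of the loan dataset, covers only three of the listed model families, and the hyperparameter grids are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed loan data (assumption).
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Training vs. validation error for an untuned tree: a large gap suggests overfitting.
scores = cross_validate(
    DecisionTreeClassifier(random_state=0), X, y, cv=cv, return_train_score=True
)
train_error = 1 - scores["train_score"].mean()
val_error = 1 - scores["test_score"].mean()
print(f"decision tree: train error={train_error:.3f}, validation error={val_error:.3f}")

# Candidate models with illustrative hyperparameter grids (assumptions).
candidates = {
    "decision_tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, 8, None]}),
    "svm": (make_pipeline(StandardScaler(), SVC()), {"svc__C": [0.1, 1, 10]}),
    "naive_bayes": (GaussianNB(), {"var_smoothing": [1e-9, 1e-7]}),
}

# Tune each model with cross-validated grid search and record the best result.
results = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

for name, (accuracy, params) in results.items():
    print(f"{name}: cv accuracy={accuracy:.3f}, best params={params}")
```

Putting the scaler inside the SVM pipeline means it is refitted on each training fold, which keeps the cross-validated accuracy honest. Neural networks and ensemble methods would slot into the same `candidates` dictionary.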
Tools and Libraries:
Python Libraries: Pandas, NumPy, Matplotlib, and Scikit-learn.