{"id":18331023,"url":"https://github.com/dataquestio/loan-prediction","last_synced_at":"2025-04-07T15:07:26.059Z","repository":{"id":52380710,"uuid":"62182093","full_name":"dataquestio/loan-prediction","owner":"dataquestio","description":"Predict which loans will be foreclosed on.","archived":false,"fork":false,"pushed_at":"2023-05-10T09:22:52.000Z","size":5,"stargazers_count":219,"open_issues_count":5,"forks_count":142,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-03-31T12:08:08.253Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dataquestio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-06-29T00:07:26.000Z","updated_at":"2025-02-23T16:08:48.000Z","dependencies_parsed_at":"2024-11-05T19:54:28.596Z","dependency_job_id":null,"html_url":"https://github.com/dataquestio/loan-prediction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataquestio%2Floan-prediction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataquestio%2Floan-prediction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataquestio%2Floan-prediction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataquestio%2Floan-prediction/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dataquestio","download_url":"https://code
load.github.com/dataquestio/loan-prediction/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247675597,"owners_count":20977376,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T19:27:46.655Z","updated_at":"2025-04-07T15:07:26.037Z","avatar_url":"https://github.com/dataquestio.png","language":"Python","readme":"Loan Prediction\n-----------------------\n\nPredict whether or not loans acquired by Fannie Mae will go into foreclosure.  Fannie Mae acquires loans from other lenders as a way of inducing them to lend more.  Fannie Mae releases data on the loans it has acquired and their performance afterwards [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).\n\nInstallation\n----------------------\n\n### Download the data\n\n* Clone this repo to your computer.\n* Get into the folder using `cd loan-prediction`.\n* Run `mkdir data`.\n* Switch into the `data` directory using `cd data`.\n* Download the data files from Fannie Mae into the `data` directory.  
\n    * You can find the data [here](http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html).\n    * You'll need to register with Fannie Mae to download the data.\n    * It's recommended to download all the data from 2012 Q1 to present.\n* Extract all of the `.zip` files you downloaded.\n    * On OSX, you can run `find ./ -name \\*.zip -exec unzip {} \\;`.\n    * At the end, you should have a bunch of text files called `Acquisition_YQX.txt`, and `Performance_YQX.txt`, where `Y` is a year, and `X` is a number from `1` to `4`.\n* Remove all the zip files by running `rm *.zip`.\n* Switch back into the `loan-prediction` directory using `cd ..`.\n\n### Install the requirements\n \n* Install the requirements using `pip install -r requirements.txt`.\n    * Make sure you use Python 3.\n    * You may want to use a virtual environment for this.\n\nUsage\n-----------------------\n\n* Run `mkdir processed` to create a directory for our processed datasets.\n* Run `python assemble.py` to combine the `Acquisition` and `Performance` datasets.\n    * This will create `Acquisition.txt` and `Performance.txt` in the `processed` folder.\n* Run `python annotate.py`.\n    * This will create training data from `Acquisition.txt` and `Performance.txt`.\n    * It will add a file called `train.csv` to the `processed` folder.\n* Run `python predict.py`.\n    * This will run cross validation across the training set, and print the accuracy score.\n\nExtending this\n-------------------------\n\nIf you want to extend this work, here are a few places to start:\n\n* Generate more features in `annotate.py`.\n* Switch algorithms in `predict.py`.\n* Add in a way to make predictions on future data.\n* Try seeing if you can predict if a bank should have issued the loan.\n    * Remove any columns from `train` that the bank wouldn't have known at the time of issuing the loan.\n        * Some columns are known when Fannie Mae bought the loan, but not before\n    * Make 
predictions.\n* See whether you can predict columns other than `foreclosure_status`.\n    * Can you predict how much the property will be worth at sale time?\n* Explore the nuances between performance updates.\n    * Can you predict how many times the borrower will be late on payments?\n    * Can you map out the typical loan lifecycle?","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataquestio%2Floan-prediction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdataquestio%2Floan-prediction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataquestio%2Floan-prediction/lists"}