https://github.com/kwokhing/wids-datathon-patient-survival
A challenge to create a model that uses data from the first 24 hours of intensive care to predict patient survival
feature-engineering gradient-boosting-machine imputation kaggle lightgbm machine-learning
Last synced: 8 months ago
- Host: GitHub
- URL: https://github.com/kwokhing/wids-datathon-patient-survival
- Owner: KwokHing
- Created: 2021-07-31T07:18:09.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2023-10-22T06:49:09.000Z (about 2 years ago)
- Last Synced: 2025-01-30T05:26:54.583Z (10 months ago)
- Topics: feature-engineering, gradient-boosting-machine, imputation, kaggle, lightgbm, machine-learning
- Language: Jupyter Notebook
- Homepage: https://www.kaggle.com/competitions/widsdatathon2020
- Size: 54.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## WiDS Datathon On Predicting Patient Survival
This repo provides the submission entry for a [Kaggle datathon](https://www.kaggle.com/competitions/widsdatathon2020) to create a model that uses data from the first 24 hours of intensive care to predict patient survival.
The dataset has many missing values, so the key to higher accuracy lies in (i) imputing missing data, (ii) feature engineering based on domain knowledge (e.g. calculating BMI or other medical metrics), and (iii) feature selection (the dataset has far too many features, and not all are useful). LightGBM is used to achieve approximately 90% accuracy. I believe that any gradient-boosted model with decent work on data imputation, feature engineering, and feature selection should provide a fairly accurate prediction model.
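
The three preprocessing steps above can be illustrated with a minimal, standard-library-only sketch. It is not the notebook's actual code: the column names (`height_cm`, `weight_kg`, `rare_lab`), the toy records, and the 50% missing-value threshold are all hypothetical, and median imputation is just one simple choice among many. The model-fitting step with LightGBM is left out on purpose.

```python
from statistics import median

# Toy patient records; None marks a missing value.
# Column names and values are hypothetical, for illustration only.
patients = [
    {"height_cm": 180.0, "weight_kg": 81.0, "rare_lab": None},
    {"height_cm": None, "weight_kg": 60.0, "rare_lab": None},
    {"height_cm": 160.0, "weight_kg": None, "rare_lab": 1.2},
    {"height_cm": 170.0, "weight_kg": 70.0, "rare_lab": None},
]


def missing_fraction(rows, column):
    """Return the share of rows in which `column` is missing."""
    return sum(r[column] is None for r in rows) / len(rows)


def select_features(rows, columns, max_missing=0.5):
    """Keep only columns whose missing share is at most `max_missing`."""
    return [c for c in columns if missing_fraction(rows, c) <= max_missing]


def impute_median(rows, column):
    """Fill missing values in `column` with the median of the observed ones."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def add_bmi(rows):
    """Add a BMI feature: weight in kg divided by squared height in metres."""
    return [
        {**r, "bmi": r["weight_kg"] / (r["height_cm"] / 100) ** 2} for r in rows
    ]


# (iii) feature selection: drop columns that are mostly missing.
kept = select_features(patients, ["height_cm", "weight_kg", "rare_lab"])

# (i) imputation: fill the remaining gaps in the kept columns.
prepared = patients
for col in kept:
    prepared = impute_median(prepared, col)

# (ii) feature engineering: derive BMI from the imputed height and weight.
prepared = add_bmi(prepared)

print(kept)
for row in prepared:
    print(round(row["bmi"], 1))
```

On a real dataset the same logic would typically be written with a dataframe library, and the prepared features would then be passed to a gradient-boosted model.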

## Getting started
Open `WiDS_Patient_Survival.ipynb` in a Jupyter notebook environment. Alternatively, you can view the code on [Google Colab](https://colab.research.google.com/drive/1SyQV6VI7hIbXPwTPOzhsOOoGgqRXwR2w?usp=sharing). The notebook contains further technical details.
## Improvements
A potential improvement is to explore deep learning techniques.