https://github.com/compcode1/liver-disease-predictions-ml
The goal of this project was to develop a machine learning model to predict Liver Disease (Liver Dx) using health-related features, leveraging the powerful XGBoost algorithm.
https://github.com/compcode1/liver-disease-predictions-ml
Last synced: 10 months ago
JSON representation
The goal of this project was to develop a machine learning model to predict Liver Disease (Liver Dx) using health-related features, leveraging the powerful XGBoost algorithm.
- Host: GitHub
- URL: https://github.com/compcode1/liver-disease-predictions-ml
- Owner: Compcode1
- License: gpl-3.0
- Created: 2024-10-17T23:31:59.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-17T23:38:13.000Z (over 1 year ago)
- Last Synced: 2025-03-18T01:15:51.183Z (about 1 year ago)
- Language: Jupyter Notebook
- Size: 164 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
The goal of this project was to develop a machine learning model to predict **Liver Disease (Liver Dx)** using health-related features, leveraging the powerful **XGBoost** algorithm. Machine learning models in healthcare applications require careful balancing between precision (avoiding false positives) and recall (minimizing false negatives) to ensure reliable predictions. This project aimed to create a model that can accurately predict liver disease while maintaining a strong balance between these two metrics.
### General Approach:
The typical machine learning pipeline involves steps such as data preprocessing, feature engineering, model selection, and hyperparameter tuning. Hyperparameter tuning is usually a crucial step to refine the model and improve its generalization capabilities. However, in this project, the initial evaluation of the XGBoost model showed such outstanding results that we opted to deviate from the usual workflow. Instead of focusing on hyperparameter optimization, we shifted our focus toward adjusting the classification **threshold**.
### Deviation from Hyperparameter Tuning:
After initial model training, the results were highly promising:
- **AUC-ROC**: 0.9994
- **Accuracy**: 0.9898
- **Precision**: 0.9743
- **Recall**: 0.9646
- **F1 Score**: 0.9694
Given the strength of these results, the need for hyperparameter tuning was minimized. Rather than spending time on tuning model parameters like `max_depth`, `n_estimators`, or `learning_rate`, we decided to focus on adjusting the **decision threshold**. This allowed us to refine the balance between **Precision** and **Recall** without the additional complexity of hyperparameter tuning.
### Focus on Threshold Adjustment:
Adjusting the threshold enables the model to emphasize either **Precision** (reducing false positives) or **Recall** (reducing false negatives), depending on the specific requirements of the problem. In healthcare applications like liver disease detection, reducing false negatives (i.e., missing actual cases of liver disease) is often more critical. However, we also wanted to explore a scenario where reducing false positives would be more valuable.
The primary benefit of adjusting the threshold is that it allows us to control the trade-off between **Precision** and **Recall** directly. By raising or lowering the threshold, we can skew the model's predictions to suit the needs of a particular use case. This flexibility makes the model more adaptable to real-world applications where different levels of risk tolerance might be required.
### Conclusion:
In this project, by focusing on threshold adjustments, we were able to explore how the model behaves under different decision-making scenarios, prioritizing either **Precision** or **Recall** depending on the context. This approach streamlined the development process, leveraging the already excellent baseline performance of the XGBoost model. This deviation from the traditional model optimization route showcases how strong initial model performance can simplify the workflow, allowing for a more targeted fine-tuning process focused on achieving specific goals.