https://github.com/kurkigal/mental-health-prediction-xgboost

Last synced: 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/kurkigal/mental-health-prediction-xgboost
Owner: KurKigal
License: mit
Created: 2025-08-26T13:33:12.000Z (2 months ago)
Default Branch: main
Last Pushed: 2025-08-26T13:35:06.000Z (2 months ago)
Last Synced: 2025-08-26T18:28:57.476Z (2 months ago)
Language: Jupyter Notebook
Size: 10.7 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Mental Health Prediction with XGBoost

This project develops a machine learning model to predict depression risk using XGBoost classifier, achieving 93.8% accuracy on mental health survey data.

## Project Objective
To predict depression risk based on various lifestyle, academic, and professional factors using machine learning techniques for early mental health intervention.

## Dataset Information
- **Source**: Kaggle Playground Series S4E11
- **Size**: 140,700 training samples × 20 features
- **Target**: Binary classification (Depression: 0/1)
- **Features**: Demographics, lifestyle, work/study patterns, mental health indicators

## Key Features
- **Demographics**: Age, Gender, City
- **Education/Work**: Academic pressure, work pressure, CGPA, job satisfaction
- **Lifestyle**: Sleep duration, dietary habits, work/study hours
- **Mental Health**: Suicidal thoughts, family history, financial stress

## Technologies Used
```python
pandas==1.5.0
numpy==1.23.0
xgboost==1.7.0
scikit-learn==1.1.0
matplotlib==3.6.0
seaborn==0.12.0
```

## Data Preprocessing
- **Missing Value Treatment**:
- Categorical: Mode imputation
- Numerical: Mean imputation
- **Encoding**: One-hot encoding for categorical variables
- **Train-Validation Split**: 80-20 ratio

## Model Performance
### XGBoost Classifier Results:
- **Accuracy**: 93.81%
- **Precision**:
- Class 0 (No Depression): 96%
- Class 1 (Depression): 84%
- **Recall**:
- Class 0: 96%
- Class 1: 82%
- **F1-Score**:
- Class 0: 96%
- Class 1: 83%

### Class Distribution:
- **No Depression (0)**: 22,986 samples (81.7%)
- **Depression (1)**: 5,154 samples (18.3%)

## Installation & Usage

### Requirements
```bash
pip install pandas numpy xgboost scikit-learn matplotlib seaborn jupyter
```

### Data Download
Download the dataset from Kaggle:
```bash
kaggle competitions download -c playground-series-s4e11
```

### Execution
```bash
jupyter notebook mental-health-prediction-xgboost-93-acc.ipynb
```

## Model Insights
- **High Accuracy**: 93.8% overall classification accuracy
- **Balanced Performance**: Good precision-recall balance for both classes
- **Feature Importance**: Work pressure, sleep patterns, and financial stress are key predictors
- **Scalable**: XGBoost handles large datasets efficiently

## Key Findings
1. **Strong Predictors**: Suicidal thoughts history, family mental illness history, and work pressure are significant indicators
2. **Lifestyle Impact**: Sleep duration and dietary habits correlate with depression risk
3. **Academic vs Professional**: Both academic and work-related stress contribute to mental health outcomes
4. **Gender Patterns**: Model captures gender-specific depression risk patterns

## Applications
- **Healthcare Screening**: Early depression risk assessment
- **Educational Support**: Student mental health monitoring
- **Workplace Wellness**: Employee mental health programs
- **Research**: Mental health epidemiology studies

## Limitations
- Based on self-reported survey data
- May not capture all depression complexity
- Requires clinical validation for real-world deployment
- Potential bias in survey responses

## Future Improvements
- Feature engineering for temporal patterns
- Ensemble methods with multiple algorithms
- Cross-validation for robust performance
- External dataset validation
- Clinical expert consultation

## Ethical Considerations
This model is intended for research and screening purposes only. Professional medical consultation is essential for any mental health diagnosis or treatment decisions.

## Developer
- **Kaggle**: [@kurkigal](https://www.kaggle.com/kurkigal)

## License
MIT License

## Contributing
Contributions welcome! Please ensure ethical guidelines are followed when working with mental health data.

---
⭐ If this project helps advance mental health research, please consider starring it!

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/kurkigal/mental-health-prediction-xgboost

Awesome Lists containing this project

README