https://github.com/birmingham-and-solihull-ics/unsupervised-clustering-practices
K-means and hierarchical clustering of GP practices based on QOF performance measures
hierachical-clustering k-means-clustering python qof unsupervised-machine-learning
- Host: GitHub
- URL: https://github.com/birmingham-and-solihull-ics/unsupervised-clustering-practices
- Owner: Birmingham-and-Solihull-ICS
- License: other
- Created: 2025-02-20T10:54:09.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-02-20T10:57:43.000Z (4 months ago)
- Last Synced: 2025-03-11T09:44:32.898Z (4 months ago)
- Topics: hierachical-clustering, k-means-clustering, python, qof, unsupervised-machine-learning
- Homepage:
- Size: 24.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# **GP Practice Segmentation Using Unsupervised Machine Learning**
## **Project Overview**
This project aims to apply **unsupervised machine learning** techniques, specifically **K-Means clustering** and **Hierarchical clustering**, to segment **General Practitioner (GP) practices** based on **Quality and Outcomes Framework (QOF) performance measures** and other **demographic markers**. By identifying meaningful clusters, this analysis will help uncover patterns in healthcare performance and inform targeted interventions.

## **Objectives**
- Explore and preprocess **QOF performance data** and **demographic indicators**.
- Apply **K-Means** and **Hierarchical clustering** to group GP practices into meaningful segments.
- Evaluate the clustering results to identify patterns and insights.
- Visualize and interpret the clusters for actionable recommendations.

## **Project Status**
🟢 **Ongoing** (Early Stages)
- **Data collection**: In progress
- **Data cleaning and preprocessing**: Not started
- **Feature selection**: Not started
- **Clustering implementation**: Not started
- **Evaluation and visualization**: Not started

## **Data Sources**
The analysis will utilize publicly available data, including but not limited to:
- **Quality and Outcomes Framework (QOF)** – Performance measures of GP practices.
- **Demographic Data** – Population characteristics, socioeconomic factors, and regional variations.
- **Additional Datasets** – Any relevant healthcare indicators.

## **Methodology**
1. **Data Preparation**
- Collect and preprocess **QOF and demographic data**.
- Handle missing values and outliers, and standardize features.
2. **Feature Engineering**
- Select relevant performance and demographic metrics.
- Apply **dimensionality reduction (if needed)** to improve clustering performance.
3. **Clustering Techniques**
- Implement **K-Means clustering** to segment GP practices.
- Use **Hierarchical clustering** for alternative segmentation and comparison.
4. **Model Evaluation**
- Use the **Elbow Method** and **Silhouette Score** to determine the optimal number of clusters.
- Compare clustering results and interpret key patterns.
5. **Visualization & Insights**
- Generate **heatmaps**, **scatter plots**, and **geospatial maps** to illustrate cluster characteristics.
- Summarize findings to inform policy recommendations.

This repository is dual licensed under the [Open Government Licence v3](https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/) and the MIT License. All code and outputs are subject to Crown Copyright.
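
As a rough illustration of the methodology steps listed above (standardizing features, K-Means clustering, and choosing the number of clusters by Silhouette Score), here is a minimal dependency-free sketch. It is not the project's implementation, which has not been started. The helper names (`standardize`, `kmeans`, `silhouette_score`) and the toy figures are assumptions made up for this example. A real analysis would likely use an established library and real QOF data, and would also cover hierarchical clustering, which is omitted here.

```python
import math
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so indicators on different scales are comparable."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero on constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; values near 1 indicate well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            mean([math.dist(p, q) for q in other])
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / (max(a, b) or 1.0))
    return mean(scores)


# Hypothetical toy data, NOT real figures: one row per practice,
# columns are (QOF achievement %, registered list size).
practices = [
    [95.0, 4000], [93.5, 4500], [96.2, 3800],
    [78.0, 12000], [80.5, 11000], [76.3, 12500],
]

X = standardize(practices)
scores = {k: silhouette_score(X, kmeans(X, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)  # k with the highest mean silhouette
labels = kmeans(X, best_k)
print(best_k, labels)
```

Standardizing first matters because K-Means relies on Euclidean distance: without it, list size (in the thousands) would dominate the percentage-scale indicator.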