{"id":18518944,"url":"https://github.com/chrispsang/healthcare-dataanalysis","last_synced_at":"2026-04-29T01:31:45.419Z","repository":{"id":250606401,"uuid":"834865897","full_name":"chrispsang/HealthCare-DataAnalysis","owner":"chrispsang","description":"Analyze synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes using machine learning models. Includes data exploration, preprocessing, model building, and visualizations.","archived":false,"fork":false,"pushed_at":"2024-08-08T20:59:17.000Z","size":195,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-14T16:16:57.586Z","etag":null,"topics":["data-analysis","data-science","data-visualization","healthcare","jupyter-notebook","machine-learning","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chrispsang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-07-28T15:46:11.000Z","updated_at":"2024-08-08T20:59:20.000Z","dependencies_parsed_at":"2024-08-08T23:00:47.026Z","dependency_job_id":"a3f9a73f-581d-47ea-a0dd-41e881ed739a","html_url":"https://github.com/chrispsang/HealthCare-DataAnalysis","commit_stats":null,"previous_names":["chrispsang/healthcare-dataanalysis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/chrispsang/HealthCare-DataAnalysis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrispsang%2FHealthCare-DataAnalysis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrispsang%2FHealthCare-DataAnalysis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrispsang%2FHealthCare-DataAnalysis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrispsang%2FHealthCare-DataAnalysis/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chrispsang","download_url":"https://codeload.github.com/chrispsang/HealthCare-DataAnalysis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chrispsang%2FHealthCare-DataAnalysis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32407164,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T19:38:08.556Z","status":"ssl_error","status_checked_at":"2026-04-28T19:37:55.688Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-science","data-visualization","healthcare","jupyter-notebook","machine-learning","python"],"created_at":"2024-11-06T17:14:54.047Z","updated_at":"2026-04-29T01:31:45.405Z","avatar_url":"https://github.com/chrispsang.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003eHealthcare Data Analysis Project\u003c/h1\u003e\n\nThis project involves analyzing synthetic patient data to identify trends, improve healthcare delivery, and predict patient outcomes. The project utilizes various machine learning models to derive insights from the data. By leveraging these models, we aim to uncover patterns and relationships within the dataset that can inform better healthcare practices and decision-making.\n\n## Table of Contents\n- [Key Objectives](#key-objectives)\n- [Data Description](#data-description)\n- [Visualizations](#visualizations)\n- [Installation](#installation)\n- [Results](#results)\n\n## Key Objectives\n- **Data Exploration:** Understanding the structure and characteristics of the dataset.\n- **Data Preprocessing:** Cleaning and preparing the data for analysis.\n- **Model Building:** Developing predictive models to forecast patient outcomes.\n- **Insights \u0026 Visualizations:** Deriving actionable insights and presenting them through visualizations.\n\n## Data Description\nThe dataset consists of synthetic data for 1000 patients with the following features:\n*   `Age`: Age of the patient in years.\n*   `Gender`: Gender of the patient (Male/Female).\n*   `Height`: Height of the patient in centimeters.\n*   `Weight`: Weight of the patient in kilograms.\n*   `Diagnosis`: Medical diagnosis (Diabetes, Hypertension, Heart Disease, Healthy).\n*   `Outcome`: Binary outcome (0: No event, 1: Event occurred).\n\n## Visualizations\n![Feature Importances from Random Forest](feature_importance.png)\n\n*Feature Importances from Random Forest*\n\nThis bar chart displays the importance of various features in predicting patient outcomes using a Random Forest model. Features like Age, BMI, and Weight have higher importance scores, indicating they are more influential in predicting the outcomes compared to other features like Gender and Diagnosis.\n\n![Correlation Heatmap](correlation_heatmap.png)\n\n*Correlation Heatmap*\n\nThis heatmap shows the correlation coefficients between different features in the dataset. The color intensity represents the strength of the correlation, with red indicating a positive correlation and blue indicating a negative correlation. For example, the strong positive correlation between Weight and BMI (0.81) indicates that as weight increases, BMI also increases.\n\n## Installation \n\n### Prerequisites\nEnsure you have the following installed on your system:\n- [Python 3.x](https://www.python.org/downloads/)\n- [Jupyter Notebook](https://jupyter.org/install)\n\n### Step 1: Clone the Repository\nClone the repository to your local machine using the following command:\n```sh\ngit clone https://github.com/chrispsang/HealthCare-DataAnalysis.git\ncd HealthCare-DataAnalysis\n```\n\n### Step 2: Set Up the Environment\n\nIt is recommended to use a virtual environment to manage dependencies. You can set up a virtual environment using `venv` or `conda`.\n\n### Using `venv`\n1. Create a virtual environment:\n    ```sh\n    python -m venv venv\n    ```\n2. Activate the virtual environment:\n    - On macOS/Linux:\n        ```sh\n        source venv/bin/activate\n        ```\n    - On Windows:\n        ```sh\n        venv\\Scripts\\activate\n        ```\n\n### Using `conda`\n1. Create a conda environment:\n    ```sh\n    conda create --name healthcare_analysis python=3.x\n    ```\n2. Activate the conda environment:\n    ```sh\n    conda activate healthcare_analysis\n    ```\n\n### Step 3: Install Dependencies\n\nInstall the required packages using the `requirements.txt` file:\n```sh\npip install -r requirements.txt\n```\n### Step 4: Run the Jupyter Notebook\nLaunch Jupyter Notebook and open the `healthcare.ipynb` file:\n```sh\njupyter notebook\n```\n\n## Results\n\nThe project results include:\n*   **Exploratory Data Analysis (EDA):** Summary statistics and visualizations to understand the dataset.\n*   **Model Performance:** Evaluation metrics for various machine learning models.\n*   **Insights:** Key findings from the analysis, including potential factors influencing patient outcomes.\n-------\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrispsang%2Fhealthcare-dataanalysis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchrispsang%2Fhealthcare-dataanalysis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchrispsang%2Fhealthcare-dataanalysis/lists"}