{"id":23334067,"url":"https://github.com/dizzydroid/asu_seniorproject_ml","last_synced_at":"2025-04-07T12:13:56.261Z","repository":{"id":268998102,"uuid":"900271464","full_name":"dizzydroid/ASU_SeniorProject_ML","owner":"dizzydroid","description":"An ASU | CSE375: Machine Learning project  —  COVID-19 Outcome Prediction using different ML models, and finding the optimal model for this classification task.","archived":false,"fork":false,"pushed_at":"2025-01-05T11:46:37.000Z","size":1387,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T14:46:27.091Z","etag":null,"topics":["classification","decision-trees","knn-classification","logistic-regression","machine-learning","machine-learning-algorithms","naive-bayes-classifier","svm"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dizzydroid.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-08T10:52:18.000Z","updated_at":"2025-01-05T11:46:40.000Z","dependencies_parsed_at":"2024-12-20T08:28:00.113Z","dependency_job_id":"395f994c-ce7a-41e0-b676-e0ac45de074c","html_url":"https://github.com/dizzydroid/ASU_SeniorProject_ML","commit_stats":null,"previous_names":["dizzydroid/asu_seniorproject_ml"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dizzydroid%2FASU_SeniorProject_ML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dizzyd
roid%2FASU_SeniorProject_ML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dizzydroid%2FASU_SeniorProject_ML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dizzydroid%2FASU_SeniorProject_ML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dizzydroid","download_url":"https://codeload.github.com/dizzydroid/ASU_SeniorProject_ML/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247648975,"owners_count":20972945,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","decision-trees","knn-classification","logistic-regression","machine-learning","machine-learning-algorithms","naive-bayes-classifier","svm"],"created_at":"2024-12-21T00:37:35.143Z","updated_at":"2025-04-07T12:13:56.234Z","avatar_url":"https://github.com/dizzydroid.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# COVID-19 Outcome Prediction\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/asu_ml.png\"\u003e\n\u003c/div\u003e\n\n## Problem Overview\n\nThe goal of this project is to predict the outcome (recovery or death) for individuals infected with COVID-19. The prediction is based on a set of pre-defined symptoms and demographic factors, using time-series data provided by the World Health Organization (WHO). \n\nGiven the ongoing global pandemic, early detection of the likely outcome can help healthcare professionals prioritize resources and patient care. 
\n\nThe dataset includes information from January 22, 2020, and provides features such as:\n- **Country \u0026 Location**\n- **Age Group**\n- **Gender**\n- **Symptoms**\n- **History of Visit to Wuhan**\n\nYou are tasked with developing classifiers that can accurately predict the outcome for new hospital admissions.\n\n---\n\n## Models \u0026 Approach\n\n### 1. **K-Nearest Neighbors (KNN)**\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/knn.png\"\u003e\n\u003c/div\u003e  \n\n[K-Nearest Neighbors (KNN)](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm):  \nKNN is a simple, instance-based learning algorithm. It predicts the class of a new sample based on the majority class among its nearest neighbors in the feature space.\n\n### 2. **Logistic Regression**\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/logreg.png\"\u003e\n\u003c/div\u003e  \n\n[Logistic Regression](https://en.wikipedia.org/wiki/Logistic_regression):  \nLogistic regression is a statistical model that predicts the probability of a binary outcome, using a linear combination of input features.\n\n### 3. **Naïve Bayes**\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/bayes.png\"\u003e\n\u003c/div\u003e  \n\n[Naïve Bayes](https://en.wikipedia.org/wiki/Naive_Bayes_classifier):  \nA probabilistic classifier based on applying Bayes' theorem, assuming independence between features. It's particularly effective for text classification but can be applied to other types of data as well.\n\n### 4. **Decision Trees**\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/decisiontrees.png\"\u003e\n\u003c/div\u003e  \n\n[Decision Trees](https://en.wikipedia.org/wiki/Decision_tree_learning):  \nA decision tree is a flowchart-like tree structure used for classification. It splits the data based on feature values to make predictions. It's interpretable and simple to understand.\n\n### 5. 
**Support Vector Machines (SVM)**\n\u003cdiv id=\"header\" align=\"center\"\u003e\n \u003cimg src=\"assets/svm.png\"\u003e\n\u003c/div\u003e  \n\n[Support Vector Machines](https://en.wikipedia.org/wiki/Support_vector_machine):  \nSVM is a powerful classifier that works by finding the hyperplane that best separates data into different classes. It is effective in high-dimensional spaces.\n\n---\n\n## Project Execution\n\nThe project consists of several phases:\n1. **Data Preprocessing:** The dataset has already been cleaned and preprocessed.\n2. **Model Training:** The data is split into training, validation, and test sets. Each model is trained and evaluated.\n3. **Hyperparameter Tuning:** For each model, we tune the hyperparameters to maximize performance.\n4. **Model Comparison:** We compare the models based on precision, recall, F1-score, and ROC curves with AUC.\n\n### Key Metrics\n- **Precision:** The fraction of predicted positives that are actually positive.\n- **Recall:** The fraction of actual positives that are correctly predicted.\n- **F1-Score:** The harmonic mean of precision and recall.\n- **ROC/AUC:** The area under the ROC curve, which measures the model’s ability to distinguish between classes across decision thresholds.\n\n---\n\n## Results\n\nAt the end of the project, we will have a performance comparison across all models, helping to identify the best-performing classifier for COVID-19 outcome prediction. The final model is chosen based on the highest combined performance across the metrics above.\n\n---\n\n## Acknowledgments\n- World Health Organization (WHO) for the dataset.\n- Contributors to the various machine learning algorithms and techniques.\n\n---","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdizzydroid%2Fasu_seniorproject_ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdizzydroid%2Fasu_seniorproject_ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdizzydroid%2Fasu_seniorproject_ml/lists"}