Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aegis301/fetal_health
- Host: GitHub
- URL: https://github.com/aegis301/fetal_health
- Owner: aegis301
- Created: 2022-02-19T19:12:48.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2022-06-23T14:26:28.000Z (over 2 years ago)
- Last Synced: 2024-11-06T19:56:00.320Z (3 months ago)
- Language: Jupyter Notebook
- Size: 19.9 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Info
This project is a simple implementation of machine learning on the Fetal Health Classification dataset, which can be found on Kaggle (https://www.kaggle.com/datasets/andrewmvd/fetal-health-classification).
In this project, I use machine learning to predict fetal health from features derived from cardiotocographic (CTG) data.
## How to Use
To run the notebooks, download the dataset from the Kaggle link above and place it inside the `data/` folder, then load it as sketched below.
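
A minimal loading sketch, assuming the CSV keeps its default Kaggle name `fetal_health.csv` and has a `fetal_health` target column (1 = normal, 2 = suspect, 3 = pathological); adjust the path and file name if yours differ:

```python
import pandas as pd

# Assumes the Kaggle CSV was placed at data/fetal_health.csv (hypothetical name).
df = pd.read_csv("data/fetal_health.csv")

print(df.shape)                            # rows of CTG measurements x feature columns
print(df["fetal_health"].value_counts())   # class balance: 1=normal, 2=suspect, 3=pathological
```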
## How to Read
If you're interested in my work, go through the `Fetal_Health.ipynb` notebook. It contains a condensed version of my work on data exploration, feature selection, feature engineering, and model training, as well as some work on data clustering. However, extensive clustering and deep learning are not (yet) implemented in this project. The other notebooks document my reasoning in more detail and are meant to complete the documentation; they contain no additional information relevant to the reader and can be skipped if no details are required.
## TL;DR
Using machine learning, I was able to predict fetal health from CTG values, achieving over 90% specificity and sensitivity for the `normal` and `pathological` categories; `suspect` was harder to distinguish, as expected. Rigorous feature selection condensed the feature space down to only twelve input features. While linear models performed well, especially support vector machines, Random Forests were the most powerful prediction algorithms in this project.
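
To illustrate the kind of evaluation summarized above, here is a minimal sketch that trains a Random Forest and reports per-class sensitivity and specificity from the confusion matrix. It is not the repository's actual code: the file path, target column name, and hyperparameters are assumptions, and the notebooks' own feature selection and engineering steps are omitted here.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Assumed path and target column, matching the Kaggle dataset's defaults.
df = pd.read_csv("data/fetal_health.csv")
X, y = df.drop(columns="fetal_health"), df["fetal_health"]

# Stratified split to preserve the class imbalance in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42
)

clf = RandomForestClassifier(n_estimators=300, random_state=42)
clf.fit(X_train, y_train)

# Rows/columns of the confusion matrix follow sorted label order: 1, 2, 3.
cm = confusion_matrix(y_test, clf.predict(X_test))
for i, label in enumerate(["normal", "suspect", "pathological"]):
    tp = cm[i, i]
    fn = cm[i].sum() - tp          # true class i, predicted otherwise
    fp = cm[:, i].sum() - tp       # predicted class i, actually otherwise
    tn = cm.sum() - tp - fn - fp
    print(f"{label}: sensitivity={tp / (tp + fn):.2f}, "
          f"specificity={tn / (tn + fp):.2f}")
```

Computing specificity per class in a one-vs-rest fashion like this is what makes the `suspect` class's weaker separability visible, since plain accuracy would be dominated by the majority `normal` class.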