https://github.com/gthomas08/data-mining-and-machine-learning-project
Machine Learning Project
jupyter-notebook machine-learning neural-network python random-forest
- Host: GitHub
- URL: https://github.com/gthomas08/data-mining-and-machine-learning-project
- Owner: gthomas08
- Created: 2021-04-26T08:42:35.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2021-09-02T11:49:36.000Z (almost 4 years ago)
- Last Synced: 2025-01-21T03:41:44.458Z (5 months ago)
- Topics: jupyter-notebook, machine-learning, neural-network, python, random-forest
- Language: Jupyter Notebook
- Homepage:
- Size: 3.02 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data Mining and Machine Learning Project
## About The Project
This project was completed as part of the curriculum of the Computer Engineering and Informatics Department (CEID) at the University of Patras.
## Exercises
1. Stroke Dataset - A dataset that contains information about patients, including whether each patient had a stroke.
   1. Analyze the dataset and visualize the results.
   2. Handle missing values using the following methods:
      1. Remove the columns that contain missing values.
      2. Fill the missing values with the mean value of the respective column (where possible).
      3. Fill the missing values using Linear Regression (where possible).
      4. Fill the missing values by implementing k-Nearest Neighbors (where possible).
      5. Fill the missing values by combining the Linear Regression and k-Nearest Neighbors methods.
   3. Predict whether a patient is prone to having a stroke using a Random Forest.
2. Spam Dataset - A dataset that contains emails labeled as spam or not spam.
   1. Convert the emails to vectors using the Word Embeddings method.
   2. Predict whether an email is spam by implementing a Neural Network.

## Technologies
- Jupyter Notebook
- Python
- matplotlib
- tensorflow
- imblearn
- seaborn
- sklearn
- pandas
- numpy
- keras
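
As an illustration of two of the missing-value methods listed under the Stroke Dataset exercise (mean filling and k-Nearest Neighbors filling), the following is a minimal sketch, not the code from the notebooks. It uses only the Python standard library so that it runs on its own. The function names `fill_with_mean` and `fill_with_knn`, the toy records, and the column names `age`, `avg_glucose_level`, and `bmi` are assumptions made for the example.

```python
import math
from statistics import mean


def fill_with_mean(rows, col):
    """Return copies of rows with None in `col` replaced by the column mean."""
    observed = [r[col] for r in rows if r[col] is not None]
    col_mean = mean(observed)
    return [{**r, col: col_mean if r[col] is None else r[col]} for r in rows]


def fill_with_knn(rows, col, features, k=3):
    """Return copies of rows with None in `col` replaced by the average of
    `col` over the k nearest complete rows (Euclidean distance on `features`)."""
    complete = [r for r in rows if r[col] is not None]
    filled = []
    for r in rows:
        if r[col] is not None:
            filled.append(dict(r))
            continue
        point = [r[f] for f in features]
        # Sort the complete rows by distance to the incomplete row.
        nearest = sorted(
            complete,
            key=lambda c: math.dist(point, [c[f] for f in features]),
        )[:k]
        filled.append({**r, col: mean(n[col] for n in nearest)})
    return filled


# Hypothetical toy records; values and column names are illustrative only.
patients = [
    {"age": 40, "avg_glucose_level": 90.0, "bmi": 24.0},
    {"age": 42, "avg_glucose_level": 95.0, "bmi": 26.0},
    {"age": 70, "avg_glucose_level": 200.0, "bmi": 35.0},
    {"age": 41, "avg_glucose_level": 92.0, "bmi": None},
]

mean_filled = fill_with_mean(patients, "bmi")
knn_filled = fill_with_knn(patients, "bmi", ["age", "avg_glucose_level"], k=2)

print(mean_filled[3]["bmi"])
print(knn_filled[3]["bmi"])
```

In this toy data the mean fill is pulled upward by the one high `bmi` value, while the k-Nearest Neighbors fill uses only the two most similar patients. On real data the feature columns would normally be scaled before computing distances, since otherwise the column with the largest numeric range dominates the neighbor search.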