https://github.com/nrhartnett/titanicmachinelearning
This project, part of my Master's in Business Analytics at The State University of New York at Buffalo, utilizes Python, SQL, and machine learning algorithms to analyze the RMS Titanic disaster, revealing insights into survival rates by gender, age groups, and socio-economic class.
data-processing jupyter-notebook normalization python sql sqlite3
- Host: GitHub
- URL: https://github.com/nrhartnett/titanicmachinelearning
- Owner: nrhartnett
- Created: 2024-01-02T15:13:44.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-02T15:33:39.000Z (about 1 year ago)
- Last Synced: 2024-11-28T18:12:42.956Z (3 months ago)
- Topics: data-processing, jupyter-notebook, normalization, python, sql, sqlite3
- Language: Jupyter Notebook
- Homepage:
- Size: 939 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Titanic-Machine-Learning
This group project was completed for EAS-503: Python for Data Scientists, a course I took during my Master of Science in Business Analytics at The State University of New York at Buffalo. It combines Python, SQL, SQLite3, database normalization, data parsing, and machine learning to analyze the RMS Titanic disaster.

The study parses and normalizes raw data from CSV files using Python, then loads the processed data into a normalized SQLite3 database table. Pandas is used for data exploration and visualization, and the analysis reveals survival rates by gender, age group, and socio-economic class.

In the independent project segment, machine learning models are developed and trained on these features: ticket class, number of parents/children aboard, number of siblings/spouses aboard, port of embarkation, sex, and age. The models include Logistic Regression, SVM, Decision Trees, a KNN Classifier, and Random Forest. Comparing them gives a more nuanced picture of the factors that influenced passenger survival.

The full project may be viewed in the Jupyter Notebook file "Project Notebook.ipynb".
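As a rough illustration of the parse, normalize, and load steps described above, the sketch below reads CSV text, stores repeated values (sex, port of embarkation) in lookup tables, and computes a survival rate per sex with SQL. It is not taken from the project notebook: the table names, the helper functions `build_database` and `survival_rate_by_sex`, and the sample rows are all made up for illustration. Only the column names follow the header of the Kaggle `train.csv` file.

```python
import csv
import io
import sqlite3

# Made-up rows for illustration only (NOT real passenger records).
# Column names follow the Kaggle train.csv header.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Embarked
1,0,3,male,22,S
2,1,1,female,38,C
3,1,3,female,26,S
4,1,1,male,54,S
5,0,3,male,,Q
6,1,2,female,14,C
7,0,2,female,30,S
"""


def build_database(csv_text):
    """Load CSV text into a small normalized in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sex  (sex_id  INTEGER PRIMARY KEY, label TEXT UNIQUE NOT NULL);
        CREATE TABLE port (port_id INTEGER PRIMARY KEY, code  TEXT UNIQUE NOT NULL);
        CREATE TABLE passenger (
            passenger_id INTEGER PRIMARY KEY,
            survived     INTEGER NOT NULL,
            pclass       INTEGER NOT NULL,
            sex_id       INTEGER NOT NULL REFERENCES sex(sex_id),
            age          REAL,
            port_id      INTEGER REFERENCES port(port_id)
        );
    """)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Each distinct category value is stored once in its lookup table.
        conn.execute("INSERT OR IGNORE INTO sex (label) VALUES (?)", (row["Sex"],))
        conn.execute("INSERT OR IGNORE INTO port (code) VALUES (?)", (row["Embarked"],))
        age = float(row["Age"]) if row["Age"] else None  # a blank age becomes NULL
        conn.execute(
            """INSERT INTO passenger
               SELECT ?, ?, ?, s.sex_id, ?, p.port_id
               FROM sex s, port p
               WHERE s.label = ? AND p.code = ?""",
            (int(row["PassengerId"]), int(row["Survived"]), int(row["Pclass"]),
             age, row["Sex"], row["Embarked"]),
        )
    conn.commit()
    return conn


def survival_rate_by_sex(conn):
    """Return {sex label: fraction survived}, joining through the lookup table."""
    rows = conn.execute("""
        SELECT s.label, AVG(p.survived)
        FROM passenger p JOIN sex s ON s.sex_id = p.sex_id
        GROUP BY s.label
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_database(SAMPLE_CSV)
    print(survival_rate_by_sex(conn))
```

The same `GROUP BY` pattern extends to ticket class or an age bucket. For simplicity the sketch assumes every row has a non-empty `Embarked` value, so a real load would need to handle blanks in that column as well.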
The data utilized in this project may be found at: https://www.kaggle.com/competitions/titanic.
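To give a feel for one of the classifiers named in the README, here is a minimal from-scratch sketch of k-nearest-neighbors voting. It is not the notebook's implementation, which may well rely on a library. The function `knn_predict`, the feature encoding (ticket class, 1 for female and 0 for male, age in decades), and the toy training rows are assumptions made purely for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    # Pair each Euclidean distance with its label, then sort nearest-first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up examples: (ticket class, is_female, age in decades).
train_X = [(3, 0, 2.2), (1, 1, 3.8), (3, 1, 2.6), (1, 0, 5.4), (2, 1, 1.4), (3, 0, 3.5)]
train_y = [0, 1, 1, 0, 1, 0]  # 1 = survived, 0 = did not survive

if __name__ == "__main__":
    print(knn_predict(train_X, train_y, (2, 1, 3.0)))
    print(knn_predict(train_X, train_y, (3, 0, 3.0)))
```

Because the vote depends on raw distances, feature scaling matters. Age is expressed in decades here so that it does not dominate the other two features.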