Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/mariamagro/titanic_analysis

This repository analyzes the Titanic dataset using the R programming language. The primary objective is to identify the factors that influenced passenger survival during the Titanic disaster. The analysis uses decision trees and random forests to model the patterns in the dataset.

Host: GitHub
URL: https://github.com/mariamagro/titanic_analysis
Owner: mariamagro
Created: 2023-11-13T11:56:17.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-08-18T17:32:31.000Z (5 months ago)
Last Synced: 2024-08-18T18:48:51.243Z (5 months ago)
Language: R
Size: 788 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Titanic Survival Prediction Project

## Overview

This repository contains the code and data for a project that predicts passenger survival on the Titanic with machine learning models. The project was developed in collaboration with [MarinaGRey](https://github.com/MarinaGRey). It focuses on preprocessing the Titanic dataset and evaluating several machine learning techniques for predicting survival outcomes.

## Files

- **BestModel.RData**: Contains the final trained model and associated functions.
- **Report.pdf**: A comprehensive report detailing the project, including data preprocessing, model evaluation, and conclusions.
- **code.R**: The R script used for data preprocessing, model training, and evaluation.
- **titanic_train.RData**: The dataset used for training and evaluating the models.

## Description of `code.R`

### Project Metadata and Library Loading

### Data Preprocessing

### Model Training and Evaluation

The script includes code for:
- Splitting the data into training and test sets.
- Training various machine learning models such as Logistic Regression, Decision Trees, and Random Forests.
- Evaluating model performance using metrics like accuracy, precision, recall, and F1-score.
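
The steps listed above can be illustrated with a minimal sketch. It is not the code from `code.R`: the column names inside `titanic_train.RData` are not documented here, so the sketch uses R's built-in `Titanic` table as a stand-in. The 70/30 split, the 0.5 probability threshold, and the helper `classification_metrics()` are all assumptions made for illustration.

```r
library(rpart)  # decision trees; a recommended package shipped with standard R

# Stand-in data (assumption): expand the built-in Titanic contingency table
# into one row per person, with columns Class, Sex, Age, and Survived.
counts <- as.data.frame(Titanic)
passengers <- counts[rep(seq_len(nrow(counts)), counts$Freq),
                     c("Class", "Sex", "Age", "Survived")]
rownames(passengers) <- NULL

# Hold out 30% of the rows as a test set (the ratio is an assumption).
set.seed(42)
test_idx <- sample(nrow(passengers), size = round(0.3 * nrow(passengers)))
train <- passengers[-test_idx, ]
test  <- passengers[test_idx, ]

# Logistic regression: predict "Yes" when the estimated probability exceeds 0.5.
logit_fit  <- glm(Survived ~ Class + Sex + Age, data = train, family = binomial)
logit_prob <- predict(logit_fit, newdata = test, type = "response")
logit_pred <- factor(ifelse(logit_prob > 0.5, "Yes", "No"), levels = c("No", "Yes"))

# Decision tree classifier.
tree_fit  <- rpart(Survived ~ Class + Sex + Age, data = train, method = "class")
tree_pred <- predict(tree_fit, newdata = test, type = "class")

# Hypothetical helper: accuracy, precision, recall, and F1 for a binary outcome.
classification_metrics <- function(actual, predicted, positive = "Yes") {
  tp <- sum(predicted == positive & actual == positive)
  fp <- sum(predicted == positive & actual != positive)
  fn <- sum(predicted != positive & actual == positive)
  precision <- if (tp + fp > 0) tp / (tp + fp) else NA_real_
  recall    <- if (tp + fn > 0) tp / (tp + fn) else NA_real_
  f1 <- if (is.na(precision) || is.na(recall) || precision + recall == 0) {
    NA_real_
  } else {
    2 * precision * recall / (precision + recall)
  }
  c(accuracy = mean(predicted == actual),
    precision = precision, recall = recall, f1 = f1)
}

# One row of metrics per model, evaluated on the held-out test set.
results <- rbind(
  logistic = classification_metrics(test$Survived, logit_pred),
  tree     = classification_metrics(test$Survived, tree_pred)
)
print(round(results, 3))
```

The random forest is left out of the sketch because it requires an add-on package that is not bundled with R. With the `randomForest` package installed, a call such as `randomForest::randomForest(Survived ~ ., data = train)` could be evaluated with the same helper.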

### Results

The script saves the best-performing model and its evaluation metrics to `BestModel.RData`.
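
As a rough sketch of how a model and its metrics can be written to and read back from an `.RData` file, the example below uses base R's `save()` and `load()`. The object names `best_model` and `best_metrics` are hypothetical (the names actually stored in `BestModel.RData` are not documented here), the model is a stand-in fitted on R's built-in `Titanic` table, and the file is written to a temporary directory so nothing in the repository is overwritten.

```r
# Stand-in model (assumption): logistic regression on the built-in Titanic
# table, using the cell counts as frequency weights.
counts <- as.data.frame(Titanic)
best_model <- glm(Survived ~ Class + Sex + Age, data = counts,
                  weights = Freq, family = binomial)

# Placeholder for evaluation metrics; real values would come from a test set.
best_metrics <- c(accuracy = NA_real_, precision = NA_real_,
                  recall = NA_real_, f1 = NA_real_)

# Save both objects into a single .RData file in a temporary directory.
model_file <- file.path(tempdir(), "BestModel.RData")
save(best_model, best_metrics, file = model_file)

# Load into a separate environment so existing objects are not overwritten.
# load() returns the names of the objects it restored.
restored <- new.env()
loaded_names <- load(model_file, envir = restored)
print(loaded_names)
```

Loading into a dedicated environment also makes it easy to inspect what a file such as `BestModel.RData` contains before using any of its objects.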

## Report

`Report.pdf` provides an in-depth analysis of the project, including:
- Detailed explanations of data preprocessing steps.
- Evaluation of different machine learning models.
- Final conclusions and recommendations based on model performance.