https://github.com/adithivs/prodigy_ds_02


data-science eda logistic-regression python


# PRODIGY_DS_02
## Introduction
This project's primary goal is to perform exploratory data analysis (EDA) on the Titanic dataset and derive meaningful insights from it. The work covers the necessary preprocessing (handling missing values and identifying outliers), examines the correlation between variables, visualizes the data, and predicts survival for the passengers in the test set.
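
The preprocessing steps named above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes pandas, median/mode imputation, and the 1.5 × IQR rule for outliers, none of which the README confirms. The rows in `toy` are made up for the example and are not taken from the real dataset; `fill_missing` and `iqr_outliers` are hypothetical helper names.

```python
import pandas as pd

# Toy rows invented for illustration only; NOT real Titanic records.
toy = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 3, 1, 2],
    "Age": [22.0, 38.0, None, 35.0, 54.0, None],
    "Fare": [7.25, 71.28, 7.92, 8.05, 512.33, 13.0],
    "Embarked": ["S", "C", "S", None, "S", "Q"],
})


def fill_missing(df):
    """Return a copy with Age filled by its median and Embarked by its mode.

    Median/mode imputation is an assumption of this sketch.
    """
    out = df.copy()
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    return out


def iqr_outliers(series):
    """Boolean mask of values lying more than 1.5 * IQR outside the quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)


clean = fill_missing(toy)                  # handle missing values
fare_outliers = iqr_outliers(clean["Fare"])  # flag unusually high or low fares
corr = clean.corr(numeric_only=True)       # pairwise correlation of numeric columns
```

On the real data, the same helpers would be applied to the frame loaded from the train set file, and `corr` could be passed to a heatmap for the visualization step.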

## About the Dataset

The Titanic dataset, divided into a `Train Set` and a `Test Set`, contains detailed information about the passengers aboard the Titanic. It includes features that describe the passengers' demographics, socio-economic status, and other relevant information, as well as the outcome variable indicating whether each passenger survived or perished in the disaster.

### Features of the dataset

- `PassengerId`: A unique identifier for each passenger.
- `Survived`: Binary variable indicating survival (0 = No, 1 = Yes).
- `Pclass`: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
- `Name`: Full name of the passenger.
- `Sex`: Gender of the passenger (male/female).
- `Age`: Age of the passenger in years.
- `SibSp`: Number of siblings and spouses aboard the Titanic.
- `Parch`: Number of parents and children aboard the Titanic.
- `Ticket`: Ticket number.
- `Fare`: Amount of money the passenger paid for the ticket.
- `Cabin`: Cabin number.
- `Embarked`: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
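
As a quick sanity check before any analysis, the feature list above can be turned into a small schema check. This is only a sketch assuming pandas. The column names come from the list, while the `EXPECTED_COLUMNS` constant and the `missing_columns` helper are hypothetical names introduced here. It also assumes that the test set omits `Survived`, since that is the value being predicted.

```python
import pandas as pd

# Column names taken from the feature list; the constant name is hypothetical.
EXPECTED_COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]


def missing_columns(df, has_target=True):
    """List expected columns that are absent from df.

    Pass has_target=False for the test set, which is assumed here
    not to carry the Survived column.
    """
    expected = [c for c in EXPECTED_COLUMNS if has_target or c != "Survived"]
    return [c for c in expected if c not in df.columns]


# Example with an empty frame that has every column except Cabin and Survived.
partial = pd.DataFrame(
    columns=[c for c in EXPECTED_COLUMNS if c not in ("Cabin", "Survived")]
)
absent_as_train = missing_columns(partial)
absent_as_test = missing_columns(partial, has_target=False)
```

Calling `df.isna().sum()` on a frame that passes this check gives a per-column count of missing values, which is a natural starting point for the preprocessing step.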

## Conclusion
Thorough data cleaning and exploratory data analysis of the Titanic dataset yielded important insights into the factors that influenced survival in the disaster. The analysis explored the relationships between survival and variables such as gender, passenger class, age, and fare.
Using logistic regression, we predicted the survival of the passengers in the test set. The study improves our understanding of past events and serves as an example of how data science techniques can be applied to draw meaningful conclusions from large, complex datasets.
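
The logistic regression step could look roughly like the sketch below. It assumes scikit-learn's `LogisticRegression` and a small, arbitrarily chosen feature subset (`Pclass`, `Sex`, `Age`, `Fare`); the README does not state which features or settings were actually used. The `prepare` helper is hypothetical, and the rows are made up for illustration rather than taken from the real dataset.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Assumed feature subset; the project may have used different columns.
FEATURES = ["Pclass", "Sex", "Age", "Fare"]


def prepare(df, age_fill):
    """Select the features, encode Sex as 0/1, and fill missing ages."""
    X = df[FEATURES].copy()
    X["Sex"] = X["Sex"].map({"male": 0, "female": 1})
    X["Age"] = X["Age"].fillna(age_fill)
    return X


# Toy rows invented for illustration only; NOT real Titanic records.
train = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass": [3, 1, 2, 3, 1, 3, 2, 1],
    "Sex": ["male", "female", "female", "male",
            "female", "male", "male", "female"],
    "Age": [22.0, 38.0, 26.0, None, 35.0, 40.0, 54.0, 19.0],
    "Fare": [7.25, 71.28, 13.0, 8.05, 53.1, 7.9, 26.0, 80.0],
})
test = pd.DataFrame({
    "Pclass": [3, 1, 2],
    "Sex": ["male", "female", "male"],
    "Age": [30.0, None, 45.0],
    "Fare": [7.75, 60.0, 21.0],
})

# Reuse the training median for the test set so both are filled consistently.
age_median = train["Age"].median()

model = LogisticRegression(max_iter=1000)
model.fit(prepare(train, age_median), train["Survived"])

# One 0/1 survival prediction per test passenger.
predictions = model.predict(prepare(test, age_median))
```

Because the test set has no `Survived` column, accuracy would normally be estimated by holding out part of the train set, for example with `sklearn.model_selection.train_test_split`.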

## Contact Information
- Adithi Vellengara (LinkedIn)
- Email 📧: adithivs06@gmail.com