https://github.com/adithivs/prodigy_ds_02
data-science eda logistic-regression python
- Host: GitHub
- URL: https://github.com/adithivs/prodigy_ds_02
- Owner: AdithiVS
- License: bsd-2-clause
- Created: 2024-06-15T15:38:09.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-16T13:38:26.000Z (almost 2 years ago)
- Last Synced: 2025-01-21T20:14:57.115Z (over 1 year ago)
- Topics: data-science, eda, logistic-regression, python
- Language: Jupyter Notebook
- Homepage:
- Size: 214 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# PRODIGY_DS_02
## Introduction
The primary goal of this project is to perform exploratory data analysis (EDA) on the Titanic dataset and derive meaningful insights from it. The notebook covers the necessary preprocessing, such as handling missing values, identifying outliers, and examining correlations between variables, then visualizes the data and predicts survival for the passengers in the test set.
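The preprocessing steps named above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up, and the choice of median/mode imputation and the 1.5 × IQR outlier rule are assumptions.

```python
import pandas as pd

# Made-up rows that only mimic a few Titanic columns; not real passenger data.
df = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0, 2.0, None, 27.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 21.08, 8.05, 512.33],
    "Embarked": ["S", "C", "S", None, "S", "S", "Q", "C"],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1],
})

# 1. Missing values: count them, then impute
#    (median for the numeric column, most frequent value for the categorical one).
missing_before = df.isna().sum()
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# 2. Outliers: flag fares lying more than 1.5 * IQR beyond the quartiles.
q1, q3 = df["Fare"].quantile([0.25, 0.75])
iqr = q3 - q1
fare_outlier = (df["Fare"] < q1 - 1.5 * iqr) | (df["Fare"] > q3 + 1.5 * iqr)

# 3. Correlation between the numeric columns.
corr = df[["Age", "Fare", "Survived"]].corr()

print(missing_before)
print(df[fare_outlier])
print(corr.round(2))
```

Whether flagged outliers should be dropped, capped, or kept depends on the analysis; the sketch only identifies them.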
## About the Dataset
The Titanic dataset, divided into a `Train Set` and a `Test Set`, contains detailed information about the passengers aboard the Titanic. It includes features that describe the passengers' demographics, socio-economic status, and other relevant information, as well as the outcome variable indicating whether each passenger survived or perished in the disaster.
### Features of the dataset
- PassengerId: A unique identifier for each passenger.
- Survived: Binary variable indicating survival (0 = No, 1 = Yes).
- Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
- Name: Full name of the passenger.
- Sex: Gender of the passenger (male/female).
- Age: Age of the passenger in years.
- SibSp: Number of siblings and spouses aboard the Titanic.
- Parch: Number of parents and children aboard the Titanic.
- Ticket: Ticket number.
- Fare: Amount of money the passenger paid for the ticket.
- Cabin: Cabin number.
- Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
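As a rough sketch of how a table with these features might be loaded and inspected with pandas: the rows below are invented placeholders in the same column layout, read from an in-memory string so the example runs on its own. A real run would pass the path of the train CSV to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Invented rows in the column layout listed above; not real passenger records.
csv_text = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Doe, Mr. John",male,22,1,0,A1,7.25,,S
2,1,1,"Roe, Mrs. Jane",female,38,1,0,B2,71.28,C85,C
3,1,3,"Poe, Miss. Ann",female,,0,0,C3,7.92,,S
4,0,2,"Low, Mr. Sam",male,35,0,0,D4,13.00,,Q
"""

train = pd.read_csv(io.StringIO(csv_text))

# Check column types and how many values are missing per feature.
print(train.dtypes)
print(train.isna().sum())

# Survival rate by Sex and by Pclass (mean of the 0/1 Survived column).
rate_by_sex = train.groupby("Sex")["Survived"].mean()
rate_by_class = train.groupby("Pclass")["Survived"].mean()
print(rate_by_sex)
print(rate_by_class)
```

Because `Survived` is coded 0/1, its group mean is the share of survivors in each group.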
## Conclusion
Thorough data cleaning and exploratory data analysis of the Titanic dataset yielded important insights into the factors that influenced survival in the disaster. We explored the relationships between survival and variables such as gender, passenger class, age, and fare. The study improves our understanding of a historical event and serves as an example of how data science techniques can be applied to draw conclusions from large, complex datasets.
Using logistic regression, we predicted the survival of the passengers in the test set.
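A minimal sketch of the logistic regression step using scikit-learn is shown below. The rows are made up, and the feature set (`Pclass`, `Sex`, `Age`, `Fare`), the 0/1 encoding of `Sex`, and the hypothetical helper `make_features` are assumptions for illustration; the notebook may use different features and settings.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up rows mimicking a few Titanic columns; not real passenger data.
train = pd.DataFrame({
    "Pclass": [3, 1, 3, 1, 3, 2, 3, 1, 2, 3],
    "Sex": ["male", "female", "female", "female", "male",
            "male", "male", "female", "female", "male"],
    "Age": [22.0, 38.0, 26.0, 35.0, 35.0, 54.0, 2.0, 27.0, 14.0, 20.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 8.05, 13.00, 21.08, 80.00, 30.07, 8.05],
    "Survived": [0, 1, 1, 1, 0, 0, 0, 1, 1, 0],
})
test = pd.DataFrame({
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
    "Age": [30.0, 29.0],
    "Fare": [7.90, 60.00],
})


def make_features(df):
    """Hypothetical helper: pick numeric features and encode Sex as 0/1."""
    X = df[["Pclass", "Age", "Fare"]].copy()
    X["Sex"] = (df["Sex"] == "female").astype(int)
    return X


# Fit on the train set, then predict the missing Survived column of the test set.
model = LogisticRegression(max_iter=1000)
model.fit(make_features(train), train["Survived"])
test["Survived"] = model.predict(make_features(test))

print(test)
```

The same feature preparation is applied to both sets so that the model sees identical columns at fit and predict time.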
## Contact Information
- Adithi Vellengara (LinkedIn)
- Email 📧: adithivs06@gmail.com