https://github.com/ahsankhizar5/titanic-eda-visualization
Exploratory Data Analysis and Visualization on the Titanic Dataset using Python, Pandas, Matplotlib, and Seaborn to uncover survival patterns.
data-analysis data-science data-visualization eda kaggle machine-learning matplotlib pandas python seaborn titanic-dataset
- Host: GitHub
- URL: https://github.com/ahsankhizar5/titanic-eda-visualization
- Owner: ahsankhizar5
- Created: 2025-05-30T10:40:08.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-30T10:48:23.000Z (about 1 year ago)
- Last Synced: 2025-05-30T13:57:54.351Z (about 1 year ago)
- Topics: data-analysis, data-science, data-visualization, eda, kaggle, machine-learning, matplotlib, pandas, python, seaborn, titanic-dataset
- Language: Python
- Homepage:
- Size: 188 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Titanic Dataset - EDA & Visualization
A complete **Exploratory Data Analysis (EDA)** on the Titanic dataset using Python. This project cleans the dataset, explores key insights, visualizes patterns, and summarizes findings to understand survival factors.
---
## Dataset
- Source: [Kaggle - Titanic Dataset](https://www.kaggle.com/c/titanic/data)
- Filename: `TiTanic_Dataset.csv`
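
As a minimal sketch of loading the data, assuming the CSV uses the standard Kaggle column names: the `load_titanic` helper and the three in-memory rows below are hypothetical and only illustrative, so the snippet runs without the real file.

```python
import io

import pandas as pd

# Filename taken from the Dataset section; adjust the path to where the CSV lives.
CSV_PATH = "TiTanic_Dataset.csv"


def load_titanic(source=CSV_PATH):
    """Read the Titanic CSV (a path or a file-like object) into a DataFrame."""
    return pd.read_csv(source)


# Synthetic rows (not real passengers) using the standard Kaggle column names.
sample_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n"
    "1,0,3,Passenger A,male,22,1,0,T-001,7.25,,S\n"
    "2,1,1,Passenger B,female,38,1,0,T-002,71.28,C85,C\n"
    "3,1,3,Passenger C,female,,0,0,T-003,7.92,,S\n"
)

df = load_titanic(sample_csv)
print(df.shape)          # rows x columns
print(df.isna().sum())   # missing values per column
```

With the real file in place, `load_titanic()` with no argument would read `TiTanic_Dataset.csv` from the working directory.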
---
## Objective
Perform EDA and generate visual insights to answer:
- Who was most likely to survive?
- Were there patterns in class, gender, age, or fare?
- What variables are correlated?
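
One way to approach the first two questions is to group by a feature and take the mean of the 0/1 `Survived` column, which gives the survival rate per group. This is a sketch, not necessarily how `eda_titanic.py` does it; the `survival_rate` helper and the sample rows are hypothetical.

```python
import pandas as pd

# Illustrative rows only, not the real dataset.
df = pd.DataFrame(
    {
        "Survived": [1, 0, 1, 0, 0, 1],
        "Sex": ["female", "male", "female", "male", "male", "female"],
        "Pclass": [1, 3, 2, 3, 1, 3],
    }
)


def survival_rate(frame, by):
    """Survival rate per group: the mean of the 0/1 Survived column."""
    return frame.groupby(by)["Survived"].mean()


by_sex = survival_rate(df, "Sex")
by_class = survival_rate(df, "Pclass")
print(by_sex)
print(by_class)
```

The same helper accepts a list of columns, for example `survival_rate(df, ["Pclass", "Sex"])`, to look at combined effects.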
---
## Features Explored
- Passenger Class (Pclass)
- Sex
- Age
- Fare
- Survival
- Siblings/Spouse & Parents/Children (SibSp, Parch)
- Embarked
---
## Visualizations
Saved in the `Graphs/` folder:
- Bar charts for categorical data (Sex, Pclass)
- Histograms for distributions (Age, Fare)
- Correlation heatmap
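
The three chart types listed above could be produced roughly as follows. This is a sketch under assumptions: the helper names, output filenames, and sample rows are hypothetical, and the actual script may build its figures differently.

```python
import os

import matplotlib

matplotlib.use("Agg")  # headless backend: save figures without opening a window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Illustrative rows only, not the real dataset.
df = pd.DataFrame(
    {
        "Survived": [0, 1, 1, 0, 1, 0],
        "Pclass": [3, 1, 3, 3, 2, 1],
        "Sex": ["male", "female", "female", "male", "female", "male"],
        "Age": [22.0, 38.0, 26.0, 35.0, None, 54.0],
        "Fare": [7.25, 71.28, 7.92, 8.05, 13.0, 51.86],
    }
)

OUT_DIR = "Graphs"  # folder name from the README
os.makedirs(OUT_DIR, exist_ok=True)


def save_bar(frame, column):
    """Bar chart of category counts for one column."""
    fig, ax = plt.subplots()
    frame[column].value_counts().sort_index().plot(kind="bar", ax=ax)
    ax.set_xlabel(column)
    ax.set_ylabel("Count")
    path = os.path.join(OUT_DIR, f"bar_{column}.png")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
    return path


def save_hist(frame, column, bins=20):
    """Histogram of a numeric column, ignoring missing values."""
    fig, ax = plt.subplots()
    ax.hist(frame[column].dropna(), bins=bins)
    ax.set_xlabel(column)
    ax.set_ylabel("Count")
    path = os.path.join(OUT_DIR, f"hist_{column}.png")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
    return path


def save_heatmap(frame):
    """Heatmap of pairwise Pearson correlations between numeric columns."""
    corr = frame.select_dtypes("number").corr()
    fig, ax = plt.subplots()
    sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    path = os.path.join(OUT_DIR, "correlation_heatmap.png")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
    return path


paths = [save_bar(df, c) for c in ("Sex", "Pclass")]
paths += [save_hist(df, c) for c in ("Age", "Fare")]
paths.append(save_heatmap(df))
print(paths)
```

Closing each figure after saving keeps memory use flat when many charts are generated in one run.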
---
## Cleaning & Processing
- Handled missing values:
  - `Age`: Filled with median
  - `Embarked`: Filled with mode
  - `Cabin`: Dropped (too sparse)
- Removed duplicates
- Detected outliers in `Fare` using IQR
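
The cleaning steps above can be sketched as follows. The `clean_titanic` and `iqr_outliers` helpers and the sample rows are hypothetical, and the 1.5 x IQR fence is the conventional default, assumed here because the README does not state the multiplier used.

```python
import numpy as np
import pandas as pd


def clean_titanic(frame):
    """Apply the README's cleaning steps to a copy of the data."""
    out = frame.copy()
    out["Age"] = out["Age"].fillna(out["Age"].median())                  # median fill
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])  # mode fill
    out = out.drop(columns=["Cabin"])                                    # too sparse
    return out.drop_duplicates().reset_index(drop=True)


def iqr_outliers(series, k=1.5):
    """Boolean mask of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


# Illustrative rows only; the last row duplicates the first.
raw = pd.DataFrame(
    {
        "Age": [22.0, np.nan, 30.0, 40.0, 22.0],
        "Embarked": ["S", "C", None, "S", "S"],
        "Cabin": [np.nan, "C85", np.nan, np.nan, np.nan],
        "Fare": [7.25, 71.28, 8.05, 512.33, 7.25],
    }
)

cleaned = clean_titanic(raw)
mask = iqr_outliers(cleaned["Fare"])
print(cleaned)
print(cleaned[mask])  # rows flagged as Fare outliers
```

The mask only flags outliers, matching the README's wording that they were detected; whether to cap or drop them is a separate decision.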
---
## Key Insights
- The majority of passengers were **male** and in **3rd class**
- **Females had higher survival rates** than males
- **Younger passengers** made up a large share of those aboard
- Strong correlation between **SibSp** and **Parch** (both count family members aboard)
- **Fare** had significant outliers

Full findings are documented in [`TiTanic_EDA_Summery.docx`](./TiTanic_EDA_Summery.docx)
---
## Run It Yourself
```bash
python eda_titanic.py
```
---
## Tools Used
- **Python**
- **Pandas**
- **Matplotlib** & **Seaborn**
- **NumPy**
---
## Author
**Ahsan Khizar**
[GitHub](https://github.com/ahsankhizar5) | [LinkedIn](https://linkedin.com/in/ahsankhizar5)
---
> *"Models may predict prices, but code quality predicts trust."* - Ahsan Khizar