https://github.com/juanpablo70/pgad-exam
Midterm Exam Titanic data set
data-science dataframe dataset jupyter-notebook matplotlib numpy pandas python seaburn
- Host: GitHub
- URL: https://github.com/juanpablo70/pgad-exam
- Owner: JuanPablo70
- Created: 2023-09-02T23:24:31.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-09-03T18:22:10.000Z (almost 3 years ago)
- Last Synced: 2025-03-15T23:43:23.005Z (over 1 year ago)
- Topics: data-science, dataframe, dataset, jupyter-notebook, matplotlib, numpy, pandas, python, seaburn
- Language: Jupyter Notebook
- Homepage:
- Size: 200 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
## Escuela Colombiana de Ingeniería
# Midterm Exam
1. Load the titanic.csv data set.
2. Implement a function from scratch that takes the dataset as an argument and, within the function, displays information about missing values: variable and percentage.
3. Implement a function that imputes (fills in) the missing values of each variable with that variable's mode.
4. For each of the records in the titanic dataframe, change the values in the 'embarked' column (C, S, Q) to the name of the city: Cherbourg, Southampton, and Queenstown, respectively.
5. For each of the records in the titanic dataframe, add an 'AgeGroup' column and fill it according to each passenger's age:
   + Early Childhood: 0-5 years
   + Childhood: 6-11 years
   + Adolescence: 12-18 years
   + Youth: 14-26 years
   + Adulthood: 27-59 years
   + Old Age: 60 years and older
6. For each of the following questions, make one Exploratory Data Analysis (EDA) graph to support your answer:
   a. Can analyzing the fare determine any impact on survival during the journey?
   b. Determine the number of deaths for each of the ports of embarkation. What do you conclude?
   c. Determine at least three variables that correlate with the incidence of survival. What do you conclude?
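
As a rough illustration of steps 2 to 5, the sketch below shows one way the helper functions could look with pandas. It is not the code from `MidtermExam.ipynb`: the function names, the `age` column name, and the constants `PORT_NAMES`, `AGE_EDGES`, and `AGE_LABELS` are assumptions made for this example. Because the listed Adolescence (12-18) and Youth (14-26) ranges overlap, the sketch also assumes that Youth starts at 19 so that every age falls into exactly one group.

```python
import pandas as pd

# Assumed mapping for step 4 (codes and city names come from the exam statement).
PORT_NAMES = {"C": "Cherbourg", "S": "Southampton", "Q": "Queenstown"}

# Assumed bin edges for step 5. Intervals are closed on the left, so a
# passenger aged 5.5 still counts as 5 years old (Early Childhood).
# ASSUMPTION: Youth is treated as 19-26 to resolve the 14-18 overlap.
AGE_EDGES = [0, 6, 12, 19, 27, 60, float("inf")]
AGE_LABELS = ["Early Childhood", "Childhood", "Adolescence",
              "Youth", "Adulthood", "Old Age"]


def missing_report(df):
    """Step 2: print and return the percentage of missing values per variable."""
    percentages = df.isna().mean() * 100
    for column, pct in percentages.items():
        print(f"{column}: {pct:.2f}%")
    return percentages


def impute_with_mode(df):
    """Step 3: return a copy whose missing values are filled with each column's mode."""
    filled = df.copy()
    for column in filled.columns:
        modes = filled[column].mode(dropna=True)
        if not modes.empty:  # a column that is entirely missing has no mode
            filled[column] = filled[column].fillna(modes.iloc[0])
    return filled


def add_port_names(df, column="embarked"):
    """Step 4: return a copy with port codes replaced by city names."""
    out = df.copy()
    out[column] = out[column].replace(PORT_NAMES)
    return out


def add_age_group(df, age_column="age"):
    """Step 5: return a copy with an 'AgeGroup' column derived from age."""
    out = df.copy()
    out["AgeGroup"] = pd.cut(out[age_column], bins=AGE_EDGES,
                             labels=AGE_LABELS, right=False)
    return out


# Toy rows invented for illustration only; they are not taken from titanic.csv.
toy = pd.DataFrame({
    "age": [4, 30, None, 30, 65],
    "embarked": ["C", "S", None, "S", "Q"],
})
missing_report(toy)
result = add_age_group(add_port_names(impute_with_mode(toy)))
print(result)
```

With the real data, the toy frame would be replaced by the result of `pd.read_csv("titanic.csv")` (step 1), and the column names would need to match those in the file. Each helper returns a copy rather than modifying its input, which makes it easier to re-run notebook cells in any order.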
### Prerequisites
+ Python
+ Jupyter Notebook
+ Git
### Installing
To download this project, run the following command:
```
git clone https://github.com/JuanPablo70/PGAD-EXAM.git
```
Open Jupyter Notebook on your computer and open the ```MidtermExam.ipynb``` file.
### Running the notebook
Once you have opened the ```MidtermExam.ipynb``` file, run each cell with ```Shift``` + ```Enter```, or go to the Cell tab and click ```Run All``` to see the results.
### Author
Juan Pablo Sánchez Bermúdez