https://github.com/juanpablo70/pgad-exam

Midterm Exam Titanic data set
data-science dataframe dataset jupyter-notebook matplotlib numpy pandas python seaborn

## Escuela Colombiana de Ingeniería

# Midterm Exam

1. Load the titanic.csv data set.

2. Implement a function from scratch that takes the dataset as an argument and, within the function, displays information about missing values: variable and percentage.

3. Implement a function that allows imputing (filling in missing values) for each variable with its respective mode value.

4. For each of the records in the titanic dataframe, change the values in the 'embarked' column (C, S, Q) to the name of the city: Cherbourg, Southampton, and Queenstown, respectively.

5. For each of the records in the titanic dataframe, add an 'AgeGroup' column and fill it according to each passenger's age:

   + Early Childhood: 0-5 years
   + Childhood: 6-11 years
   + Adolescence: 12-18 years
   + Youth: 14-26 years
   + Adulthood: 27-59 years
   + Old Age: 60 years and older

6. For each of the following questions, make one Exploratory Data Analysis (EDA) graph that supports your answer:

a. Does analyzing the fare reveal any impact on survival during the journey?

b. Determine the number of deaths for each of the ports of embarkation. What do you conclude?

c. Determine at least three variables that correlate with the incidence of survival. What do you conclude?
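Tasks 1 to 5 could be approached with pandas along the following lines. This is a minimal sketch, not the notebook's actual solution. The helper names (`missing_report`, `impute_with_mode`, `rename_ports`, `add_age_group`), the lowercase column names other than 'embarked', and the inline sample rows standing in for titanic.csv are all assumptions made for illustration. Because the Adolescence and Youth ranges overlap, the sketch also assumes that ages 14 to 18 are assigned to Adolescence.

```python
import io

import pandas as pd

# Hypothetical rows standing in for titanic.csv so the sketch runs on its own.
# In the notebook, task 1 would instead be: titanic = pd.read_csv("titanic.csv")
SAMPLE_CSV = """survived,pclass,sex,age,fare,embarked
0,3,male,22,7.25,S
1,1,female,38,71.28,C
1,3,female,,7.92,S
0,3,male,4,8.05,Q
0,1,male,65,51.86,
1,2,female,22,30.07,S
1,2,female,16,13.00,C
"""

PORT_NAMES = {"C": "Cherbourg", "S": "Southampton", "Q": "Queenstown"}

# Left-closed bins: [0, 6), [6, 12), [12, 19), [19, 27), [27, 60), [60, inf).
# Assumption: ages 14-18 fall under Adolescence, so Youth effectively starts at 19.
AGE_BINS = [0, 6, 12, 19, 27, 60, float("inf")]
AGE_LABELS = ["Early Childhood", "Childhood", "Adolescence",
              "Youth", "Adulthood", "Old Age"]


def missing_report(df):
    """Task 2: print each variable with its percentage of missing values."""
    report = {}
    n_rows = len(df)
    for column in df.columns:
        # Count missing entries by hand instead of relying on df.isna().mean().
        n_missing = sum(1 for value in df[column] if pd.isna(value))
        report[column] = 100 * n_missing / n_rows if n_rows else 0.0
        print(f"{column}: {report[column]:.2f}% missing")
    return report


def impute_with_mode(df):
    """Task 3: return a copy with missing values replaced by each column's mode."""
    filled = df.copy()
    for column in filled.columns:
        modes = filled[column].mode(dropna=True)
        if not modes.empty:
            # If several values tie, the first (smallest) mode is used.
            filled[column] = filled[column].fillna(modes.iloc[0])
    return filled


def rename_ports(df):
    """Task 4: replace the port codes C, S, Q with the city names."""
    renamed = df.copy()
    # Codes outside PORT_NAMES (and missing values) are left unchanged.
    renamed["embarked"] = renamed["embarked"].map(
        lambda code: PORT_NAMES.get(code, code)
    )
    return renamed


def add_age_group(df):
    """Task 5: add an 'AgeGroup' column derived from the 'age' column."""
    grouped = df.copy()
    grouped["AgeGroup"] = pd.cut(
        grouped["age"], bins=AGE_BINS, labels=AGE_LABELS, right=False
    )
    return grouped


titanic = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = missing_report(titanic)
titanic = add_age_group(rename_ports(impute_with_mode(titanic)))
print(titanic[["age", "embarked", "AgeGroup"]])
```

Each helper returns a copy rather than modifying its argument, so the steps can be re-run in any notebook cell without reloading the file. Imputing before binning means every passenger receives an age group; skipping the imputation would leave 'AgeGroup' empty wherever the age is missing.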
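For task 6, one possible choice of graphs is a box plot of fare by survival outcome, a bar chart of deaths per port, and a bar chart of each variable's correlation with survival. The sketch below uses matplotlib with a small invented dataframe, so its numbers support no conclusion about the real data. The column names (`survived`, `pclass`, `age`, `fare`) and the three variables chosen for question c are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented rows for illustration only; in the notebook this would be the
# cleaned titanic dataframe with port codes already replaced by city names.
titanic = pd.DataFrame({
    "survived": [0, 1, 1, 0, 0, 1, 1, 0],
    "pclass": [3, 1, 3, 3, 1, 2, 2, 3],
    "age": [22, 38, 26, 4, 65, 22, 16, 30],
    "fare": [7.25, 71.28, 7.92, 8.05, 51.86, 30.07, 13.00, 7.90],
    "embarked": ["Southampton", "Cherbourg", "Southampton", "Queenstown",
                 "Southampton", "Southampton", "Cherbourg", "Southampton"],
})

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# (a) Fare distribution for passengers who died (0) versus survived (1).
fares_by_outcome = [
    titanic.loc[titanic["survived"] == outcome, "fare"].to_numpy()
    for outcome in (0, 1)
]
axes[0].boxplot(fares_by_outcome)
axes[0].set_xticklabels(["Died", "Survived"])
axes[0].set_ylabel("Fare")
axes[0].set_title("Fare by survival outcome")

# (b) Number of deaths per port of embarkation.
deaths_by_port = titanic.loc[titanic["survived"] == 0, "embarked"].value_counts()
deaths_by_port.plot.bar(ax=axes[1], rot=0)
axes[1].set_ylabel("Deaths")
axes[1].set_title("Deaths by port of embarkation")

# (c) Pearson correlation of three numeric variables with survival.
correlations = (
    titanic[["pclass", "age", "fare"]]
    .corrwith(titanic["survived"])
    .sort_values()
)
correlations.plot.barh(ax=axes[2])
axes[2].set_xlabel("Correlation with survived")
axes[2].set_title("Correlation with survival")

# In Jupyter the figure renders inline at the end of the cell; in a plain
# script, call plt.show() or fig.savefig(...) afterwards.
fig.tight_layout()
```

Raw death counts in graph b depend on how many passengers boarded at each port, so comparing death rates (deaths divided by passengers per port) may be a fairer basis for the conclusion.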

### Prerequisites

+ Python
+ Jupyter Notebook
+ Git

### Installing

To download this project, run the following command:

```
git clone https://github.com/JuanPablo70/PGAD-EXAM.git
```

Open Jupyter Notebook on your computer and open the ```MidtermExam.ipynb``` file.

### Running the notebook

Once you have opened the ```MidtermExam.ipynb``` file, run each cell with ```shift``` + ```Enter```, or open the Cell menu and click ```Run All``` to see the results.

### Author

Juan Pablo Sánchez Bermúdez