https://github.com/juanpablo70/pgad-nba-prediction-project

Programming for Data Analysis from Open Data Sources

data-science jupyter-notebook matplotlib mysql numpy pandas pymysql python scipy seaburn sqlalchemy


## Escuela Colombiana de Ingeniería

# Final Project

1. **Data Source Selection**

Select one dataset from the above open data sources or any other relevant data source.

Considerations: The dataset should be rich enough to allow multiple analyses and should be relevant to the student’s interests or our course objectives.

2. **Loading the dataset into a SQL database**

Load the dataset into a relational table designed in a SQL database. The professor must provide the connection information.

Considerations: You can use the MySQL account provided by the professor, which is freely accessible and will remain available until the course ends.
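As a minimal sketch of this step, the snippet below writes a small DataFrame into a relational table with pandas and SQLAlchemy and reads it back. The table name `games`, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the course's MySQL server so the example runs without credentials.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical sample rows; the real project would load its chosen open dataset.
games = pd.DataFrame({
    "team": ["LAL", "BOS", "MIA"],
    "points": [112, 105, 98],
    "won": [1, 1, 0],
})

# In-memory SQLite keeps the sketch runnable anywhere. For a MySQL server the
# URL would take the form "mysql+pymysql://<user>:<password>@<host>/<database>"
# (placeholders, to be filled with the connection details given in the course).
engine = create_engine("sqlite:///:memory:")

# Write the DataFrame into a relational table, replacing it if it already exists.
games.to_sql("games", engine, if_exists="replace", index=False)

# Read the table back to confirm the load worked.
loaded = pd.read_sql(
    "SELECT team, points, won FROM games ORDER BY points DESC", engine
)
print(loaded)
```

Only the connection URL should need to change when switching to MySQL; the `to_sql` and `read_sql` calls stay the same.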

3. **Data Preparation**

Load, clean, and preprocess the selected dataset using programming-based tools and libraries.

Considerations: Ensure the data is ready for analysis, handle missing values, encode categorical variables, and normalize or standardize numerical variables as necessary.
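The considerations above can be sketched with pandas alone. The column names and values here are made up for illustration, and median imputation, one-hot encoding, and z-score standardization are just one reasonable set of choices, not necessarily the ones used in the notebook.

```python
import pandas as pd

# Hypothetical raw data with one missing value per column.
raw = pd.DataFrame({
    "position": ["G", "F", None, "C"],
    "points": [20.0, None, 10.0, 30.0],
})

clean = raw.copy()

# Handle missing values: median for the numeric column, a placeholder label
# for the categorical one.
clean["points"] = clean["points"].fillna(clean["points"].median())
clean["position"] = clean["position"].fillna("unknown")

# Encode the categorical variable as one indicator column per category.
clean = pd.get_dummies(clean, columns=["position"])

# Standardize the numeric variable to zero mean and unit variance
# (population standard deviation, ddof=0).
mean = clean["points"].mean()
std = clean["points"].std(ddof=0)
clean["points_z"] = (clean["points"] - mean) / std

print(clean)
```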

4. **Exploratory Data Analysis (EDA)**

Perform an initial dataset analysis using visualization techniques and statistical methods to gain insights and identify patterns, trends, and potential relationships between variables.

Deliverable: An EDA report containing visualizations and observations about the dataset.
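A first pass at EDA often pairs a distribution plot with a relationship plot. The sketch below uses matplotlib on hypothetical player data (`minutes`, `points`); the `Agg` backend line is only there so the script also runs outside a notebook and can be dropped inside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering; not needed inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-player averages, for illustration only.
players = pd.DataFrame({
    "minutes": [12, 18, 25, 30, 34, 36],
    "points": [4, 7, 11, 15, 21, 24],
})

fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

# Distribution of a single variable.
ax_hist.hist(players["points"], bins=5)
ax_hist.set_title("Points distribution")
ax_hist.set_xlabel("Points per game")
ax_hist.set_ylabel("Players")

# Relationship between two variables.
ax_scatter.scatter(players["minutes"], players["points"])
ax_scatter.set_title("Points vs. minutes")
ax_scatter.set_xlabel("Minutes played")
ax_scatter.set_ylabel("Points per game")

fig.tight_layout()
```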

5. **Descriptive Analysis**

Calculate summary statistics for the dataset, such as means, medians, standard deviations, quartiles, and correlations, to provide a quantitative data description.

Deliverable: A report describing the key statistics of the dataset and their implications.
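The summary statistics listed above map directly onto pandas calls. This is a small sketch on invented numbers: `describe()` covers counts, means, standard deviations, and quartiles (the `50%` row is the median), and `corr()` gives pairwise Pearson correlations.

```python
import pandas as pd

# Hypothetical numeric columns, for illustration only.
stats = pd.DataFrame({
    "points": [10, 20, 30, 40],
    "assists": [2, 4, 6, 8],
})

# Count, mean, std, min, quartiles, and max for every numeric column.
summary = stats.describe()

# Pairwise Pearson correlation matrix.
correlations = stats.corr()

print(summary)
print(correlations)
```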

6. **Inferential Analysis or Predictive Modeling**

Use inferential statistics or machine learning algorithms to make predictions or draw conclusions based on the data.

Deliverable: A report detailing the model selection, evaluation, and interpretation of the results, along with any actionable insights or recommendations.
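One simple way to combine inference and prediction is an ordinary least-squares line fitted with `scipy.stats.linregress`, which returns the slope and intercept together with the correlation coefficient and a p-value for the slope. The data below is invented, and a simple linear model is an assumption for illustration, not necessarily the model used in the notebook.

```python
import numpy as np
from scipy import stats

# Hypothetical data: a predictor (e.g. games played) and an outcome to predict.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Fit y = intercept + slope * x by ordinary least squares.
fit = stats.linregress(x, y)

# Inference: strength of the linear relationship and significance of the slope.
print(f"slope={fit.slope:.3f} intercept={fit.intercept:.3f}")
print(f"r^2={fit.rvalue ** 2:.3f} p-value={fit.pvalue:.4f}")

# Prediction for a new, unseen x value.
x_new = 6.0
y_pred = fit.intercept + fit.slope * x_new
print(f"prediction at x={x_new}: {y_pred:.2f}")
```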

### Prerequisites

+ Python
+ Jupyter Notebook
+ Git

### Installing

To download this project, run the following command:

```
git clone https://github.com/JuanPablo70/PGAD-NBA-Prediction-Project.git
```

Open Jupyter Notebook on your computer and open the ```PGAD_Project.ipynb``` file.

### Running the notebook

Once you have opened the ```PGAD_Project.ipynb``` file, run each cell with ```Shift``` + ```Enter```, or open the Cell menu and click ```Run All``` to see the results.

### Authors

Juan Pablo Sánchez Bermúdez - [JuanPablo70](https://github.com/JuanPablo70)

Juan Camilo Bazurto Arias - [juan-bazurto-eci](https://github.com/juan-bazurto-eci)