https://github.com/juanpablo70/pgad-nba-prediction-project
Programming for Data Analysis from Open Data Sources
data-science jupyter-notebook matplotlib mysql numpy pandas pymysql python scipy seaburn sqlalchemy
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/juanpablo70/pgad-nba-prediction-project
- Owner: JuanPablo70
- Created: 2023-09-03T17:47:59.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-09-03T18:20:16.000Z (over 1 year ago)
- Last Synced: 2024-04-16T17:14:08.288Z (9 months ago)
- Topics: data-science, jupyter-notebook, matplotlib, mysql, numpy, pandas, pymysql, python, scipy, seaburn, sqlalchemy
- Language: Jupyter Notebook
- Homepage:
- Size: 361 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Escuela Colombiana de Ingeniería
# Final Project
1. **Data Source Selection**
Select one dataset from the above open data sources or any other relevant data source.
Considerations: The dataset should be rich enough to allow multiple analyses and should be relevant to the student’s interests or our course objectives.
2. **Loading the dataset into a SQL database**
Load the dataset into a relational table designed in a SQL database. The professor must provide the connection information.
Considerations: You can use the MySQL account provided by the professor, which is freely accessible and will remain available until the course ends.
3. **Data Preparation**
Load, clean, and preprocess the selected dataset using programming-based tools and libraries.
Considerations: Ensure the data is ready for analysis, handle missing values, encode categorical variables, and normalize or standardize numerical variables as necessary.
4. **Exploratory Data Analysis (EDA)**
Perform an initial dataset analysis using visualization techniques and statistical methods to gain insights and identify patterns, trends, and potential relationships between variables.
Deliverable: An EDA report containing visualizations and observations about the dataset.
5. **Descriptive Analysis**
Calculate summary statistics for the dataset, such as means, medians, standard deviations, quartiles, and correlations, to provide a quantitative data description.
Deliverable: A report describing the key statistics of the dataset and their implications.
6. **Inferential Analysis or Predictive Modeling**
Use inferential statistics or machine learning algorithms to make predictions or draw conclusions based on the data.
Deliverable: A report detailing the model selection, evaluation, and interpretation of the results, along with any actionable insights or recommendations.
### Prerequisites
+ Python
+ Jupyter Notebook
+ Git
### Installing
To download this project, run the following command:
```
git clone https://github.com/JuanPablo70/PGAD-NBA-Prediction-Project.git
```
Open Jupyter Notebook on your computer and open the ```PGAD_Project.ipynb``` file.
### Running the notebook
Once you have opened the ```PGAD_Project.ipynb``` file, run each cell with ```Shift``` + ```Enter```, or open the Cell menu and click ```Run All``` to execute the whole notebook and see the results.
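For orientation, the kind of workflow described in the assignment steps above (load a table into SQL, read it back, prepare it, summarize it) can be sketched roughly as follows. This is not the notebook's actual code. The table name `games`, the column names, and the sample rows are hypothetical, and an in-memory SQLite database stands in for the MySQL server used in the course, so the sketch runs without any credentials.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows (assumption); the real project uses its own NBA dataset.
raw = pd.DataFrame(
    {
        "team": ["LAL", "BOS", "LAL", "BOS", "MIA"],
        "points": [110.0, 102.0, None, 98.0, 105.0],
        "home": ["yes", "no", "yes", "yes", "no"],
    }
)

# Loading step (sketch): write the data into a relational table.
# In-memory SQLite is an assumption standing in for the course's MySQL server.
conn = sqlite3.connect(":memory:")
raw.to_sql("games", conn, index=False)

# Preparation step (sketch): read the table back into pandas.
games = pd.read_sql_query("SELECT * FROM games", conn)
conn.close()

# Make sure the numeric column is numeric, then fill the missing value
# with the column median.
games["points"] = pd.to_numeric(games["points"], errors="coerce")
games["points"] = games["points"].fillna(games["points"].median())

# Encode the yes/no categorical column as 1/0.
games["home"] = games["home"].map({"yes": 1, "no": 0})

# Standardize points to zero mean and unit (sample) standard deviation.
games["points_z"] = (games["points"] - games["points"].mean()) / games["points"].std()

# Descriptive step (sketch): summary statistics and one correlation.
summary = games["points"].describe()
correlation = games["points"].corr(games["home"])

print(summary)
print("points vs. home correlation:", correlation)
```

To run against a real MySQL server, the SQLite connection would be replaced by a connection built from the credentials the professor provides. Those details are not given in this README.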
### Authors
Juan Pablo Sánchez Bermúdez - [JuanPablo70](https://github.com/JuanPablo70)
Juan Camilo Bazurto Arias - [juan-bazurto-eci](https://github.com/juan-bazurto-eci)