https://github.com/nobleknightt/dss-tec

# Data Science and Statistics TEC
Files for Data Science and Statistics TEC

## Instructions
- [x] Import data into Jupyter Notebook
- [x] Inspect the data with `head()`, `tail()`, `shape`, `columns`, `dtypes`, and `describe()`
- [x] Apply `describe()` on the following columns
  - `Population growth (annual %)`
  - `GDP growth (annual %)`
  - `Inflation, GDP deflator (annual %)`
  - `Inflation, consumer prices (annual %)`
- [x] Plot a histogram, boxplot, and density curve for each of the 4 variables above and interpret the plots
- [x] Test the null hypothesis that the means of `Inflation, consumer prices (annual %)` and `GDP growth (annual %)` are equal: find the mean of each variable, identify the appropriate test, conduct it, and interpret the result
- [x] Test the null hypothesis that the means of `GDP growth (annual %)`, `Inflation, GDP deflator (annual %)`, `Inflation, consumer prices (annual %)` and `Population growth (annual %)` are equal: find the mean of each variable, identify the appropriate test, conduct it, and interpret the result
- [x] Plot a scatter plot of `GDP growth (annual %)` against `Inflation, GDP deflator (annual %)` and interpret it
- [x] Calculate the correlation and covariance of the 4 variables above and interpret them
- [x] Consider the variables `Exports of goods and services (% of GDP)` and `Imports of goods and services (% of GDP)`: analyze the 2 variables over time and perform trend analysis
- [x] Analyze `CO2 emissions (metric tons per capita)`: compute descriptive statistics, plot a line chart, and perform trend analysis
- [x] Create a DataFrame of the following variables
  - `Population growth (annual %)`
  - `CO2 emissions (metric tons per capita)`
  - `GDP growth (annual %)`
  - `Industry (including construction), value added (% of GDP)`
  - `Exports of goods and services (% of GDP)`
  - `Imports of goods and services (% of GDP)`
  - `Gross capital formation (% of GDP)`
  - `Inflation, consumer prices (annual %)`
- [x] Build the following models with `GDP growth (annual %)` as the dependent variable; for each, interpret R-squared, predict, calculate residuals, and compare the models on RMSE
  - [x] Multiple Linear Regression
  - [x] Decision Tree Regression
  - [x] Random Forest Regression
- [x] Identify all healthcare indicators, create a DataFrame, and analyze it using
  - [x] Descriptive Statistics
  - [x] Data Visualization
  - [x] Both a t-test and ANOVA: choose variables accordingly and interpret the test outcomes
- [x] Identify all education indicators, create a DataFrame, and analyze it using
  - [x] Descriptive Statistics
  - [x] Data Visualization
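
The hypothesis-testing and modelling steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code. Several things here are assumptions: the data is randomly generated as a stand-in for the real indicator table, the paired t-test and one-way ANOVA are just one reasonable choice of tests, and the 25% hold-out split and the model hyperparameters are arbitrary.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# ASSUMPTION: synthetic stand-in data. Replace this with the real
# indicator table (one row per year) loaded in the notebook.
rng = np.random.default_rng(0)
n = 60
df = pd.DataFrame({
    "Population growth (annual %)": rng.normal(1.8, 0.3, n),
    "GDP growth (annual %)": rng.normal(5.5, 3.0, n),
    "Inflation, GDP deflator (annual %)": rng.normal(7.0, 4.0, n),
    "Inflation, consumer prices (annual %)": rng.normal(7.5, 4.5, n),
    "Exports of goods and services (% of GDP)": rng.normal(12.0, 6.0, n),
    "Imports of goods and services (% of GDP)": rng.normal(14.0, 7.0, n),
})
target = "GDP growth (annual %)"

# Two means: a paired t-test is assumed here because both series
# are observed in the same years.
t_stat, t_p = stats.ttest_rel(
    df["Inflation, consumer prices (annual %)"], df[target]
)

# Four means: one-way ANOVA across the four columns (it treats the
# columns as independent groups, which is a simplification).
four = [
    "GDP growth (annual %)",
    "Inflation, GDP deflator (annual %)",
    "Inflation, consumer prices (annual %)",
    "Population growth (annual %)",
]
f_stat, f_p = stats.f_oneway(*(df[c] for c in four))
print(f"t-test p={t_p:.4f}, ANOVA p={f_p:.4f}")

# Regression models with GDP growth as the dependent variable.
X = df.drop(columns=[target])
y = df[target]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Multiple Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(max_depth=3, random_state=0),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
}

rmse = {}
r_squared = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    residuals = y_test - model.predict(X_test)           # observed minus predicted
    rmse[name] = float(np.sqrt(np.mean(residuals ** 2)))  # root mean squared error
    r_squared[name] = float(model.score(X_test, y_test))  # R-squared on held-out rows

for name in models:
    print(f"{name}: R2={r_squared[name]:.3f}, RMSE={rmse[name]:.3f}")
```

Because the stand-in columns are generated independently, the R-squared values printed here carry no meaning. With the real data, the model with the lowest hold-out RMSE would be the one to prefer in the comparison.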