https://github.com/bradleyboehmke/uc-bana-6043

Additional resources for the UC BANA 6043 Statistical Computing course

data-analysis data-science data-visualization python

UC BANA 6043 Statistical Computing
================

**By [Brad Boehmke](https://github.com/bradleyboehmke) 🚀**

Welcome to Statistical Computing with Python! This course provides an intensive, hands-on introduction to statistical computing and data science with the Python programming language. You will gain foundational skills in managing data structures, performing data wrangling, computing and visualizing statistical relationships, managing various environments conducive to statistical analysis, and performing machine learning modeling. Most importantly, because this course has time to introduce only foundational skills, much emphasis is placed on giving you a mental model of Python's data science ecosystem so you know how, when, and where to continue advancing your statistical computing capabilities.

## Learning Objectives

Upon successfully completing this course, you will:

* Have a mental model of the Python data science ecosystem: libraries, capabilities, vocabulary, and widely available Python resources.
* Have the ability to use Python within both interactive (Jupyter, REPL) and non-interactive (scripts) environments.
* Be able to perform core data wrangling activities: importing data, reshaping data, transforming data, and exporting data.
* Be able to compute descriptive statistics and visualize key patterns and relationships in your data.
* Be exposed to modeling via scikit-learn and be able to discuss the fundamentals of building models in Python.
* Have the resources and understanding to continue advancing your statistical computing capabilities.
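
As a rough illustration of the wrangling and descriptive-statistics objectives above, here is a minimal sketch using pandas. The data, column names, and variable names are hypothetical and invented for this example; they are not taken from the course material.

```python
import io

import pandas as pd

# Hypothetical data, invented for illustration (not from the course).
raw = io.StringIO(
    "store,month,sales\n"
    "A,Jan,100\n"
    "A,Feb,120\n"
    "B,Jan,80\n"
    "B,Feb,90\n"
)

# Import: read CSV text into a DataFrame.
df = pd.read_csv(raw)

# Transform: add a derived column (sales in thousands).
df["sales_k"] = df["sales"] / 1000

# Reshape: one row per store, one column per month.
wide = df.pivot(index="store", columns="month", values="sales")

# Descriptive statistics: mean sales per store.
mean_by_store = df.groupby("store")["sales"].mean()

# Export: serialize the reshaped table back to CSV text.
csv_text = wide.to_csv()
```

In practice the import and export steps would usually point at files on disk; an in-memory buffer is used here only to keep the sketch self-contained.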

## Schedule

| Module | Description |
|:-------------:|:--------------------------------------------------------------------|
| **1** | **Starting with the Basics** |
| | Introduction to JupyterLab and the notebook environment |
| | Python fundamentals |
| **2** | **Python Data Science Ecosystem & DataFrames** |
| | Modules, packages, and a preview of Python's data science ecosystem |
| | Importing data and working with DataFrames |
| **3** | **Data Wrangling Part 1** |
| | Subsetting and manipulating data |
| | Computing summary statistics at different levels |
| **4** | **Data Wrangling Part 2** |
| | Tidying and joining data |
| | Handling text data |
| **5** | **Data Visualization** |
| | Higher- and lower-level plotting APIs |
| | Interactive visualizations |
| **6** | **Creating Efficient Code in Python** |
| | Control statements & iteration |
| | Writing functions |
| **7** | **Intro to Machine Learning with scikit-learn** |
| | Basics of the scikit-learn API |
| | Feature engineering and model evaluation/selection |
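
To give a sense of what "basics of the scikit-learn API" in Module 7 might look like, the sketch below walks through the common split, fit, predict, and score pattern. The synthetic data (an exact line, y = 2x + 1) and the choice of `LinearRegression` are assumptions made for illustration, not the course's own example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 points lying exactly on y = 2x + 1.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Hold out 25% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Estimators share the same interface: construct, fit, then predict/score.
model = LinearRegression()
model.fit(X_train, y_train)

preds = model.predict(X_test)       # predictions on the held-out rows
r2 = model.score(X_test, y_test)    # R^2 on the held-out rows
```

Because other scikit-learn estimators follow the same `fit`/`predict`/`score` pattern, swapping in a different model typically changes only the line that constructs it.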

## Getting Started

The primary course material is available in this [Jupyter Book](https://bradleyboehmke.github.io/uc-bana-6043/) :closed_book:.