https://github.com/TomasBeuzen/python-programming-for-data-science

Content from the University of British Columbia's Master of Data Science course DSCI 511.
data-manipulation data-science numpy pandas programming python teaching


# Python Programming for Data Science

**By [Tomas Beuzen](https://www.tomasbeuzen.com/) 🚀**

Welcome to Python Programming for Data Science! With this [website](https://www.tomasbeuzen.com/python-programming-for-data-science/) I aim to provide an introduction to everything you need to know to start using Python for data science. We'll cover topics such as data structures, basic programming, code testing and documentation, and using libraries like NumPy and Pandas for data exploration and analysis.



>If you're interested in learning more about Python packages, check out [**Python Packages**](https://py-pkgs.org/), the book I wrote with [Tiffany Timbers](https://www.tiffanytimbers.com/). Or, if you'd like to learn more about using Python and PyTorch for deep learning, you can check out my other online material, [**Deep Learning with PyTorch**](https://www.tomasbeuzen.com/deep-learning-with-pytorch/).

>The content of this site is adapted from material I used to teach the 2020/2021 offering of the course "DSCI 511 Python Programming for Data Science" for the University of British Columbia's Master of Data Science Program. That material builds on previous course material developed by [Patrick Walls](https://www.math.ubc.ca/~pwalls/) and [Mike Gelbart](https://www.mikegelbart.com/).

## Key Learning Outcomes

These are the key learning outcomes for this material:

1. Translate fundamental programming concepts, such as loops and conditionals, into Python code.
2. Understand the key data structures in Python.
3. Understand how to write functions in Python and assess if they are correct via unit testing.
4. Know when and how to abstract code (e.g., into functions, or classes) to make it more modular and robust.
5. Produce human-readable code that incorporates best practices of programming, documentation, and coding style.
6. Use NumPy to perform common data wrangling and computational tasks in Python.
7. Use Pandas to create and manipulate data structures like Series and DataFrames.
8. Wrangle different types of data in Pandas including numeric data, strings, and datetimes.
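
As a rough illustration of the first and third outcomes, here is a minimal sketch that is not taken from the course material. It writes a small function using a loop and a conditional, then checks it with plain `assert` statements. The function name `count_above` and the sample values are hypothetical, chosen only for this example.

```python
def count_above(values, threshold):
    """Return how many numbers in `values` are strictly greater than `threshold`."""
    count = 0
    for value in values:        # loop over each element
        if value > threshold:   # conditional check
            count += 1
    return count


def test_count_above():
    """Minimal unit test using plain assertions (hypothetical example)."""
    assert count_above([1, 5, 10], 4) == 2
    assert count_above([], 0) == 0
    assert count_above([3, 3, 3], 3) == 0  # strictly greater, so ties do not count


# Calling the test directly is enough for a quick check.
test_count_above()
```

If every assertion holds, the script finishes silently. A failing assertion raises an `AssertionError` that points at the offending line.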

## Getting Started

The material on this site is written in Jupyter notebooks and rendered using [Jupyter Book](https://jupyterbook.org/intro.html) to make it easily accessible. However, if you wish to run these notebooks on your local machine, you can do the following:

1. Clone the GitHub repository:
```sh
git clone https://github.com/TomasBeuzen/python-programming-for-data-science.git
```
2. Move into the repository and create the conda environment by typing the following in your terminal:
```sh
cd python-programming-for-data-science
conda env create -f py4ds.yaml
```
3. Activate the environment you just created, then open the course in JupyterLab by typing the following in your terminal:
```sh
jupyter lab
```

>If you're not comfortable with `git`, `GitHub` or `conda`, feel free to just read through the material on this website - you're not missing out on anything!