https://github.com/janpipek/eda-polars-way
Tutorial: Exploratory Data Analysis, the Polars Way
- Host: GitHub
- URL: https://github.com/janpipek/eda-polars-way
- Owner: janpipek
- License: mit
- Created: 2024-04-17T09:13:01.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2025-04-06T11:56:08.000Z (about 1 year ago)
- Last Synced: 2025-10-11T05:37:32.634Z (8 months ago)
- Language: Jupyter Notebook
- Size: 34.2 MB
- Stars: 22
- Watchers: 1
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Tutorial: Exploratory Data Analysis, the Polars Way
(as given at [PyCon Italia 2024](https://2024.pycon.it/), [EuroPython 2024](https://ep2024.europython.eu/), and [PyCon Austria 2025](https://pycon.pyug.at/en/)).
## Preparation
Please prepare a Python environment that you can use during the workshop.
We will work in Jupyter Notebook. However, you can also use JupyterLab or an IDE such as
[Visual Studio Code](https://code.visualstudio.com) or [PyCharm](https://www.jetbrains.com/pycharm/).
### Clone this repository
```shell
git clone https://github.com/janpipek/eda-polars-way.git
```
or using the `gh` client:
```shell
gh repo clone janpipek/eda-polars-way
```
Alternatively, you can download the repo as a ZIP archive from here:
https://github.com/janpipek/eda-polars-way/archive/refs/heads/main.zip
### Prepare Python Environment
Python version 3.10+ is required.
First, `cd` into the repository directory:
```shell
cd eda-polars-way
```
#### uv "installation"
If you have `uv` available, you do not have to install anything else. Just run the appropriate `uv run ...` commands.
#### Pip installation
The included `requirements.txt` file should be enough for you to set up a Python environment
using the `pip` command. First create the environment (just once):
```shell
python -m venv .venv
# Activate the environment (every time you open a new shell).
# Run only the line that matches your OS:
source .venv/bin/activate       # Linux, macOS
# .venv\Scripts\activate.bat    # Windows (cmd.exe)
# Install the required packages
python -m pip install -r requirements.txt
```
You will have to repeat the activation step whenever you start a new command-line shell.
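To confirm that the setup worked, a short script along these lines can help. It is only a sketch: the module names `polars` and `notebook` are assumptions about what `requirements.txt` installs, and `check_environment` is a hypothetical helper, not part of this repository.

```python
import importlib.util
import sys

# Minimum version stated in this README.
MIN_PYTHON = (3, 10)

# Assumed import names; check requirements.txt for the real package list.
ASSUMED_MODULES = ("polars", "notebook")


def check_environment(modules=ASSUMED_MODULES):
    """Return a dict of simple yes/no checks about the current interpreter."""
    report = {
        "python_ok": sys.version_info >= MIN_PYTHON,
        # Inside a venv, sys.prefix points to the environment and
        # sys.base_prefix to the interpreter it was created from.
        "venv_active": sys.prefix != sys.base_prefix,
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[f"has_{name}"] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for check, passed in check_environment().items():
        print(f"{check}: {'yes' if passed else 'NO'}")
```

If `venv_active` comes out as `NO`, the activation step was most likely skipped in the current shell.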
### (Absolutely lazy) on-line environment
This is not recommended, but it works in case you have problems installing on your laptop.
Create an account at https://deepnote.com (for free) and launch the repo by clicking the button:
[Launch in Deepnote](https://deepnote.com/launch?url=https%3A%2F%2Fgithub.com%2Fjanpipek%2Feda-polars-way)
Note that you will have to install additional packages (there is a command you need to uncomment).
## How to use this repo
All contents (a bit of text + all exercises) are located in `exercises.ipynb`. The exercises are partly filled in and accompanied by hints. If you are still unsure, `solutions.ipynb` contains working code that answers the questions. To help the SQL-savvy, the `solutions-sql.ipynb` file contains solutions using the SQL API of Polars.
### Run the notebook
```shell
uv run jupyter notebook
```
or (if you installed via pip and activated your environment)
```shell
jupyter notebook
```
## Data sources
All the data sources are believed to be open and publicly distributable,
see `data/README.md` for more details.
## Useful links
### Official documentation
- [Polars User Guide](https://docs.pola.rs/)
- [Polars API reference](https://docs.pola.rs/py-polars/html/reference/index.html)
### Articles
- [Python Polars: A Lightning-Fast DataFrame Library](https://realpython.com/polars-python/) @ Real Python
### Talks & videos
- [R. Vink: What polars does for you](https://www.youtube.com/watch?v=UwRlFtSd_-8), EuroPython 2023
- [M. Harrison: Getting Started with Polars](https://www.youtube.com/watch?v=CJ0f45evuME), PyCon US 2023
### On-line courses
- https://www.udemy.com/course/data-analysis-with-polars/