Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/markfairbanks/tidypolars
Tidy interface to polars
- Host: GitHub
- URL: https://github.com/markfairbanks/tidypolars
- Owner: markfairbanks
- License: mit
- Created: 2021-09-30T23:12:40.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2024-10-29T21:32:21.000Z (15 days ago)
- Last Synced: 2024-10-29T23:45:37.784Z (15 days ago)
- Language: Python
- Homepage: http://tidypolars.readthedocs.io
- Size: 725 KB
- Stars: 342
- Watchers: 8
- Forks: 11
- Open Issues: 18
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# tidypolars
[![PyPI Latest Release](https://img.shields.io/pypi/v/tidypolars.svg)](https://pypi.org/project/tidypolars/)
[![conda-forge](https://anaconda.org/conda-forge/tidypolars/badges/version.svg)](https://anaconda.org/conda-forge/tidypolars)

tidypolars is a data frame library built on top of the blazingly fast [polars](https://github.com/pola-rs/polars) library that gives access to methods and functions familiar to R tidyverse users.
## Installation
You can install tidypolars with `pip`:

```bash
$ pip install tidypolars
```

Or through `conda`:

```bash
$ conda install -c conda-forge tidypolars
```

### General syntax
tidypolars methods are designed to work like tidyverse functions:
```python
import tidypolars as tp
from tidypolars import col, desc

df = tp.tibble(x = range(3), y = range(3, 6), z = ['a', 'a', 'b'])

(
    df
    .select('x', 'y', 'z')
    .filter(col('x') < 4, col('y') > 1)
    .arrange(desc('z'), 'x')
    .mutate(double_x = col('x') * 2,
            x_plus_y = col('x') + col('y'))
)
```

```
┌─────┬─────┬─────┬──────────┬──────────┐
│ x   ┆ y   ┆ z   ┆ double_x ┆ x_plus_y │
│ --- ┆ --- ┆ --- ┆ ---      ┆ ---      │
│ i64 ┆ i64 ┆ str ┆ i64      ┆ i64      │
╞═════╪═════╪═════╪══════════╪══════════╡
│ 2   ┆ 5   ┆ b   ┆ 4        ┆ 7        │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 0   ┆ 3   ┆ a   ┆ 0        ┆ 3        │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│ 1   ┆ 4   ┆ a   ┆ 2        ┆ 5        │
└─────┴─────┴─────┴──────────┴──────────┘
```

The key difference from R is that column names must be wrapped in `col()` in the following methods:
* `.filter()`
* `.mutate()`
* `.summarize()`

The general idea: when doing calculations on a column you need to wrap it in `col()`. When doing simple column selections (like in `.select()`) you can pass the column names as strings.
A full list of functions can be found [here](https://tidypolars.readthedocs.io/en/latest/reference.html).
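
For readers who have not used the tidyverse verbs, the chain in the example above can be read as the plain-Python steps below. This is only a sketch of the semantics on a list of dicts. It does not use tidypolars or polars, the variable names are made up for illustration, and it assumes that multiple conditions passed to `.filter()` are combined with AND, as in dplyr.

```python
# Plain-Python reading of the select/filter/arrange/mutate chain.
# Not tidypolars code: rows are dicts and every name here is illustrative.
rows = [
    {"x": x, "y": y, "z": z}
    for x, y, z in zip(range(3), range(3, 6), ["a", "a", "b"])
]

# .filter(col('x') < 4, col('y') > 1): keep rows where every condition holds
# (assumption: conditions are AND-ed together).
kept = [r for r in rows if r["x"] < 4 and r["y"] > 1]

# .arrange(desc('z'), 'x'): sort by z descending, then x ascending.
# Python's sort is stable, so sort by the secondary key first.
kept.sort(key=lambda r: r["x"])
kept.sort(key=lambda r: r["z"], reverse=True)

# .mutate(...): add new columns computed from existing ones.
result = [
    {**r, "double_x": r["x"] * 2, "x_plus_y": r["x"] + r["y"]}
    for r in kept
]

for r in result:
    print(r)
```

The rows come out in the same order and with the same values as the printed tibble: the `b` row first, then the two `a` rows ordered by `x`.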
### Group by syntax
Methods operate by group when you pass the `_by` arg.
* A single column can be passed with `_by = 'z'`
* Multiple columns can be passed with `_by = ['y', 'z']`

```python
(
    df
    .summarize(avg_x = tp.mean(col('x')),
               _by = 'z')
)
```

```
┌─────┬───────┐
│ z   ┆ avg_x │
│ --- ┆ ---   │
│ str ┆ f64   │
╞═════╪═══════╡
│ a   ┆ 0.5   │
├╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ b   ┆ 2     │
└─────┴───────┘
```

### Selecting/dropping columns
tidyselect functions can be mixed with normal selection when selecting columns:
```python
df = tp.tibble(x1 = range(3), x2 = range(3), y = range(3), z = range(3))

df.select(tp.starts_with('x'), 'z')
```

```
┌─────┬─────┬─────┐
│ x1  ┆ x2  ┆ z   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 0   ┆ 0   ┆ 0   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 1   ┆ 1   ┆ 1   │
├╌╌╌╌╌┼╌╌╌╌╌┼╌╌╌╌╌┤
│ 2   ┆ 2   ┆ 2   │
└─────┴─────┴─────┘
```

To drop columns use the `.drop()` method:
```python
df.drop(tp.starts_with('x'), 'z')
```

```
┌─────┐
│ y   │
│ --- │
│ i64 │
╞═════╡
│ 0   │
├╌╌╌╌╌┤
│ 1   │
├╌╌╌╌╌┤
│ 2   │
└─────┘
```

### Converting to/from pandas data frames
If you need to use a package that requires pandas data frames, you can convert from a tidypolars `tibble` to a pandas `DataFrame`.

To do this you'll first need to install pyarrow:
```bash
pip install pyarrow
```

To convert to a pandas `DataFrame`:
```python
df = df.as_pandas()
```

To convert from a pandas `DataFrame` to a tidypolars `tibble`:
```python
df = tp.as_tibble(df)
```

## Contributing
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.