https://github.com/drhagen/tabeline
User-friendly data frame and data grammar library for Python
- Host: GitHub
- URL: https://github.com/drhagen/tabeline
- Owner: drhagen
- License: MIT
- Created: 2022-02-12T20:52:33.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2024-08-19T18:23:38.000Z (4 months ago)
- Last Synced: 2024-09-18T00:06:10.744Z (3 months ago)
- Topics: data-grammar, data-table, dplyr
- Language: Python
- Homepage: https://tabeline.drhagen.com
- Size: 1.61 MB
- Stars: 14
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: docs/contributing.md
- License: LICENSE
README
# Tabeline
Tabeline is a data frame and data grammar library. You write the expressions in strings and supply them to methods on the `DataFrame` class. The strings are parsed by Parsita and converted into Polars for execution.
Tabeline draws inspiration from dplyr, the data grammar of R's tidyverse, especially for its method names. The `filter`, `mutate`, `group_by`, and `summarize` methods should all feel familiar. But Tabeline is as proper a Python library as can be, using method chaining instead of the pipes that are standard in R.
Tabeline uses Polars under the hood, but adds handling for many Polars edge cases that would otherwise result in crashes or behavior that is not type stable.
See the [Documentation](https://tabeline.drhagen.com) for the full user guide.
## Installation
It is recommended to install Tabeline from PyPI using `pip`.
```shell
pip install tabeline
```

## Motivating example

```python
from tabeline import DataFrame

# Construct a data frame using clean syntax
# from_csv, from_pandas, and from_polars are also available
df = DataFrame(
    id=[0, 0, 0, 0, 1, 1, 1, 1, 1],
    t=[0, 6, 12, 24, 0, 6, 12, 24, 48],
    y=[0, 2, 3, 1, 0, 4, 3, 2, 1],
)

# Use data grammar methods and string expressions to define
# transformed data frames
analysis = (
    df
    .filter("t <= 24")
    .group_by("id")
    .summarize(auc="trapz(t, y)")
)

print(analysis)
# shape: (2, 2)
# ┌─────┬──────┐
# │ id  ┆ auc  │
# │ --- ┆ ---  │
# │ i64 ┆ f64  │
# ╞═════╪══════╡
# │ 0   ┆ 45.0 │
# ├╌╌╌╌╌┼╌╌╌╌╌╌┤
# │ 1   ┆ 63.0 │
# └─────┴──────┘
```
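
To see where the `auc` values come from, the same filter, group, and summarize steps can be sketched in plain Python without Tabeline. This is only an illustration: it assumes that `trapz(t, y)` applies the trapezoidal rule to `y` over `t`, which is consistent with the printed output, and the helper name `trapezoid_area` is hypothetical rather than part of Tabeline.

```python
# Plain-Python sketch of the pipeline above (no Tabeline or Polars needed).
# Assumption: trapz(t, y) is the trapezoidal rule; trapezoid_area is a
# hypothetical helper written for this illustration.

def trapezoid_area(t, y):
    """Sum the areas of the trapezoids between consecutive (t, y) points."""
    area = 0.0
    for i in range(1, len(t)):
        area += (t[i] - t[i - 1]) * (y[i] + y[i - 1]) / 2
    return area

ids = [0, 0, 0, 0, 1, 1, 1, 1, 1]
ts = [0, 6, 12, 24, 0, 6, 12, 24, 48]
ys = [0, 2, 3, 1, 0, 4, 3, 2, 1]

# filter("t <= 24"): keep only the rows where t is at most 24
kept = [(i, t, y) for i, t, y in zip(ids, ts, ys) if t <= 24]

# group_by("id"): collect the t and y values of each id
groups = {}
for i, t, y in kept:
    group_t, group_y = groups.setdefault(i, ([], []))
    group_t.append(t)
    group_y.append(y)

# summarize(auc=...): reduce each group to a single number
auc = {i: trapezoid_area(t, y) for i, (t, y) in groups.items()}

print(auc)  # {0: 45.0, 1: 63.0}
```

For `id` 0, the three trapezoids contribute 6 + 15 + 24 = 45; for `id` 1, the row at `t = 48` is filtered out and the remaining trapezoids contribute 12 + 21 + 30 = 63.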