Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pwwang/datar
A Grammar of Data Manipulation in python
A Grammar of Data Manipulation in python
- Host: GitHub
- URL: https://github.com/pwwang/datar
- Owner: pwwang
- License: mit
- Created: 2020-11-28T07:10:06.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2024-03-14T04:14:34.000Z (8 months ago)
- Last Synced: 2024-04-14T20:10:40.508Z (7 months ago)
- Topics: data-manipulation, dplyr, forcats, groupby, pandas, tibble, tidyr, tidyverse, tribble
- Language: Python
- Homepage: https://pwwang.github.io/datar/
- Size: 21.4 MB
- Stars: 255
- Watchers: 11
- Forks: 17
- Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- jimsghstars - pwwang/datar - A Grammar of Data Manipulation in python (Python)
README
# datar
A Grammar of Data Manipulation in python
[![Pypi][6]][7] [![Github][8]][9] ![Building][10] [![Docs and API][11]][5] [![Codacy][12]][13] [![Codacy coverage][14]][13] [![Downloads][20]][7]
[Documentation][5] | [Reference Maps][15] | [Notebook Examples][16] | [API][17]
`datar` is a re-imagining of data manipulation APIs in Python, with support for multiple backends. The APIs follow the tidyverse packages in R as closely as possible.
## Installation
```shell
pip install -U datar

# install with a backend
pip install -U datar[pandas]

# More backends support coming soon
```

## Backends
|Repo|Badges|
|-|-|
|[datar-numpy][1]|![3] ![18]|
|[datar-pandas][2]|![4] ![19]|
|[datar-arrow][22]|![23] ![24]|

## Example usage
```python
# with pandas backend
from datar import f
from datar.dplyr import mutate, filter_, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter_, if_else, tibble

df = tibble(
    x=range(4),  # or c[:4]  (from datar.base import c)
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
   x      y  z
0  0   zero  0
1  1    one  1
2  2    two  2
3  3  three  3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
   x      y  z
0  0   zero  0
1  1    one  0
2  2    two  1
3  3  three  1
"""

df >> filter_(f.x>1)
"""# output:
   x      y
0  2    two
1  3  three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter_(f.z==1)
"""# output:
   x      y  z
0  2    two  1
1  3  three  1
"""
```

```python
# works with plotnine
# example grabbed from https://github.com/has2k1/plydata
import numpy
from datar import f
from datar.base import sin, pi
from datar.tibble import tibble
from datar.dplyr import mutate, if_else
from plotnine import ggplot, aes, geom_line, theme_classic

df = tibble(x=numpy.linspace(0, 2 * pi, 500))
(
    df
    >> mutate(y=sin(f.x), sign=if_else(f.y >= 0, "positive", "negative"))
    >> ggplot(aes(x="x", y="y"))
    + theme_classic()
    + geom_line(aes(color="sign"), size=1.2)
)
```

![example](./example.png)
```python
# very easy to integrate with other libraries
# for example: klib
import klib
from pipda import register_verb
from datar import f
from datar.data import iris
from datar.dplyr import pull

dist_plot = register_verb(func=klib.dist_plot)
iris >> pull(f.Sepal_Length) >> dist_plot()
```

![example](./example2.png)
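
The `>>` piping in the examples above can be understood through Python's reflected right-shift hook, `__rrshift__`. The block below is a minimal, standard-library-only sketch of that general idea. It is not how `datar` or `pipda` is actually implemented, and `Verb`, `verb`, `add_one`, and `times` are hypothetical names invented for this illustration.

```python
from functools import wraps


class Verb:
    """A deferred call: stores a function and the arguments that follow the data."""

    def __init__(self, func, args, kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # For `data >> verb_call`, Python falls back to
        # `verb_call.__rrshift__(data)` when `data` cannot handle `>>` itself.
        return self.func(data, *self.args, **self.kwargs)


def verb(func):
    """Hypothetical decorator: calling the function returns a deferred Verb."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        return Verb(func, args, kwargs)

    return wrapper


@verb
def add_one(values):
    # The piped-in data arrives as the first argument.
    return [v + 1 for v in values]


@verb
def times(values, factor):
    return [v * factor for v in values]


# Each step receives the result of the previous one as its data argument.
result = [1, 2, 3] >> add_one() >> times(10)
print(result)  # [20, 30, 40]
```

In this sketch the data is a plain list, which defines no `>>` of its own, so the fallback to `__rrshift__` always fires. Objects that already define `>>` would need extra handling, which is outside the scope of this illustration.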
## Testimonials
[@coforfe](https://github.com/coforfe):
> Thanks for your excellent package to port R (`dplyr`) flow of processing to Python. I have been using other alternatives, and yours is the one that offers the most extensive and equivalent to what is possible now with `dplyr`.

[1]: https://github.com/pwwang/datar-numpy
[2]: https://github.com/pwwang/datar-pandas
[3]: https://img.shields.io/codacy/coverage/0a7519dad44246b6bab30576895f6766?style=flat-square
[4]: https://img.shields.io/codacy/coverage/45f4ea84ae024f1a8cf84be54dd144f7?style=flat-square
[5]: https://pwwang.github.io/datar/
[6]: https://img.shields.io/pypi/v/datar?style=flat-square
[7]: https://pypi.org/project/datar/
[8]: https://img.shields.io/github/v/tag/pwwang/datar?style=flat-square
[9]: https://github.com/pwwang/datar
[10]: https://img.shields.io/github/actions/workflow/status/pwwang/datar/ci.yml?branch=master&style=flat-square
[11]: https://img.shields.io/github/actions/workflow/status/pwwang/datar/docs.yml?branch=master&style=flat-square
[12]: https://img.shields.io/codacy/grade/3d9bdff4d7a34bdfb9cd9e254184cb35?style=flat-square
[13]: https://app.codacy.com/gh/pwwang/datar
[14]: https://img.shields.io/codacy/coverage/3d9bdff4d7a34bdfb9cd9e254184cb35?style=flat-square
[15]: https://pwwang.github.io/datar/reference-maps/ALL/
[16]: https://pwwang.github.io/datar/notebooks/across/
[17]: https://pwwang.github.io/datar/api/datar/
[18]: https://img.shields.io/pypi/v/datar-numpy?style=flat-square
[19]: https://img.shields.io/pypi/v/datar-pandas?style=flat-square
[20]: https://img.shields.io/pypi/dm/datar?style=flat-square
[21]: https://github.com/tidyverse/dplyr
[22]: https://github.com/pwwang/datar-arrow
[23]: https://img.shields.io/codacy/coverage/5f4ef9dd2503437db18786ff9e841d8b?style=flat-square
[24]: https://img.shields.io/pypi/v/datar-arrow?style=flat-square