{"id":13857136,"url":"https://github.com/has2k1/plydata","last_synced_at":"2025-04-04T23:09:22.796Z","repository":{"id":56387854,"uuid":"87887605","full_name":"has2k1/plydata","owner":"has2k1","description":"A grammar for data manipulation in Python","archived":false,"fork":false,"pushed_at":"2023-09-19T20:24:47.000Z","size":826,"stargazers_count":276,"open_issues_count":10,"forks_count":11,"subscribers_count":18,"default_branch":"master","last_synced_at":"2024-10-14T13:11:24.053Z","etag":null,"topics":["data-manipulation","pandas","python"],"latest_commit_sha":null,"homepage":"https://plydata.readthedocs.io/en/stable/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/has2k1.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2017-04-11T04:17:16.000Z","updated_at":"2024-09-26T14:48:37.000Z","dependencies_parsed_at":"2024-02-09T01:41:28.239Z","dependency_job_id":"e2fcc555-aa02-4c96-a45b-f5473a28c192","html_url":"https://github.com/has2k1/plydata","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/has2k1%2Fplydata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/has2k1%2Fplydata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/has2k1%2Fplydata/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/has2k1%2Fplydata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/has2k1","download_url":"https://codeload.githu
b.com/has2k1/plydata/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247261612,"owners_count":20910108,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-manipulation","pandas","python"],"created_at":"2024-08-05T03:01:27.244Z","updated_at":"2025-04-04T23:09:22.766Z","avatar_url":"https://github.com/has2k1.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"#######\nplydata\n#######\n\n=========================    =======================\nLatest Release               |release|_\nLicense                      |license|_\nBuild Status                 |buildstatus|_\nCoverage                     |coverage|_\nDocumentation (Dev)          |documentation|_\nDocumentation (Release)      |documentation_stable|_\n=========================    =======================\n\nplydata is a library that provides a grammar for data manipulation.\nThe grammar consists of verbs that can be applied to pandas\ndataframes or database tables. It is based on the R packages\n`dplyr`_, `tidyr`_ and `forcats`_. plydata uses the ``\u003e\u003e`` operator\nas a pipe symbol. Alternatively, you can use the ``ply(data, *verbs)``\nfunction instead of ``\u003e\u003e``.\n\nAt present the only supported data store is the *pandas* dataframe.\nWe expect to support *sqlite* and maybe *postgresql* and *mysql*.\n\nInstallation\n============\nplydata **only** supports Python 3.\n\n**Official version**\n\n.. code-block:: console\n\n   $ pip install plydata\n\n\n**Development version**\n\n.. 
code-block:: console\n\n   $ pip install git+https://github.com/has2k1/plydata.git@master\n\n\nExample\n-------\n\n.. code-block:: python\n\n    import pandas as pd\n    import numpy as np\n    from plydata import define, query, if_else, ply\n\n    # NOTE: query is the equivalent of dplyr's filter but with\n    #      slightly different python syntax for the expressions\n\n    df = pd.DataFrame({\n        'x': [0, 1, 2, 3],\n        'y': ['zero', 'one', 'two', 'three']})\n\n    df \u003e\u003e define(z='x')\n    \"\"\"\n       x      y  z\n    0  0   zero  0\n    1  1    one  1\n    2  2    two  2\n    3  3  three  3\n    \"\"\"\n\n    df \u003e\u003e define(z=if_else('x \u003e 1', 1, 0))\n    \"\"\"\n       x      y  z\n    0  0   zero  0\n    1  1    one  0\n    2  2    two  1\n    3  3  three  1\n    \"\"\"\n\n    # You can pass the dataframe as the first argument\n    query(df, 'x \u003e 1')  # same as `df \u003e\u003e query('x \u003e 1')`\n    \"\"\"\n       x      y\n    2  2    two\n    3  3  three\n    \"\"\"\n\n    # You can use the ply function instead of the \u003e\u003e operator\n    ply(df,\n        define(z=if_else('x \u003e 1', 1, 0)),\n        query('z == 1')\n    )\n    \"\"\"\n        x      y  z\n     2  2    two  1\n     3  3  three  1\n    \"\"\"\n\n\n    # The \u003e\u003e= operator can be used to modify the dataframe\n    # if there is a single operation\n    df \u003e\u003e= define(two_x='2*x')\n    df\n    \"\"\"\n        x      y  two_x\n     0  0   zero      0\n     1  1    one      2\n     2  2    two      4\n     3  3  three      6\n    \"\"\"\n\n    # df \u003e\u003e= define(two_x='2*x') \u003e\u003e define(three_x='3*x')\n    # is two operations and does not work\n\n\nplydata piping works with `plotnine`_.\n\n.. 
code-block:: python\n\n    from plotnine import ggplot, aes, geom_line\n\n    df = pd.DataFrame({'x': np.linspace(0, 2*np.pi, 500)})\n    (df\n     \u003e\u003e define(y='np.sin(x)')\n     \u003e\u003e define(sign=if_else('y \u003e= 0', '\"positive\"', '\"negative\"'))\n     \u003e\u003e (ggplot(aes('x', 'y'))\n         + geom_line(aes(color='sign'), size=1.5))\n     )\n\n.. figure:: ./doc/images/readme-image.png\n\nWhat about dplython or pandas-ply?\n----------------------------------\n\n`dplython`_ and `pandas-ply`_ are two other packages that have a similar\nobjective to plydata. The big difference is plydata does not use\na placeholder variable (`X`) as a stand-in for the dataframe. For example:\n\n.. code-block:: python\n\n    diamonds \u003e\u003e select(X.carat, X.cut, X.price)  # dplython\n\n    diamonds \u003e\u003e select('carat', 'cut', 'price')  # plydata\n    select(diamonds, 'carat', 'cut', 'price')    # plydata\n\nFor more, see the documentation_.\n\n.. |release| image:: https://img.shields.io/pypi/v/plydata.svg\n.. _release: https://pypi.python.org/pypi/plydata\n\n.. |license| image:: https://img.shields.io/pypi/l/plydata.svg\n.. _license: https://pypi.python.org/pypi/plydata\n\n.. |buildstatus| image:: https://github.com/has2k1/plydata/workflows/build/badge.svg?branch=master\n.. _buildstatus: https://github.com/has2k1/plydata/actions?query=branch%3Amaster+workflow%3A%22build%22\n\n.. |coverage| image:: https://codecov.io/github/has2k1/plydata/coverage.svg?branch=master\n.. _coverage: https://codecov.io/github/has2k1/plydata?branch=master\n\n.. |documentation| image:: https://readthedocs.org/projects/plydata/badge/?version=latest\n.. _documentation: https://plydata.readthedocs.io/en/latest/\n\n.. |documentation_stable| image:: https://readthedocs.org/projects/plydata/badge/?version=stable\n.. _documentation_stable: https://plydata.readthedocs.io/en/stable/\n\n.. _dplyr: https://github.com/tidyverse/dplyr\n.. 
_tidyr: https://github.com/tidyverse/tidyr\n.. _forcats: https://github.com/tidyverse/forcats\n.. _pandas-ply: https://github.com/coursera/pandas-ply\n.. _dplython: https://github.com/dodger487/dplython\n.. _plotnine: https://plotnine.readthedocs.io/en/stable/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhas2k1%2Fplydata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhas2k1%2Fplydata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhas2k1%2Fplydata/lists"}