https://github.com/jeffzi/pandas-select
Supercharged pandas indexing
pandas pandera python scikit-learn
- Host: GitHub
- URL: https://github.com/jeffzi/pandas-select
- Owner: jeffzi
- License: bsd-3-clause
- Created: 2020-01-14T13:10:49.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-03-28T23:29:53.000Z (over 3 years ago)
- Last Synced: 2024-10-09T09:25:58.067Z (28 days ago)
- Topics: pandas, pandera, python, scikit-learn
- Language: Python
- Homepage: https://pandas-select.readthedocs.io/
- Size: 171 KB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
==================================================
``pandas-select``: Supercharged DataFrame indexing
==================================================

.. image:: https://github.com/jeffzi/pandas-select/workflows/tests/badge.svg
   :target: https://github.com/jeffzi/pandas-select/actions
   :alt: GitHub Actions status

.. image:: https://codecov.io/gh/jeffzi/pandas-select/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/jeffzi/pandas-select
   :alt: Coverage

.. image:: https://readthedocs.org/projects/project-template-python/badge/?version=stable
   :target: https://pandas-select.readthedocs.io/
   :alt: Documentation status

.. image:: https://img.shields.io/pypi/v/pandas-select.svg
   :target: https://pypi.org/project/pandas-select/
   :alt: Latest PyPI version

.. image:: https://img.shields.io/pypi/pyversions/pandas-select.svg
   :target: https://pypi.org/project/pandas-select/
   :alt: Python versions supported

.. image:: https://img.shields.io/pypi/l/pandas-select.svg
   :target: https://pypi.python.org/pypi/pandas-select/
   :alt: License

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/psf/black
   :alt: Code style: black

.. image:: https://img.shields.io/badge/style-wemake-000000.svg
   :target: https://github.com/wemake-services/wemake-python-styleguide

``pandas-select`` is a collection of DataFrame selectors that facilitates indexing
and selecting data, fully compatible with pandas vanilla indexing.

The selector functions can choose variables based on their
name,
data type,
arbitrary conditions,
or any combination of these.

``pandas-select`` is inspired by the excellent R library tidyselect.

.. installation-start

Installation
------------

``pandas-select`` is a Python-only package `hosted on PyPI <https://pypi.org/project/pandas-select/>`_.
It can be installed via pip:

.. code-block:: console

   pip install pandas-select

.. installation-end

Design goals
------------

* Fully compatible with the ``pandas.DataFrame`` ``[]`` operator and the
  ``pandas.DataFrame.loc`` accessor.

* Emphasise readability and conciseness by cutting boilerplate:

  .. code-block:: python

     # pandas-select
     df[AllNumeric()]
     # vanilla
     df.select_dtypes("number").columns

     # pandas-select
     df[StartsWith("Type") | "Legendary"]
     # vanilla
     df.loc[:, df.columns.str.startswith("Type") | (df.columns == "Legendary")]

* Ease the challenges of indexing with a hierarchical index
  and offer an alternative to slicers
  when the labels cannot be listed manually.

  .. code-block:: python

     # pandas-select
     df_mi.loc[Contains("Jeff", axis="index", level="Name")]

     # vanilla
     df_mi.loc[df_mi.index.get_level_values("Name").str.contains("Jeff")]

* Play well with machine learning applications.

  - Respect the column order.
  - Allow *deferred selection* when the DataFrame's columns are not known in advance,
    for example in automated machine learning applications.
  - Offer integration with sklearn.

  .. code-block:: python

     from pandas_select import AnyOf, AllBool, AllNominal, AllNumeric, ColumnSelector
     from sklearn.compose import make_column_transformer
     from sklearn.preprocessing import OneHotEncoder, StandardScaler

     ct = make_column_transformer(
         (StandardScaler(), ColumnSelector(AllNumeric() & ~AnyOf("Generation"))),
         (OneHotEncoder(), ColumnSelector(AllNominal() | AllBool() | "Generation")),
     )
     ct.fit_transform(df)

Project Information
-------------------

``pandas-select`` is released under the BSD 3-Clause license,
its documentation lives at `Read the Docs <https://pandas-select.readthedocs.io/>`_,
the code on `GitHub <https://github.com/jeffzi/pandas-select>`_,
and the latest release on `PyPI <https://pypi.org/project/pandas-select/>`_.
It is tested on Python 3.6+.
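
The "vanilla" expressions quoted under Design goals refer to DataFrames (``df``, ``df_mi``) that are never defined in this README. As a rough, runnable illustration, the sketch below builds two small hypothetical DataFrames (the column names and values are assumptions made up for this example) and evaluates those vanilla expressions using only pandas. It does not exercise ``pandas-select`` itself; it only shows the boilerplate that the selectors are meant to replace.

```python
import pandas as pd

# Hypothetical toy data; column names echo the README snippets.
df = pd.DataFrame(
    {
        "Name": ["Bulbasaur", "Charmander", "Mewtwo"],
        "Type 1": ["Grass", "Fire", "Psychic"],
        "Type 2": ["Poison", None, None],
        "HP": [45, 39, 106],
        "Legendary": [False, False, True],
    }
)

# Vanilla expression shown next to df[AllNumeric()]:
# it yields the labels of the numeric columns (bool columns are not "number").
numeric_cols = df.select_dtypes("number").columns

# Vanilla expression shown next to df[StartsWith("Type") | "Legendary"]:
# build a boolean mask over the column labels and combine the parts with |.
mask = df.columns.str.startswith("Type") | (df.columns == "Legendary")
type_or_legendary = df.loc[:, mask]

# Hypothetical frame with a two-level row index, standing in for df_mi.
df_mi = pd.DataFrame(
    {"Score": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [("A", "Jeff Smith"), ("A", "Ann Lee"), ("B", "Jeffrey Wu")],
        names=["Team", "Name"],
    ),
)

# Vanilla expression shown next to Contains("Jeff", axis="index", level="Name"):
# keep the rows whose "Name" level contains the substring "Jeff".
jeff_rows = df_mi.loc[df_mi.index.get_level_values("Name").str.contains("Jeff")]

print(list(numeric_cols))
print(list(type_or_legendary.columns))
print(jeff_rows)
```

Note that the column mask preserves the original column order, which is one of the design goals listed above.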