https://github.com/machow/databackend
- Host: GitHub
- URL: https://github.com/machow/databackend
- Owner: machow
- License: mit
- Created: 2022-08-08T01:41:03.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-23T19:26:48.000Z (8 months ago)
- Last Synced: 2024-04-23T20:31:08.803Z (8 months ago)
- Language: Python
- Size: 38.1 KB
- Stars: 5
- Watchers: 3
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# databackend
The `databackend` package allows you to register a subclass, without
needing to import the subclass itself. This is useful for implementing
actions over optional dependencies.

## Example

For this example, we’ll implement a function, `fill_na()`, that fills in
missing values in a DataFrame. It works with DataFrame objects from two
popular libraries: `pandas` and `polars`. Importantly, neither library
needs to be installed.

### Setup

The code below defines “abstract” parent classes for each of the
DataFrame classes in the two libraries.

``` python
from databackend import AbstractBackend

class AbstractPandasFrame(AbstractBackend):
    _backends = [("pandas", "DataFrame")]

class AbstractPolarsFrame(AbstractBackend):
    _backends = [("polars", "DataFrame")]
```

Note that the abstract classes can be used as stand-ins for the real
thing in `issubclass()` and `isinstance()`.

``` python
from pandas import DataFrame

issubclass(DataFrame, AbstractPandasFrame)
isinstance(DataFrame(), AbstractPandasFrame)
```

    True

> 📝 Note that you can use
> `AbstractPandasFrame.register_backend("pandas", "DataFrame")`, as an
> alternative way to register backends.

### Simple fill_na: isinstance to switch behavior

The `fill_na()` function below uses custom handling for pandas and
polars.

``` python
def fill_na(data, x):
    if isinstance(data, AbstractPolarsFrame):
        # polars stores missing values (None) as null, which fill_null()
        # handles; fill_nan() only targets floating-point NaN.
        return data.fill_null(x)
    elif isinstance(data, AbstractPandasFrame):
        return data.fillna(x)
    else:
        raise NotImplementedError()
```

Notice that neither `pandas` nor `polars` needs to be imported when
defining `fill_na()`.

Here is an example of calling `fill_na()` on both kinds of DataFrames.

``` python
# test polars ----
import polars as pl

df = pl.DataFrame({"x": [1, 2, None]})
fill_na(df, 3)

# test pandas ----
import pandas as pd

df = pd.DataFrame({"x": [1, 2, None]})
fill_na(df, 3)
```

         x
    0  1.0
    1  2.0
    2  3.0

The key here is that a user could have only pandas, or only polars,
installed. Importantly, performing the `isinstance` checks does not
import any libraries!

### Advanced fill_na: generic function dispatch

`databackend` shines when combined with [generic function
dispatch](https://mchow.com/posts/2020-02-24-single-dispatch-data-science/).
This is a programming approach where you declare a function
(e.g. `fill_na()`), and then register each backend-specific
implementation on the function.

Python has a built-in function implementing this called
[`functools.singledispatch`](https://docs.python.org/3/library/functools.html#functools.singledispatch).

Here is an example of the previous `fill_na()` function written using
it.

``` python
from functools import singledispatch

@singledispatch
def fill_na2(data, x):
    raise NotImplementedError(f"No support for class: {type(data)}")

# handle polars ----
@fill_na2.register
def _(data: AbstractPolarsFrame, x):
    # polars stores missing values (None) as null, which fill_null()
    # handles; fill_nan() only targets floating-point NaN.
    return data.fill_null(x)

# handle pandas ----
@fill_na2.register
def _(data: AbstractPandasFrame, x):
    return data.fillna(x)
```

Note two important decorators:

- `@singledispatch` defines a default function. This gets called if no
  specific implementations are found.
- `@fill_na2.register` defines specific versions of the function.

Here’s an example of it in action.

``` python
# example ----
import pandas as pd
import polars as pl

df = pl.DataFrame({"x": [1, 2, None]})
fill_na2(df, 3)

df = pd.DataFrame({"x": [1, 2, None]})
fill_na2(df, 3)
```

         x
    0  1.0
    1  2.0
    2  3.0

### How it works

Under the hood, `AbstractBackend` behaves similarly to Python’s built-in
[`abc.ABC` class](https://docs.python.org/3/library/abc.html#abc.ABC).

``` python
from abc import ABC

class MyABC(ABC):
    pass

from io import StringIO

MyABC.register(StringIO)

# StringIO is a "virtual subclass" of MyABC
isinstance(StringIO("abc"), MyABC)
```

    True

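Virtual subclasses registered on an ABC are also honored by `functools.singledispatch`, which is why dispatching on an abstract stand-in class can work at all. Here is a minimal standard-library sketch of that behavior; `TextSource` and `describe()` are hypothetical names used only for illustration and are not part of `databackend`.

```python
from abc import ABC
from functools import singledispatch
from io import StringIO


class TextSource(ABC):
    """Hypothetical ABC, used only for this illustration."""


# StringIO becomes a virtual subclass; it does not inherit from TextSource.
TextSource.register(StringIO)


@singledispatch
def describe(obj):
    # Fallback when no registered implementation matches.
    return "unknown"


@describe.register
def _(obj: TextSource):
    # Chosen for real and virtual subclasses of TextSource alike.
    return "text source"


print(describe(StringIO("abc")))  # text source
print(describe(42))               # unknown
```
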
The key difference between `AbstractBackend` and `abc.ABC` is that you
can specify the virtual subclass by name, using a tuple of the form
`("<module name>", "<class name>")`, such as `("pandas", "DataFrame")`.

When `issubclass(SomeClass, AbstractBackend)` runs, then…

- The standard ABC caching mechanism is checked, and potentially
  returns the answer immediately.
- Otherwise, a subclass hook cycles through registered backends.
- The hook runs the subclass check for any backends that are imported
  (i.e. are in `sys.modules`).

Technically, `AbstractBackend` inherits all the useful metaclass things
from `abc.ABCMeta`, so those features can be used as well.
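
The lookup steps listed above can be approximated with the standard library alone. The sketch below is an illustration under stated assumptions, not `databackend`'s actual source: `LazyBackend`, `AbstractFraction`, and the module name `not_a_real_module` are hypothetical, and the stdlib `fractions` module stands in for an optional dependency.

```python
import sys
from abc import ABCMeta


class LazyBackend(metaclass=ABCMeta):
    """Hypothetical stand-in for AbstractBackend (illustration only)."""

    # (module name, class name) pairs, resolved lazily by name.
    _backends = []

    @classmethod
    def __subclasshook__(cls, subclass):
        for mod_name, cls_name in cls._backends:
            # Look only at modules that are already imported;
            # sys.modules.get() never triggers an import.
            module = sys.modules.get(mod_name)
            if module is None:
                continue
            target = getattr(module, cls_name, None)
            if isinstance(target, type) and issubclass(subclass, target):
                return True
        # Defer to the regular ABC machinery (MRO, registry, cache).
        return NotImplemented


class AbstractFraction(LazyBackend):
    _backends = [("fractions", "Fraction"), ("not_a_real_module", "Thing")]


from fractions import Fraction

print(isinstance(Fraction(1, 2), AbstractFraction))  # True
print(isinstance(0.5, AbstractFraction))             # False
print("not_a_real_module" in sys.modules)            # False
```

Returning `NotImplemented` from the hook hands the decision back to `abc.ABCMeta`, so explicit `register()` calls and ordinary inheritance keep working alongside the name-based lookup.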