https://github.com/i2mint/py2json
Tools for json serialization of python objects
json python serialization
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/i2mint/py2json
- Owner: i2mint
- License: mit
- Created: 2020-07-17T15:43:20.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-08-25T16:47:18.000Z (almost 3 years ago)
- Last Synced: 2025-06-05T05:26:17.019Z (about 1 year ago)
- Topics: json, python, serialization
- Language: Python
- Homepage: https://i2mint.github.io/py2json/
- Size: 2.99 MB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# py2json
A small toolkit to help serialize Python objects and callable references into JSON-friendly representations and reconstruct them. It provides:
- `Ctor` — deconstruct/construct objects via a CONSTRUCTOR/ARGS/KWARGS dict representation.
- `fakit` — a lightweight mini-language to express function calls (f, a, k) and execute them.
- dotpath helpers — `obj_to_dotpath` and `dotpath_to_obj` for resolving dotted references.
- helpers to extract function metadata (`obj2dict.func_info_dict`) and a JSON encoder that
understands numpy and bytes.
# Quick examples
`py2json` provides a small `JsonCodec` and `make_json_codec` factory that wire
`Ctor` and `fakit` into a compact encode/decode interface. Instead of forcing
you to pick from fixed protocol names, the factory accepts a `path_parser`
argument which is either a callable `Callable[[str], str]` or a string used as
the separator between module and object parts (default is '.').
This lets you support styles such as:
- dotted: `package.module.attr` (default, `path_parser='.'`)
- colon: `package.module:Class.attr` (`path_parser=':'`)
The codec normalizes path strings via the `path_parser` and evaluates `$fak`
expressions via `refakit` (with an injectable `func_loader` for whitelisting).
Example (colon separator):
```py
from py2json import make_json_codec
codec = make_json_codec(path_parser=':')
encoded = codec.encode('collections.namedtuple:MyTuple')
decoded = codec.decode(encoded)
```
## A slightly deeper look
The `JsonCodec` instance that `make_json_codec` returns uses `Ctor` and `fakit`.
Let's have a quick peep at those.
Create a `namedtuple` using `Ctor` and instantiate it:
```py
from py2json.ctor import Ctor
from collections import namedtuple
ctor_jdict = Ctor.to_ctor_dict(namedtuple, args=('A', 'x y z'))
A = Ctor.construct(ctor_jdict)
inst = A('one', 'two', 'three')
```
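To make the constructor/args/kwargs idea concrete, here is a minimal standard-library sketch of the same round trip. It does not use `py2json`: the helper names `to_recipe` and `construct` and the lowercase key names are hypothetical, chosen only for illustration, and `Ctor`'s actual dict layout may differ.

```python
import importlib
import json
from collections import namedtuple


def to_recipe(func, args=(), kwargs=None):
    """Describe a call as a JSON-friendly dict (hypothetical layout)."""
    return {
        'constructor': {'module': func.__module__, 'name': func.__qualname__},
        'args': list(args),
        'kwargs': dict(kwargs or {}),
    }


def construct(recipe):
    """Import the constructor named in the recipe and call it."""
    ctor = recipe['constructor']
    obj = importlib.import_module(ctor['module'])
    for part in ctor['name'].split('.'):
        obj = getattr(obj, part)
    return obj(*recipe['args'], **recipe['kwargs'])


recipe = to_recipe(namedtuple, args=('A', 'x y z'))
json_str = json.dumps(recipe)          # the recipe survives a trip through JSON
A = construct(json.loads(json_str))    # rebuild the namedtuple class
inst = A('one', 'two', 'three')
```

The point of the sketch is that only names and plain values are stored, so the result is valid JSON, while the callable itself is looked up again at construction time.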
Use `fakit` to express and run a call given a dotted path:
```py
from py2json.fakit import fakit
fakit({'f': 'os.path.join', 'a': ['I', 'am', 'a', 'filepath']})
```
Resolve dotted references and round-trip:
```py
from py2json.fakit import obj_to_dotpath, dotpath_to_obj
from inspect import Signature
dot = obj_to_dotpath(Signature.replace)
assert dotpath_to_obj(dot) is Signature.replace
```
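For intuition, dotted-path resolution can be sketched with the standard library alone. The functions `to_dotpath` and `resolve_dotpath` below are hypothetical stand-ins, not the `py2json` implementations, and they assume the object is reachable through its `__module__` and `__qualname__`.

```python
import importlib
from inspect import Signature


def to_dotpath(obj):
    """Build a dotted path from the object's module and qualified name."""
    return f'{obj.__module__}.{obj.__qualname__}'


def resolve_dotpath(dotpath):
    """Import the longest importable module prefix, then walk attributes."""
    parts = dotpath.split('.')
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module('.'.join(parts[:i]))
        except ImportError:
            continue  # prefix is not a module; try a shorter one
        for name in parts[i:]:
            obj = getattr(obj, name)
        return obj
    raise ImportError(f'Cannot resolve {dotpath!r}')


dot = to_dotpath(Signature.replace)
assert resolve_dotpath(dot) is Signature.replace
```

Trying the longest prefix first is one design choice among several; it handles nested attributes such as `Signature.replace` without needing a separate module/object separator.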
Notes
-----
- `Ctor` will serialize callables to a JSON-friendly jdict with keys `{module, name, attr}` and
can `construct` them back into callables or instantiated objects.
- `fakit` accepts either a callable, a dotted string, or a small structure `(f, a, k)` and
uses a configurable `func_loader` to resolve `f`. For security, supply a whitelist `func_loader`.
See `misc/py2json_wip.ipynb` for runnable demos.
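The whitelisting advice can be illustrated without `py2json`. In the sketch below, `ALLOWED`, `whitelist_loader`, and `run_fak` are hypothetical names, and the `'k'` key for keyword arguments is an assumption based on the `(f, a, k)` description above; how `fakit` actually accepts a `func_loader` should be checked against the library itself.

```python
import os.path

# Only these names may be resolved to callables (hypothetical whitelist).
ALLOWED = {
    'os.path.join': os.path.join,
    'len': len,
}


def whitelist_loader(name):
    """Resolve a function name, refusing anything not explicitly allowed."""
    try:
        return ALLOWED[name]
    except KeyError:
        raise ValueError(f'Function not allowed: {name!r}') from None


def run_fak(fak, func_loader=whitelist_loader):
    """Execute a call described by a dict with keys 'f', 'a' and 'k'."""
    func = func_loader(fak['f'])
    return func(*fak.get('a', ()), **fak.get('k', {}))


result = run_fak({'f': 'os.path.join', 'a': ['I', 'am', 'a', 'filepath']})
```

Because the loader is the only place where a string becomes a callable, a request for something like `os.system` fails before anything is executed.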
# Why py2json?
Here we tackle the problem of serializing a Python object into JSON.
JSON is a convenient choice for web request responses or for working with MongoDB, for instance.
It is usually understood that we serialize an object to be able to deserialize it to recover the original object: Implicit in this is some definition of equality, which is not as trivial as it may seem. Usually **some** aspects of the deserialized object will be different, so we need to be clear on what should be the same.
For example, we probably don't care if the address of the deserialized object is different. But we probably care that its key attributes are the same.
What should guide us in deciding what aspects of an object should be recovered?
Behavior.
The only value of an object is the behavior that will ensue. This may be the behavior of all or some of the methods of a serialized instance, or the behavior of some other functions that will depend on the deserialized object.
Our approach to converting a Python object to JSON will touch on some i2i cornerstones that are more general: conversion and contextualization.
## Behavior equivalence: What do we need an object to have?
Say we are given the code below.
```python
def func(obj):
    return obj.a + obj.b


class A:
    e = 2

    def __init__(self, a=0, b=0, c=1, d=10):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e

    def other_method(self, x=1):
        return self.c * x
```
We use it to make the following object:
```python
obj = A(a=2, b=3)
```
Say we want to json-serialize this so that a deserialized object `dobj` is such that for all valid `obj`, resulting `dobj`, and valid `x` input:
```
obj.target_func(x) == A.target_func(obj, x) == A.target_func(dobj, x)
```
The first equality is just a reminder of a python equivalence.
The second equality is really what we're after.
When this is true, we'll say that `obj` and `dobj` are equivalent on `A.target_func` -- or just "equivalent" when the function(s) they should be equivalent on are clear.
To satisfy this equality we need `dobj` to:
- Contain all the attributes it needs to be able to compute the `A.target_func` function -- which means all the expressions contained in that function or, recursively, any functions it calls.
- Have values for each of those attributes that are equivalent between `obj` and `dobj` (over the functions in the call tree of the target function that involve these attributes).
Let's have a manual look at it.
First, you need to compute `func(self)`, which will require the attributes `a` and `b`.
Secondly, you'll need to compute `other_method`, which uses attribute `c`.
Finally, the last expression, `x * tt / self.e` uses the attribute `e`.
So we need to make sure we serialize the attributes `{'a', 'b', 'c', 'e'}`.
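As a check on that conclusion, here is a minimal hand-written sketch that serializes only those four attributes and verifies the equivalence on `A.target_func`. The names `NEEDED`, `to_jdict`, and `from_jdict` are hypothetical; the sketch uses the standard `json` module, not the `py2json` API.

```python
import json


def func(obj):
    return obj.a + obj.b


class A:
    e = 2

    def __init__(self, a=0, b=0, c=1, d=10):
        self.a, self.b, self.c, self.d = a, b, c, d

    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e

    def other_method(self, x=1):
        return self.c * x


NEEDED = ('a', 'b', 'c', 'e')  # attributes identified by the manual analysis


def to_jdict(obj):
    """Keep only the attributes that target_func depends on."""
    return {name: getattr(obj, name) for name in NEEDED}


def from_jdict(jdict):
    """Rebuild an instance without calling __init__ ('d' is not restored)."""
    dobj = A.__new__(A)
    for name, value in jdict.items():
        setattr(dobj, name, value)
    return dobj


obj = A(a=2, b=3)
dobj = from_jdict(json.loads(json.dumps(to_jdict(obj))))
assert obj.target_func(4) == A.target_func(dobj, 4)
```

Note that `dobj` has no `d` attribute at all, yet it is still equivalent to `obj` on `A.target_func`, which is exactly the notion of equality described above.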
That wasn't too hard. But it could get convoluted. Either way, we really should use computers for such boring tasks!
That's something `py2json` would like to help you with.
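One way such an analysis could be automated is with the standard `ast` module. The sketch below is an illustration under simplifying assumptions, not a description of how `py2json` works: `needed_attrs` is a hypothetical helper that only follows calls to methods of the same class and to module-level functions that receive the instance as their first positional argument, and it does not distinguish attribute reads from writes.

```python
import ast

SOURCE = '''
def func(obj):
    return obj.a + obj.b

class A:
    e = 2

    def __init__(self, a=0, b=0, c=1, d=10):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

    def target_func(self, x=3):
        t = func(self)
        tt = self.other_method(t)
        return x * tt / self.e

    def other_method(self, x=1):
        return self.c * x
'''


def needed_attrs(source, cls_name, method_name):
    """Collect instance attributes used in the call tree of one method."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    cls = next(
        n for n in tree.body
        if isinstance(n, ast.ClassDef) and n.name == cls_name
    )
    methods = {n.name: n for n in cls.body if isinstance(n, ast.FunctionDef)}

    attrs, seen, todo = set(), set(), [methods[method_name]]
    while todo:
        fn = todo.pop()
        if fn in seen:
            continue
        seen.add(fn)
        receiver = fn.args.args[0].arg  # name bound to the instance
        for node in ast.walk(fn):
            if (isinstance(node, ast.Attribute)
                    and isinstance(node.value, ast.Name)
                    and node.value.id == receiver):
                if node.attr in methods:
                    todo.append(methods[node.attr])  # follow the method
                else:
                    attrs.add(node.attr)             # a data attribute
            elif (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in funcs
                    and node.args
                    and isinstance(node.args[0], ast.Name)
                    and node.args[0].id == receiver):
                todo.append(funcs[node.func.id])     # follow func(self)
    return attrs


print(sorted(needed_attrs(SOURCE, 'A', 'target_func')))  # ['a', 'b', 'c', 'e']
```

The result matches the manual analysis: `d` is left out because nothing in the call tree of `target_func` touches it.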