https://github.com/emmt/databags.jl
Data bags combine the flexibility of dictionaries to store data and the `obj.key` syntax to mimic dynamic structures.
- Host: GitHub
- URL: https://github.com/emmt/databags.jl
- Owner: emmt
- License: other
- Created: 2019-10-28T08:33:12.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2024-05-02T17:00:35.000Z (9 months ago)
- Last Synced: 2024-11-14T17:06:28.703Z (2 months ago)
- Topics: data-structures, julia
- Language: Julia
- Homepage:
- Size: 29.3 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# Flexible data containers for Julia
[![License][license-img]][license-url]
[![Build Status][github-ci-img]][github-ci-url]
[![Build Status][appveyor-img]][appveyor-url]
[![Coverage][codecov-img]][codecov-url]

`DataBags` is a small [Julia][julia-url] package providing *data-bags* which
are a quick way to store structured data. Data-bags combine properties and
dictionaries to associate keys (preferably symbols or strings) with values (of
any type) in a flexible way. From the user viewpoint, data-bags behave like
dynamic structures whose fields can be modified or created with the syntax of
structured objects, *i.e.* `obj.key`. Fields can also be deleted by calling
`delete!(obj,key)`. As an example:

```julia
using DataBags, Dates
A = DataBag(date = now(), Δx = 0.1, x = -3:0.1:5)
A.Δx # get value of key `Δx`
A.y = sin.(A.x) # creates new key `y`
```

which shows how easy it is to create a data-bag and access its fields.
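
Keys can be removed just as easily. The following is a minimal sketch, not
taken from the package documentation, which assumes that `haskey` works on a
data-bag as it does on any `AbstractDict`:

```julia
using DataBags

A = DataBag(Δx = 0.1)
A.y = 2 * A.Δx       # create a new key `y`
delete!(A, :y)       # remove the key `y` again
haskey(A, :y)        # check whether `y` is still stored in the data-bag
```
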
Data-bags can also be indexed by their keys like dictionaries:

```julia
A[:Δx] # is the same as `A.Δx`
A[:y] = cos.(A[:x]) # is the same as `A.y = cos.(A.x)`
```

but this is, in my opinion, less readable and more tedious to type, especially
in an interactive session. More generally, data-bag types are sub-types of
`AbstractDict` so you can expect that data-bags can be used like dictionaries.
For instance, you can apply `pop!`, `merge`, `merge!`, `delete!`, *etc.* on a
data-bag.

Admittedly, data-bags are less efficient than true Julia structures (there is
some overhead for retrieving a field of a data-bag) but they can be very
handy in interactive sessions or when designing new code: when the exact
contents of your data structures are not yet determined, data-bags let you
extend their contents without the pain of redefining your structures,
re-including your code and recreating your objects, *etc.* Tools such as
[`Revise`][revise-url] can help but cannot automatically determine what to do
with new members of existing objects if their type definition has changed.

## Creation of data-bags
Data-bags are created by calling the `DataBag(...)` constructor. The initial
contents of data-bags can be specified by keywords, by key-value pairs, or as a
dictionary (`AbstractDict`). To avoid ambiguities, these different styles
cannot be mixed. Below are a few examples:

```julia
using DataBags
A = DataBag( units = "µm", Δx = 0.1, Δy = 0.2)
B = DataBag(:units => "µm", :Δx => 0.1, :Δy => 0.2)
C = DataBag("units" => "µm", "Δx" => 0.1, "Δy" => 0.2)
D = DataBag(1 => 0.9, 2 => sqrt(2), 3 => 4)
```

These statements yield two data-bags, `A` and `B`, with symbolic keys (of type
`Symbol`), a data-bag, `C`, with textual keys (of type `String`) and a
data-bag, `D`, with integer keys (of type `Int`). All these data-bags can
store values of `Any` type.

Accessing a value is possible via the syntax `obj[key]` or, for symbolic and
textual keys, via the syntax `obj.key`. Accessing values via the syntax
`obj.key` is faster for symbolic keys than for textual keys (because, for
textual keys, it involves converting a symbol into a string).

Data-bag constructors attempt to favor symbolic or string keys (to exploit the
`obj.key` syntax) and enforce unspecific values of `Any` type (for
flexibility). In order to override these rules, the parametric versions
`DataBag{K}` or `DataBag{K,V}` of the constructor, with `K` the key type and
`V` the value type, can be called instead. For example:

```julia
E = DataBag{Integer}(1 => 0.9, 2 => sqrt(2), 3 => 4)
F = DataBag{Integer,Real}(1 => 0.9, 2 => sqrt(2), 3 => 4)
```

yield two data-bags, `E` and `F`, both with integer keys (of any `Integer`
type); the values of `E` are unspecific while the values of `F` are restricted
to be `Real`.

The same rules apply if the data-bag is built out of an existing dictionary
(remember that data-bags are themselves abstract dictionaries). So
`DataBag(F)` yields a data-bag with keys of the same type as those of `F` (that
is `Integer` in that case) but values of `Any` type.

When a data-bag is built out of an existing dictionary, the data-bag creates a
new dictionary to store its values and initializes it with the contents of the
dictionary passed as argument. After the creation of the data-bag, the
data-bag and the original dictionary are independent. Their values, which may
be references to other objects, may not be independent though. If you want to
make a data-bag that stores its contents in a given dictionary, say `dict`,
call:

```julia
wrap(DataBag, dict)
```

instead of:

```julia
DataBag(dict)
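
# Hedged sketch of the difference (not from the original README; assumes
# `using DataBags` as in the examples above and the semantics described in
# the surrounding text).
dict = Dict{Symbol,Any}(:a => 1)
shared = DataBags.wrap(DataBag, dict) # stores its contents in `dict` itself
copied = DataBag(dict)                # stores its contents in a new dictionary
shared.b = 2                          # per the text, also visible as `dict[:b]`
copied.c = 3                          # per the text, `dict` gets no key `:c`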
```

If neither arguments nor keywords are specified, the data-bag created by
`DataBag()` is initially empty and has symbolic keys with any type of values,
*i.e.* an instance of `Dict{Symbol,Any}` is used for storing the key-value
pairs.

Unless `iterate` is overridden, iterating on an `AbstractDataBag` is iterating
on its key-value pairs.

Calling the `contents` method on an `AbstractDataBag` yields the internal
object, an `AbstractDict`, used to store the data of the data-bag.

## Defining custom data-bag types
The `DataBags` package provides simple means to facilitate creating new
sub-types of `DataBags.AbstractDataBag` so as to benefit from the common
interface implemented for data-bags. The following steps are needed:

1. Make your type inherit from `DataBags.AbstractDataBag{K,V,D}` with `K` the
   key type, `V` the value type and `D<:AbstractDict{K,V}` the type of the
   dictionary storing the key-value pairs.

2. Extend the `DataBags.contents(A::T)` method for your custom type `T` so that
   it returns the dictionary storing the key-value pairs in an instance `A`.

3. Optionally provide some constructor(s) to facilitate creation of objects of
   type `T`. You may also consider extending the `DataBags.wrap` method.

Here is a first example:

```julia
using DataBags

# Define a concrete sub-type of `DataBags.AbstractDataBag`.
struct BagEx1{K,V,D<:AbstractDict{K,V}} <: DataBags.AbstractDataBag{K,V,D}
    data::D # object used to store key-value pairs
    # ...   # another member
    # ...   # yet another member
    # ...   # etc.
end

# Override `DataBags.contents` to yield the dictionary that stores the data.
DataBags.contents(A::BagEx1) = Base.getfield(A, :data)
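
# Hedged illustration in plain Julia (hypothetical `MiniBag` type, not the
# package's implementation): once `getproperty` is overloaded, `obj.key` is
# looked up in the dictionary, so real fields must be read with `getfield`.
struct MiniBag
    data::Dict{Symbol,Any}
end
Base.getproperty(A::MiniBag, key::Symbol) = getfield(A, :data)[key]
Base.setproperty!(A::MiniBag, key::Symbol, val) =
    (getfield(A, :data)[key] = val)
bag = MiniBag(Dict{Symbol,Any}())
bag.x = 1            # stored under the key `:x` of the dictionary
bag.x                # retrieved from the dictionary, not from a field
getfield(bag, :data) # the only way to reach the field `data`; `bag.data`
                     # would be looked up as the key `:data` (a `KeyError`)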
```

Note that `Base.getfield` has to be used to retrieve a member of objects whose
type is derived from `DataBags.AbstractDataBag` as for the member `data` of the
object `A` in the above example. This is because the `getproperty` and
`setproperty!` methods are overridden to implement the `obj.key` syntax for
sub-types of `DataBags.AbstractDataBag`.

In the above example, it is only possible to create a data-bag of type
`BagEx1` out of a dictionary which is shared by the data-bag. The only
advantage over a simple dictionary is the `obj.key` syntax provided keys have
type `Symbol` or `String`.

To improve over this first example, we want to implement the same kind of
creation rules as `DataBag`. This leads to the following code:

```julia
using DataBags

# Define a concrete sub-type of `DataBags.AbstractDataBag`.
struct BagEx2{K,V,D<:AbstractDict{K,V}} <: DataBags.AbstractDataBag{K,V,D}
    data::D # object used to store key-value pairs
    # ...   # another member
    # ...   # yet another member
    # ...   # etc.
    # Explicitly define inner constructor to avoid outer constructor
    # automatically created by Julia.
    BagEx2{K,V,D}(data::D) where {K,V,D<:AbstractDict{K,V}} = new{K,V,D}(data)
end

# Outer constructor.
BagEx2(args...; kwds...) =
    DataBags.wrap(BagEx2, DataBags.contents(Dict{Any,Any}, args...; kwds...))

# Override `DataBags.contents` to yield the dictionary that stores the data.
DataBags.contents(A::BagEx2) = Base.getfield(A, :data)

# Override `DataBags.wrap` to create an instance of `BagEx2` that stores
# its data in a given dictionary.
DataBags.wrap(::Type{BagEx2}, data::D) where {K,V,D<:AbstractDict{K,V}} =
    BagEx2{K,V,D}(data)
```

In this second example, we have:

* Explicitly defined an inner constructor so as to forbid creating a data-bag
  that shares an existing dictionary, say `dict`, by calling the constructor
  `BagEx2`. This is however possible by calling `wrap(BagEx2,dict)`.

* Defined an outer constructor that calls the `wrap` method over the dictionary
  created by the `DataBags.contents` method called with `Dict{K,V}` (here
  `Dict{Any,Any}`) as a first argument, followed by all arguments and keywords
  passed to your constructor.

* Overridden methods `DataBags.contents` (as in the first example) and
  `DataBags.wrap`. The latter is to wrap a dictionary in a new `BagEx2`
  instance taking care of supplying the correct type parameters `{K,V,D}`.

To add constructors with constraints on the type of keys and values, you may
have a look at the complete implementation of the `DataBag` type which is
summarized below:

```julia
struct DataBag{K,V,D<:AbstractDict{K,V}} <: AbstractDataBag{K,V,D}
    data::D # dictionary storing the data of the data-bag
    # Provide inner constructor to let outer constructors deal with type
    # parameters.
    DataBag{K,V,D}(data::D) where {K,V,D<:AbstractDict{K,V}} =
        new{K,V,D}(data)
end

# Outer constructors.
DataBag(args...; kwds...) =
    wrap(DataBag, contents(Dict{Any,Any}, args...; kwds...))
DataBag{K}(args...; kwds...) where {K} =
    wrap(DataBag, contents(Dict{K,Any}, args...; kwds...))
DataBag{K,V}(args...; kwds...) where {K,V} =
    wrap(DataBag, contents(Dict{K,V}, args...; kwds...))

# Extend the `contents` method to benefit from the API of `AbstractDataBag`.
@inline contents(A::DataBag) = Base.getfield(A, :data)

# Extend the `wrap` method to create instances of `DataBag`.
wrap(::Type{DataBag}, data::D) where {K,V,D<:AbstractDict{K,V}} =
    DataBag{K,V,D}(data)
wrap(::Type{DataBag{K}}, data::D) where {K,V,D<:AbstractDict{K,V}} =
    wrap(DataBag, data)
wrap(::Type{DataBag{K,V}}, data::D) where {K,V,D<:AbstractDict{K,V}} =
    wrap(DataBag, data)
wrap(::Type{DataBag{K,V,D}}, data::D) where {K,V,D<:AbstractDict{K,V}} =
    wrap(DataBag, data)
```

## A useful minimalist example
The `DataBag` type provided by `DataBags` may be sufficient for your needs but
you may want to specialize it a bit to exploit the power of *type dispatching*
in Julia and to implement some specific behavior. The simplest example of
creating such a sub-type takes about half a dozen lines of code:

```julia
using DataBags
struct BagEx3 <: DataBags.AbstractDataBag{Symbol,Any,Dict{Symbol,Any}}
    data::Dict{Symbol,Any}
    BagEx3(args...; kwds...) =
        new(DataBags.contents(Dict{Symbol,Any}, args...; kwds...))
end
DataBags.contents(A::BagEx3) = Base.getfield(A, :data)
```

*Et voilà!* That is all you need to create a new type, `BagEx3`, whose
instances behave like a dictionary with symbolic keys and any type of values,
implement the `obj.key` syntax to get/set the value of `key` (as a shortcut of
`obj[:key]`) and which can be constructed using keywords, *e.g.* `obj =
BagEx3(id=1, x=-3.14:0.1:3.14, units="µm")`.

This usage is so common that a macro is provided by the `DataBags` package and
the above statements can be reduced to:

```julia
using DataBags
DataBags.@newtype BagEx3
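
# Hedged usage sketch (not from the original README). `describe` is a
# hypothetical function illustrating dispatch on the new type; it assumes
# that `length` works on a data-bag as on any `AbstractDict`.
obj = BagEx3(id = 1, units = "µm")
obj.units                 # shortcut for `obj[:units]`
describe(bag::BagEx3) = "BagEx3 with $(length(bag)) entries"
describe(obj)             # calls the method specialized for `BagEx3`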
```

Using the macro not only saves typing (to encourage creating such data-bag
types) but also ensures that the implementation is correct and follows further
evolutions of the `DataBags` package.

[doc-dev-img]: https://img.shields.io/badge/docs-dev-blue.svg
[doc-dev-url]: https://emmt.github.io/DataBags.jl/dev

[license-url]: ./LICENSE.md
[license-img]: http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat

[github-ci-img]: https://github.com/emmt/DataBags.jl/actions/workflows/CI.yml/badge.svg?branch=master
[github-ci-url]: https://github.com/emmt/DataBags.jl/actions/workflows/CI.yml?query=branch%3Amaster

[appveyor-img]: https://ci.appveyor.com/api/projects/status/github/emmt/DataBags.jl?branch=master
[appveyor-url]: https://ci.appveyor.com/project/emmt/DataBags-jl/branch/master

[codecov-img]: https://codecov.io/gh/emmt/DataBags.jl/branch/master/graph/badge.svg
[codecov-url]: https://codecov.io/gh/emmt/DataBags.jl

[julia-url]: https://julialang.org/
[revise-url]: https://github.com/timholy/Revise.jl