https://github.com/ont-app/igraph

Defines a protocol for viewing data as Subject-Predicate-Object graphs, with a simple typedef implementing same
Host: GitHub
URL: https://github.com/ont-app/igraph
Owner: ont-app
License: epl-1.0
Created: 2018-12-05T23:46:26.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2025-05-06T22:55:31.000Z (7 months ago)
Last Synced: 2025-08-01T00:31:10.685Z (4 months ago)
Language: Clojure
Homepage:
Size: 478 KB
Stars: 97
Watchers: 4
Forks: 2
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project

awesome-semantic-web - igraph - IGraph defines a protocol which aims to provide a general interface to a variety of graph-based representations. (Programming / Clojure)
README

# ont-app/igraph

IGraph defines a protocol which aims to provide a general interface to
a variety of graph-based representations (RDF, datascript, datomic,
loom, ...).

It also defines a `Graph` datatype which implements `IGraph`.

There is a [15-minute video introduction here](https://www.youtube.com/watch?v=BlH__4iNHZE&feature=youtu.be).

## Contents

- [Installation](#h2-dependencies)

- [Motivation](#h2-motivation)

- [The IGraph protocol](#h2-igraph-protocol)

  - [Methods summary](#h3-methods-summary)

  - [Member access](#Member_access)

    - [Normal form](#Normal_form)

    - [Tractability](#h4-tractability)

    - [`subjects`](#subjects_method)

    - [`get-p-o`](#get-p-o_method)

    - [`get-o`](#get-o_method)

    - [`ask`](#ask_method)

    - [`query`](#query_method)

    - [`invoke` for arities 0-3](#invoke_method)

  - [Content Manipulation](#Content_Manipulation)

    - [`mutability`](#mutability_method)

    - [The `add-to-graph` multimethod](#add-to-graph)

    - [The `remove-from-graph` multimethod](#remove-from-graph)

- [The IGraphImmutable protocol](#IGraphImmutable)

    - [`add`](#add_method)

    - [`subtract`](#subtract_method)

- [The IGraphMutable protocol](#IGraphMutable)

    - [`add!`](#add!_method)

    - [`subtract!`](#subtract!_method)

- [The IGraphAccumulateOnly protocol](#IGraphAccumulateOnly)

    - [`claim`](#claim_method)

    - [`retract`](#retract_method)

- [The IGraphSet protocol](#h2-igraphset-protocol)

  - [Methods summary](#h3-igraphset-methods-summary)

  - [`union`](#union_method)

  - [`intersection`](#intersection_method)

  - [`difference`](#difference_method)

- [Traversal](#Traversal)

  - [The `traverse` function](#traverse_function)

  - [Traversal functions](#Traversal_functions)

    - [Context](#h4-context)

    - [The queue](#queue)

  - [Traversal utilities](#Traversal_utilities)

    - [`transitive-closure`](#h4-transitive-closure)

    - [`traverse-link`](#h4-traverse-link)

    - [`maybe-traverse-link`](#h4-maybe-traverse-link)

    - [`traverse-or`](#h4-traverse-or)

  - [Traversal composition with `t-comp`](#Traversal_composition)

    - [short form](#h4-t-comp-short)

    - [long form](#h4-t-comp-long)

  - [Using traversal functions as a `p` argument to `invoke`](#traversal-fn-as-p)

- [Cardinality-1 utilities](#cardinality-1_utilities)

  - [`unique`](#h3-unique)

  - [`flatten-description`](#h3-flatten-description)

  - [`normalize-flat-description`](#h3-normalize-flat-description)

  - [`assert-unique`](#h3-assert-unique)

- [I/O](#i-o)

  - [`write-to-file`](#h3-write-to-file)

  - [`read-from-file`](#h3-read-from-file)

- [Other utilities](#Other_utilities)

  - [`reduce-spo`](#h3-reduce-spo)

- [Implementations](#h2-implementations)

  - [`ont-app.igraph.graph/Graph`](#Graph)

    - [Graph creation](#h4-graph-creation)

    - [Querying](#h4-querying)

  - [sparql-client](#h3-sparql-client)

  - [igraph-jena](#h3-igraph-jena)

  - [datascript-graph](#h3-datascript-graph)

  - [datomic-client](#h3-datomic-client)

- [Testing support](#testing-support)

- [Developer Notes](#developer-notes)

- [Future Work](#h2-future-work)

- [Acknowledgements](#h2-acknowledgements)

- [License](#h2-license)

---



## Installation

This is deployed to [clojars](https://clojars.org/ont-app/igraph):

[![Clojars Project](https://img.shields.io/clojars/v/ont-app/igraph.svg)](https://clojars.org/ont-app/igraph)

Require thus:

```

(:require
  [ont-app.igraph.core :as igraph] ;; for the IGraph protocol and related stuff
  [some.igraph.implementation ...] ;; implements IGraph
  )

```



## Motivation

One of the defining characteristics of Clojure is that it revolves

around a minimal set of basic data structures.

I think it can be argued that the collection primitives in Clojure can

be approximately ordered by their degree of expressiveness:

- seq - first/rest

- < set - membership

- < vector - indexable members

- < map - a collection of associations

The conceit of IGraph is that there is room for a new collection

primitive with one higher level of expressiveness:

- < graph - a collection of named relationships between named entities

This is informed to a large degree by the

[RDF](https://www.wikidata.org/wiki/Q54872) model, and aims to align

with [linked data](https://www.wikidata.org/wiki/Q515701) encoded in

RDF, while keeping direct dependencies to the RDF stack to a minimum.



## The IGraph protocol

This protocol defines the basic operations over a graph conceived of

as a set of triples S-P-O, where subject `S` and object `O` typically

name entities, and property `P` is a named relation that applies

between those entities.

This is directly inspired by the RDF model, but it relaxes quite a bit
RDF's requirements that identifiers adhere strictly to the
specifications for URIs, and that literal values be restricted to a
small set of scalars.



### Methods summary

The `IGraph` protocol specifies the following methods:

#### Member access

- `(normal-form g)` -> `{s {p #{o...}...}...}`

- `(subjects g)` -> `(s ...)`, a lazy sequence

- `(get-p-o g s)` -> `{p #{o...} ...}`

- `(get-o g s p)` -> `#{o ...}`

- `(ask g s p o)` -> truthy

- `(query g q)` -> implementation-dependent query results

#### Content manipulation

- `(mutability g)` -> One of `#{::read-only ::immutable ::mutable ::accumulate-only}`

#### `invoke` to support `IFn`

- `(g)` = `(normal-form g)`

- `(g s)` -> `{p #{o...} ...}`

- `(g s p)` -> `#{o ...}`

- `(g s p o)` -> truthy



### Member access



#### Normal form

Any implementation of this protocol, regardless of its _native
representation_, must be expressible in IGraph's `Normal Form`.

As an example, let's start with a graph called 'eg' with four triples:

```

> (igraph/normal-form eg)

{:john 

  {:isa #{:person}, 

   :likes #{:beef}},

 :mary 

 {:isa #{:person}, 

  :likes #{:chicken}}}

>

```

These are facts about two subjects, :john and :mary with two facts

each.

John is a person who likes beef.

Mary is also a person, and likes chicken.

The normal form has three tiers. The "s-level" is a map from each

subject to a "p-level" `description` of that subject.  The normal form

for descriptions is a map from a property identifier to an "o-level"

set of objects for said subject and property.

What I'm aiming for here is a form that

- is extremely regular and simple

- lends itself to expressing and thinking about basic set operations
  on graphs.
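Since normal form is nothing more than nested maps and sets, its three tiers can be explored with core functions alone. Here is a minimal sketch that uses a plain map literal in place of an IGraph instance; the names `eg-normal-form` and `eg-triples` are made up for this illustration and are not part of the library:

```clojure
;; Plain-Clojure stand-in for the normal form of `eg` (no IGraph library needed).
(def eg-normal-form
  {:john {:isa #{:person}, :likes #{:beef}}
   :mary {:isa #{:person}, :likes #{:chicken}}})

;; s-level: the subjects
(set (keys eg-normal-form))

;; p-level: the description of one subject
(get eg-normal-form :john)

;; o-level: the set of objects for a subject and property
(get-in eg-normal-form [:john :likes])

;; membership test for a single triple
(contains? (get-in eg-normal-form [:mary :likes] #{}) :chicken)

;; flattening the three tiers back into [s p o] vectors
(def eg-triples
  (for [[s po] eg-normal-form
        [p os] po
        o os]
    [s p o]))
```

The `for` comprehension walks the s-, p- and o-levels in turn, which is why the regularity of the form matters: every graph flattens the same way.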

##### A note on the keyword identifiers used in these examples

To keep things simple and readable, none of the keywords used in these

examples are [namespaced](https://blog.jeaye.com/2017/10/31/clojure-keywords/#namespaced-keywords). 

In practice you will probably want to use namespaced keywords, and

some implementations of IGraph, e.g. those that interact directly with

RDF-based representations, will expect them.



#### Tractability

It is expected that while many implementations of IGraph will be

in-memory data structures of modest size, others might be huge

knowledge bases provided on a server somewhere

([Wikidata](https://www.wikidata.org/wiki/Q2013), for example). In the

latter case it is always acceptable to throw an `ex-info` of type
`::igraph/Intractable` for any method that warrants it:

```

(throw (ex-info "Normal form for Wikidata is intractable" 

  {:type ::igraph/Intractable}))

```
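On the calling side, such an exception can be recognized by the `:type` entry in its `ex-data`. The following is only a sketch for JVM Clojure: `::Intractable` is a keyword local to the example standing in for `::igraph/Intractable`, and `huge-normal-form` and `normal-form-or` are hypothetical helpers, not library functions:

```clojure
(defn huge-normal-form
  "Hypothetical normal-form call against a graph too large to enumerate."
  []
  (throw (ex-info "Normal form is intractable" {:type ::Intractable})))

(defn normal-form-or
  "Calls `f`; returns `fallback` if `f` throws an ex-info whose :type is
   ::Intractable, and rethrows anything else."
  [f fallback]
  (try
    (f)
    (catch clojure.lang.ExceptionInfo e
      (if (= ::Intractable (:type (ex-data e)))
        fallback
        (throw e)))))

(normal-form-or huge-normal-form :too-big)
```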



#### `subjects`

The `subjects` method must return a lazy sequence of the complete set

of subjects in the graph (modulo tractability):

```

> (igraph/subjects eg)

(:john :mary)

> (type (igraph/subjects eg))

clojure.lang.LazySeq

>

```



#### `get-p-o`

We must be able to get the p-level description of any subject with

`get-p-o`:

```

> (igraph/get-p-o eg :john)

{:isa #{:person}, :likes #{:beef}}

>

```



#### `get-o`

We must be able to get the o-level set of objects for any subject and

predicate with `get-o`:

```

> (igraph/get-o eg :john :isa)

#{:person}

>

``` 



#### `ask`

We must be able to test for whether any particular triple is in the

graph with `ask` (any truthy response will do).

``` 

> (igraph/ask eg :john :likes :beef) 

:beef 

> (igraph/ask eg :john :likes :chicken) 

nil

>

```



#### `query`

We must be able to query the graph using a format appropriate to the

native representation. This example uses the format expected by

`ont-app.igraph.graph/Graph`, described [below](#h4-querying):

```

> (igraph/query eg [[:?person :isa :person]])

#{{:?person :mary} {:?person :john}}

>

```

In this case, the result is a set of `binding maps`, mapping

:?variables to values, similar to the result set of a

[SPARQL](https://www.wikidata.org/wiki/Q54871) query.

For comparison, here is a sketch of an equivalent SPARQL query, which

would be appropriate if our IGraph protocol was targeted to a SPARQL

endpoint which we might call `sparql-eg`:

```

> (query sparql-eg 

  "PREFIX :  

   SELECT * WHERE 

   {

     ?person a :person

   }")

[{:person :mary} {:person :john}]

> 

```



#### `invoke` for arities 0-3

An instance of IGraph must provide `invoke` implementations as

follows:

Without arguments, it must return Normal Form (or throw an ::igraph/Intractable):

```

> (eg)

{:john {:isa #{:person}, :likes #{:beef}},

 :mary {:isa #{:person}, :likes #{:chicken}}}

>

```

With a single "s" argument, it must treat the argument as the subject

of get-p-o:

```

> (eg :john)

{:isa #{:person}, :likes #{:beef}}

>

```

With two arguments "s" and "p", a set of objects must be returned:

```

> (eg :mary :likes)

#{:chicken}

>

```

This will often be the value of `get-o`, but it may also accept as the

"p" argument a _traversal function_, described

[below](#traversal-fn-as-p).

With three arguments "s" "p" and "o", the response must be truthy:

```

> (eg :mary :likes :chicken)

:chicken

>

> (eg :mary :likes :beef)

nil

>

```

This will often be equivalent to `ask`, but again, the "p" argument

can be a traversal function, described [below](#traversal-fn-as-p).



### Content Manipulation

There are several factors to take into account when adding or removing

content from a graph.

Some graphs (such as a public SPARQL endpoint to which one does not

have UPDATE permissions) may not be subject to modification. Other

native representations (such as a SPARQL endpoint with UPDATE

permissions) might best be treated as mutable graphs.

Naturally, other things being equal, the preferred solution is to use

immutable graphs when it is possible to do so. The examples in this

README will mostly be applied to immutable graphs unless stated otherwise.



#### `mutability`

The `mutability` method returns one of the following values:

- `::igraph/read-only` - there is no means for altering the contents of

  the graph

- `::igraph/immutable` - the graph implements

  [IGraphImmutable](#IGraphImmutable)

- `::igraph/mutable` - the graph implements

  [IGraphMutable](#IGraphMutable)

- `::igraph/accumulate-only` - the graph implements [IGraphAccumulateOnly](#IGraphAccumulateOnly), the approach used in Datomic



#### The `add-to-graph` multimethod

IGraph defines a multimethod `add-to-graph`, dispatched on the type of
the graph and on the value returned by a function `triples-format`
for the content to be added. This multimethod can inform mutable,
immutable and accumulate-only graphs.

Naturally Normal Form is one possible format:

```

> (igraph/triples-format {:john {:likes #{:beef}}})

:normal-form

>

```

Another possible value is `:vector`, with a subject and at least one

P-O pair:

```

> (igraph/triples-format [:john :likes :beef])

:vector

> (igraph/triples-format [:john :isa :person :likes :beef])

:vector

>

``` 

Finally, we have `:vector-of-vectors`:

```

> (igraph/triples-format  [[:john :isa :person] [:mary :isa :person]])

:vector-of-vectors

>

```

Any implementation of IGraph should support adding to the graph in all

of these formats.
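To make the relationship between the three formats concrete, here is a plain-Clojure sketch that classifies an argument and converts it to normal form. All of the names below (`sketch-triples-format`, `vector->triples`, `triples->normal-form`, `to-normal-form`) are hypothetical, and the classification rules are my reading of the descriptions above rather than the library's actual dispatch logic:

```clojure
(defn sketch-triples-format
  "Hypothetical classifier for the three formats described above."
  [x]
  (cond
    (map? x)                                      :normal-form
    (and (vector? x) (seq x) (every? vector? x))  :vector-of-vectors
    (vector? x)                                   :vector))

(defn vector->triples
  "[s p1 o1 p2 o2 ...] -> ([s p1 o1] [s p2 o2] ...)"
  [[s & p-os]]
  (for [[p o] (partition 2 p-os)]
    [s p o]))

(defn triples->normal-form
  "Collects [s p o] triples into {s {p #{o ...}}}."
  [triples]
  (reduce (fn [nf [s p o]]
            (update-in nf [s p] (fnil conj #{}) o))
          {}
          triples))

(defn to-normal-form
  "Converts any of the three formats to normal form."
  [x]
  (case (sketch-triples-format x)
    :normal-form       x
    :vector            (triples->normal-form (vector->triples x))
    :vector-of-vectors (triples->normal-form (mapcat vector->triples x))))

(to-normal-form [:john :isa :person :likes :beef])
(to-normal-form [[:john :isa :person] [:mary :isa :person]])
```

Funnelling every format through one normal-form representation is one way an implementation could support all three with a single piece of insertion logic.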



#### The `remove-from-graph` multimethod

IGraph also defines a multimethod `remove-from-graph`, dispatched on
the type of the graph and on the value returned by a function
`triples-removal-format` for the content to be removed. This
multimethod can inform both mutable and immutable graphs.

The `triples-removal-format` function returns the same keywords as

`triples-format`, but adds one more: `:underspecified-triple`, a

vector with fewer than 3 elements:

```

> (igraph/triples-removal-format [:john])

:underspecified-triple

> (igraph/triples-removal-format [:john :likes])

:underspecified-triple

>

```

`triples-removal-format` assigns the `:vector-of-vectors` flag to a
vector of either `:vector` or `:underspecified-triple`. All
implementations of IGraph should support each of these flags.

This allows us to subtract any format that could also be added, plus

all `[s * *]` or all `[s p *]`.



## The IGraphImmutable protocol

An add or subtract operation to an immutable graph returns a cheap

copy of the original graph modified per the argument provided.



### `add`

Calling `(add g to-add)` must return an immutable graph such that the

graph now contains `to-add`. Any triples in `to-add` which are already

in the graph should be skipped.

See the notes above about the [add-to-graph](#add-to-graph)

multimethod.

Typically adding to a graph in code is most easily expressed using a

vector or a vector of vectors:

```

> (igraph/normal-form 

    (igraph/add 

      eg 

      [[:chicken :subClassOf :meat]

       [:beef :subClassOf :meat]

       ]))

{:john {:isa #{:person}, :likes #{:beef}},

 :mary {:isa #{:person}, :likes #{:chicken}},

 :chicken {:subClassOf #{:meat}},

 :beef {:subClassOf #{:meat}}}

>

```

We can use the Normal Form of one graph to add it to another:

```

> (meats)
{:chicken {:subClassOf #{:meat}}
 :beef {:subClassOf #{:meat}}}
>
> (igraph/normal-form (igraph/add eg (meats)))
{:john {:isa #{:person}, :likes #{:beef}},
 :mary {:isa #{:person}, :likes #{:chicken}},
 :chicken {:subClassOf #{:meat}},
 :beef {:subClassOf #{:meat}}}

> 

```



### `subtract`

The multimethod `remove-from-graph` supports the `subtract` operation,

dispatched on the type of the graph and `triples-removal-format`,

described [above](#remove-from-graph):

```

> (igraph/normal-form (igraph/subtract eg [:john]))

{:mary {:isa #{:person}, :likes #{:chicken}}}

>

> (igraph/normal-form (igraph/subtract eg [:john :likes]))

{:john {:isa #{:person}}, 

 :mary {:isa #{:person}, :likes #{:chicken}}}

>

```
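The semantics of `add` and `subtract` can be modelled directly on normal-form maps. This is a sketch under that simplifying assumption, not the `Graph` implementation; `eg-nf`, `nf-add` and `nf-subtract` are hypothetical names, and `nf-subtract` handles only the `[s]`, `[s p]` and `[s p o]` vector forms:

```clojure
(require '[clojure.set :as set])

(def eg-nf
  {:john {:isa #{:person}, :likes #{:beef}}
   :mary {:isa #{:person}, :likes #{:chicken}}})

(defn nf-add
  "Returns a new normal-form map containing everything in `nf` and `to-add`."
  [nf to-add]
  (merge-with (partial merge-with set/union) nf to-add))

(defn nf-subtract
  "Returns a new normal-form map without whatever matches `spec`,
   one of [s], [s p] or [s p o]. Emptied sets and maps are pruned."
  [nf spec]
  (let [[s p o] spec
        prune   (fn [m k] (if (empty? (get m k)) (dissoc m k) m))]
    (case (count spec)
      1 (dissoc nf s)
      2 (-> nf
            (update s dissoc p)
            (prune s))
      3 (-> nf
            (update-in [s p] disj o)
            (update s prune p)
            (prune s)))))

(nf-add eg-nf {:chicken {:subClassOf #{:meat}}})
(nf-subtract eg-nf [:john])
(nf-subtract eg-nf [:john :likes])
```

Because Clojure maps are persistent, `eg-nf` itself is untouched by either call, which is the "cheap copy" behavior described for immutable graphs.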



## The IGraphMutable protocol

Some graphs' native representations are implemented as mutable

repositories. To support this, the IGraphMutable protocol provides

methods `add!` and `subtract!`.

The [add-to-graph](#add-to-graph) and

[remove-from-graph](#remove-from-graph) multimethods should still

inform the logic here, and the behavior should be essentially the

same, with the exception that the graph returned is the same object,

mutated as specified.
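The "same object, mutated" contract can be illustrated with an atom holding a normal-form map. This is only an analogy for a mutable repository; `sketch-add!`, `mutable-eg` and `returned` are hypothetical names and nothing here uses the IGraphMutable protocol itself:

```clojure
(require '[clojure.set :as set])

(defn sketch-add!
  "Merges normal-form `to-add` into the map held by `g-atom`,
   then returns the very same atom."
  [g-atom to-add]
  (swap! g-atom #(merge-with (partial merge-with set/union) % to-add))
  g-atom)

(def mutable-eg (atom {:john {:isa #{:person}}}))

(def returned (sketch-add! mutable-eg {:john {:likes #{:beef}}}))

(identical? returned mutable-eg) ;; the argument and the return value are one object
@mutable-eg
```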



### `add!`

`(add! g to-add)` -> g, where g is both the argument and return value.

An error should be thrown if `(mutability g)` != `::igraph/mutable`.



### `subtract!`

`(subtract! g to-subtract)` -> g, where g is both the argument and

return value.

An error should be thrown if `(mutability g)` != `::igraph/mutable`.



## The IGraphAccumulateOnly protocol

A graph whose native representation is based on

[Datomic](https://www.datomic.com/) implements what Datomic calls an

"Accumulate-only" approach to adding and removing from a graph. To

support this, the IGraphAccumulateOnly protocol provides methods

`claim` (corresponding to the Datomic 'add' operation), and

`retract`. In this scheme the state of the graph can be rolled back to

any point in its history. See the [Datomic

documentation](https://docs.datomic.com/) for details.

The [add-to-graph](#add-to-graph) and

[remove-from-graph](#remove-from-graph) multimethods should still

inform the logic here, and the behavior should be essentially the

same, with the exception that the graph returned now points to the

most recent state of the graph after making the modification. Any

given instantiation of the graph will remain immutable.



### `claim`

`(claim g to-add)` -> g', where g is an append-only graph, and

g' now points to the most recent state of g's

[transactor](https://docs.datomic.com/on-prem/transactor.html).

An error should be thrown if `(mutability g)` != `::igraph/accumulate-only`.



### `retract`

`(retract g to-retract)` -> g', where g is an append-only graph, and

g' now points to the most recent state of g's

[transactor](https://docs.datomic.com/on-prem/transactor.html).

An error should be thrown if `(mutability g)` != `::igraph/accumulate-only`.



## The IGraphSet protocol

It will make sense for many implementations of IGraph also to

implement the basic set operations, defined in IGraphSet. Set

operations may not be suitable between very large graphs.

For purposes of demonstration, let's assume a second graph `other-eg`:

```

> (igraph/normal-form other-eg)

{:mary {:isa #{:person}, :likes #{:pork}},

 :waldo {:isa #{:person}, :likes #{:beer}}}

>

```

I think examples of each operation should serve to describe them.



### Methods summary

- `(union g1 g2)` -> A new graph with all triples from both graphs

- `(difference g1 g2)` -> A new graph with triples in g1 not also in

  g2

- `(intersection g1 g2)` -> A new graph with only triples shared in

  both graphs



### `union`

```

> (igraph/normal-form (igraph/union eg other-eg))

{:john {:isa #{:person}, :likes #{:beef}},

 :mary {:isa #{:person}, :likes #{:pork :chicken}},

 :waldo {:isa #{:person}, :likes #{:beer}}}

>

``` 



### `intersection`

```

> (igraph/normal-form (igraph/intersection eg other-eg)) 

{:mary {:isa #{:person}}}

>

```



### `difference`

```

> (igraph/normal-form (igraph/difference eg other-eg))

{:john {:isa #{:person}, :likes #{:beef}}, 

 :mary {:likes #{:chicken}}}

>

> (igraph/normal-form (igraph/difference other-eg eg))

{:mary {:likes #{:pork}}, :waldo {:isa #{:person}, :likes #{:beer}}}

>

```
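One way to think about these three operations is as ordinary set operations over the triples of each graph. The sketch below works on plain normal-form maps with `clojure.set`; the helper names (`nf->triples`, `triples->nf`, `nf-set-op`) are hypothetical, and a real implementation may well compute the results differently:

```clojure
(require '[clojure.set :as set])

(def eg-nf
  {:john {:isa #{:person}, :likes #{:beef}}
   :mary {:isa #{:person}, :likes #{:chicken}}})

(def other-eg-nf
  {:mary  {:isa #{:person}, :likes #{:pork}}
   :waldo {:isa #{:person}, :likes #{:beer}}})

(defn nf->triples
  "Normal form -> set of [s p o] vectors."
  [nf]
  (set (for [[s po] nf, [p os] po, o os] [s p o])))

(defn triples->nf
  "Set of [s p o] vectors -> normal form."
  [triples]
  (reduce (fn [nf [s p o]] (update-in nf [s p] (fnil conj #{}) o))
          {}
          triples))

(defn nf-set-op
  "Applies a clojure.set operation to two graphs given in normal form."
  [op nf1 nf2]
  (triples->nf (op (nf->triples nf1) (nf->triples nf2))))

(nf-set-op set/union eg-nf other-eg-nf)
(nf-set-op set/intersection eg-nf other-eg-nf)
(nf-set-op set/difference eg-nf other-eg-nf)
```

Materializing every triple is exactly what becomes impractical for very large graphs, which is why set operations may not be suitable there.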



## Traversal

Clojure and other functional programming languages have a

[reduce](https://clojuredocs.org/clojure.core/reduce) idiom, which

allows the user to create aggregations over a sequence by providing a

"reducer function" expressing the relationship between each member of

that sequence and the resulting aggregation.

IGraph defines a `traverse` function to allow the user to create

aggregations over the contents of a graph by providing a `traversal

function`, which is analogous to a reducer function, but is

necessarily a bit more involved.

- `(traverse g traversal context acc queue)` -> `acc'`

- `(traverse g traversal acc queue)` -> `acc'` ;; default context = {}

- `(traverse g traversal queue)` -> `acc'` ;; default context = {}, default acc = []

    ... traversing `g` per the `traversal` function, starting with the

    first element of `queue`, possibly informed by `context`.

This function will repeatedly call the `traversal` function until

`queue` is empty, returning the final value for `acc`. Each call to

the traversal function returns modified versions of `context`, `acc`

and `queue`.

To illustrate traversal, let's expand on our `eg` graph by adding some
type structure, yielding a graph called `eg-with-types`:

```

> (def eg-with-types 
    (igraph/add eg
      [[:person :subClassOf :thing]
       [:beef :subClassOf :meat]
       [:chicken :subClassOf :meat]
       [:pork :subClassOf :meat]
       [:meat :subClassOf :food]
       [:beer :subClassOf :beverage]
       [:beverage :subClassOf :consumable]
       [:food :subClassOf :consumable]
       [:consumable :subClassOf :thing]]))

eg-with-types

> (eg-with-types)

{:consumable {:subClassOf #{:thing}},

 :beef {:subClassOf #{:meat}},

 :person {:subClassOf #{:thing}},

 :beer {:subClassOf #{:beverage}},

 :meat {:subClassOf #{:food}},

 :food {:subClassOf #{:consumable}},

 :beverage {:subClassOf #{:consumable}},

 :pork {:subClassOf #{:meat}},

 :john {:isa #{:person}, :likes #{:beef}},

 :mary {:isa #{:person}, :likes #{:chicken}},

 :chicken {:subClassOf #{:meat}}}

```

Our `eg-with-types` now provides a bit more context for what's going

on with our heroes John and Mary. Note that `:isa` and `:subClassOf`

differ in their _domain_. `:isa` relates an instance to its class,

while `:subClassOf` relates a class to its parent.



### The `traverse` function

Here's an example of how the `traverse` function works, starting with

a traversal function we'll call `subClassOf*`, which follows and

accumulates all :subClassOf links, starting with an initial queue of

say, `[:meat :beer]`:

```

> (igraph/traverse eg-with-types subClassOf* {} #{} [:meat :beer])

#{:consumable :beer :meat :food :beverage :thing}

>

```

The arguments for `traverse` are

- `g` - an invariant graph

- `traversal-fn` - A function `[g c acc q]` -> `[c' acc' q']`,

  defining the logic of each step in the traversal

- `context` - (optional) a map holding the traversal history plus

  whatever `traversal-fn` may want to track. Default is {}

- `acc` - (optional) accumulates the resulting value of the traversal. Default is `[]`.

- `queue` - the starting queue
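The control flow described above can be sketched as a simple loop. This is a simplified stand-in written against a plain function, not the library's `traverse`: it handles only the queue, the accumulator and a `:history` set in the context. The names `sketch-traverse`, `types-nf`, `types-g` and `sub-class-of*` are hypothetical:

```clojure
(def types-nf
  {:beef       {:subClassOf #{:meat}}
   :meat       {:subClassOf #{:food}}
   :food       {:subClassOf #{:consumable}}
   :beer       {:subClassOf #{:beverage}}
   :beverage   {:subClassOf #{:consumable}}
   :consumable {:subClassOf #{:thing}}})

(defn types-g
  "Stand-in for the (g s p) arity of a graph: returns the set of objects."
  [s p]
  (get-in types-nf [s p]))

(defn sub-class-of*
  "Traversal function: accumulates the head of the queue and enqueues its
   super-classes."
  [g c acc q]
  (let [elt (first q)]
    [c
     (conj acc elt)
     (reduce conj (rest q) (g elt :subClassOf))]))

(defn sketch-traverse
  "Calls `traversal-fn` until the queue is empty, skipping any queue head
   already recorded in (:history context)."
  [g traversal-fn context acc queue]
  (loop [c   (update context :history #(or % #{}))
         acc acc
         q   queue]
    (cond
      (empty? q)                         acc
      (contains? (:history c) (first q)) (recur c acc (rest q))
      :else
      (let [[c' acc' q'] (traversal-fn g c acc q)]
        (recur (update c' :history (fnil conj #{}) (first q)) acc' q')))))

(sketch-traverse types-g sub-class-of* {} #{} [:meat :beer])
```

Note that `:consumable` is reached from both `:food` and `:beverage`; the history check is what keeps it from being expanded twice.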



### Traversal functions

The traversal function takes 4 arguments and returns a vector of

length 3.

```

> (subClassOf* eg-with-types {} #{} [:meat])

[{} #{:meat} (:food)]

>

> (subClassOf* eg-with-types {} #{:meat} '(:food))

[{} #{:meat :food} (:consumable)]

>

```

The first argument is the invariant graph itself.

The second argument (and first element returned) is the context, which

subClassOf* leaves unchanged.  Context is used by `traverse` to avoid

cycles, and will be explained in detail [below](#h4-context). More

sophisticated traversal functions may use the context as a kind of

blackboard.

The third argument (and precursor to the second element returned) is

the value to be accumulated, identical to its counterpart in the

_reduce_ idiom.

The fourth argument (and precursor to the third element returned) is

the traversal queue. It must be sequential, and may be ordered in any

way that makes sense. An empty queue signals the end of the traversal,

at which point `traverse` will return the value of the accumulator.

Here's a possible definition of subClassOf*:

```

(defn subClassOf*
  "Traversal function to accumulate super-classes."
  [g c acc q]
  (let [elt (first q)]
    [c                       ;; context is unchanged
     (conj acc elt)          ;; updating the accumulator
     (reduce conj
             (rest q)
             (g elt :subClassOf))])) ;; adding super-classes to the queue

```



#### Context

The `context` argument to `traverse` and its traversal function is a

map containing key-values which may inform the course of the

traversal, but are not part of the accumulated value. This will

include:

- `:history` set by `traverse`, and updated to hold all elements

  encountered in the course of the traversal. In order to avoid

  cycles, any element in the history will be skipped should it ever

  re-appear at the head of the queue.

  

The `traverse` function also supports these optional keys in the

context:

- `:skip?` (optional), a function `(fn [x] ...)` -> truthy, applicable to

  the head of the queue, which will override `:history`.

  

- `:seek` (optional), a function `(fn [context acc]...)` -> `acc'`. If
  specified, this function will be called at the beginning of each
  traversal, and if its result is truthy and non-empty, the traversal
  will end immediately with the value of `acc`.

In addition, the traversal function may use the context as a

blackboard to communicate between iterations of the traversal. For

example, you may want to prune and re-order your queue based on a set

of heuristics, details of which are stored in the context.



#### The queue

The `queue` argument must be sequential, but is otherwise

unrestricted. An empty queue signals the end of the traversal, at

which point `traverse` will return the accumulated value.

Note that conj-ing to a vector in the traversal function suggests a

breadth-first traversal, while conj-ing to a seq suggests a

depth-first traversal.

More sophisticated traversal functions may use the context to inform

logic to prune and re-order the queue to optimize the traversal.
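The difference comes down to where `conj` puts new elements: at the end of a vector, but at the head of a seq or list. Here is a small self-contained sketch over a toy tree; `tree` and `visit-order` are made up for this illustration and do not use `traverse`:

```clojure
(def tree {:root [:a :b], :a [:a1 :a2], :b [:b1]})

(defn visit-order
  "Visits `tree` from :root. When `breadth-first?` is true, children are
   conj-ed onto a vector (and so wait at the back of the queue); otherwise
   they are conj-ed onto a seq (and so jump to the front)."
  [tree breadth-first?]
  (loop [q [:root], visited []]
    (if (empty? q)
      visited
      (let [node   (first q)
            others (if breadth-first? (vec (rest q)) (rest q))]
        (recur (reduce conj others (get tree node))
               (conj visited node))))))

(visit-order tree true)   ;; => [:root :a :b :a1 :a2 :b1]
(visit-order tree false)  ;; => [:root :b :b1 :a :a2 :a1]
```

In the depth-first case the children come out in reverse order, because each one is pushed in front of the one before it.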



### Traversal utilities

IGraph provides utilities to express several common types of traversal

functions.



#### `transitive-closure`

- `(transitive-closure p)` -> `(fn [g context acc to-visit] ...)` ->
  `[context' acc' queue']`

  

  This returns a traversal function which will accumulate all _o_

  s.t. any _s_ in the queue is associated with _o_ through zero or

  more _p_ links.

So in the example above, the `subClassOf*` function could be defined

thus:

```

(def subClassOf* (igraph/transitive-closure :subClassOf))

```
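For intuition, a function of this kind can be written as a higher-order function that closes over `p`. This is a sketch of the idea only, assuming a graph that can be called as `(g s p)`; `sketch-transitive-closure` and `meat-g` are hypothetical names and the library's own definition may differ:

```clojure
(defn sketch-transitive-closure
  "Returns a traversal function that accumulates the head of the queue and
   enqueues everything linked to it by `p`."
  [p]
  (fn [g context acc queue]
    (let [elt (first queue)]
      [context
       (conj acc elt)
       (reduce conj (rest queue) (g elt p))])))

;; A plain function standing in for the (g s p) arity of a graph.
(defn meat-g [s p]
  (get-in {:meat {:subClassOf #{:food}}
           :food {:subClassOf #{:consumable}}}
          [s p]))

;; One step of the traversal, starting from :meat
((sketch-transitive-closure :subClassOf) meat-g {} #{} [:meat])
```

The "zero or more links" behavior comes from the fact that every element is accumulated before its `p` links are followed, so the starting elements always appear in the result.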



#### `traverse-link`

- `(traverse-link p)` -> `(fn [g context acc queue] ...)` ->
  `[context acc' []]`

The function returned here will accumulate all _o_ s.t. for all _s_ in

_queue_, (g s p o) is truthy:

```

> (igraph/traverse 

    eg-with-types 

    (igraph/traverse-link :isa) 

    #{} 

    [:john :mary])

#{:person}

>

```



#### `maybe-traverse-link`

- `(maybe-traverse-link p)` -> `(fn [g context acc queue] ...)` ->
  `[context acc' []]`

Matches 0 or 1 occurrences of _p_:

```

> (igraph/traverse eg-with-types 

    (igraph/maybe-traverse-link :isa) 

    #{} 

    [:john :mary])

#{:person :john :mary}

>

```
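Both link utilities consume the whole queue in a single step and return an empty queue, which ends the traversal. A sketch of that shape follows, written against a plain function rather than a real graph; `people-g`, `sketch-traverse-link` and `sketch-maybe-traverse-link` are hypothetical names:

```clojure
(defn people-g
  "Stand-in for the (g s p) arity of a graph."
  [s p]
  (get-in {:john {:isa #{:person}, :likes #{:beef}}
           :mary {:isa #{:person}, :likes #{:chicken}}}
          [s p]))

(defn sketch-traverse-link
  "Exactly one `p` link: accumulate every o such that (g s p) contains o
   for some s in the queue."
  [p]
  (fn [g context acc queue]
    [context
     (reduce into acc (map #(g % p) queue))
     []]))

(defn sketch-maybe-traverse-link
  "Zero or one `p` link: as above, but the queue elements themselves are
   accumulated too."
  [p]
  (fn [g context acc queue]
    [context
     (into (reduce into acc (map #(g % p) queue)) queue)
     []]))

((sketch-traverse-link :isa) people-g {} #{} [:john :mary])
((sketch-maybe-traverse-link :isa) people-g {} #{} [:john :mary])
```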



#### `traverse-or`

- `(traverse-or & ps)` -> `(fn [g context acc queue] ...)` ->
  `[context acc' []]`

Where _ps_ is one or more traversal functions, merging all of their outputs.

Keyword arguments are interpreted as an implicit `traverse-link`.

```

> (def subsumed-by (igraph/traverse-or :isa :subClassOf))

subsumed-by

> (igraph/traverse eg-with-types subsumed-by #{} [:john])

#{:person}

>

> (igraph/traverse eg-with-types subsumed-by #{} [:meat])

#{:food}

>

```



### Traversal composition with `t-comp`

Traversal functions are composable with a 'short form' and a 'long
form'.



#### short form

Short-form composition can be used when the traversal function meets

the following criteria:

- None of the component functions manipulate the traversal context

- Each component function accumulates a sequential value suitable to

  serve as the initial queue of the component function that follows

  it.

Such functions can be composed by passing `t-comp` a simple vector:

```

> (def instance-of 

    (igraph/t-comp [:isa (igraph/transitive-closure :subClassOf)]))

>

> (igraph/traverse eg-with-types instance-of #{} [:john])

#{:person :thing}

>

```
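The second criterion is the heart of short-form composition: whatever one stage accumulates becomes the starting queue of the next. The sketch below models just that hand-off, with each stage simplified to a function from a queue to a set of results. That simplification is mine, and `link-stage`, `closure-stage` and `compose-stages` are hypothetical names rather than anything `t-comp` exposes:

```clojure
(def john-nf
  {:john   {:isa #{:person}}
   :person {:subClassOf #{:thing}}})

(defn link-stage
  "Stage following exactly one `p` link from each element of the queue."
  [p]
  (fn [queue]
    (into #{} (mapcat #(get-in john-nf [% p])) queue)))

(defn closure-stage
  "Stage following zero or more `p` links from each element of the queue."
  [p]
  (fn [queue]
    (loop [acc #{}, q (vec queue)]
      (if (empty? q)
        acc
        (let [elt (first q)]
          (if (contains? acc elt)
            (recur acc (rest q))
            (recur (conj acc elt)
                   (concat (rest q) (get-in john-nf [elt p])))))))))

(defn compose-stages
  "Feeds the output of each stage to the next stage as its queue."
  [& stages]
  (fn [queue]
    (reduce (fn [q stage] (stage q)) queue stages)))

((compose-stages (link-stage :isa) (closure-stage :subClassOf)) [:john])
```

When a stage needs its own context, or its output needs reshaping before it can serve as a queue, this simple hand-off is no longer enough.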



#### long form

In cases where you want to compose a traversal function that cannot

meet the criteria above, then instead of passing to `t-comp` a

vector of traversal functions, you pass in a map with the following

keys:

```

{ :path  [: : ...]

   : {:fn 

                         :doc  (optional)

                         :into  (default [])

                         :local-context-fn  (default nil)

                         :update-global-context (default nil)

                      }

   : ...

   ...

 }

 ```

 

A call to `(t-comp [:a :b :c])` is equivalent to calling `(t-comp

{:path [:a :b :c]})`.

These parameters should allow you as much control as you need over the

flow of contexts between each stage of traversal, and over the flow of

outputs from any one stage into the input queue of its next stage.

However, most of the time, the short form is sufficient, and at this point,

the long form has not been tested heavily.

##### the :path parameter

This is a vector of traversal function specifications. Each traversal

function specification must be either:

- A traversal function

- A keyword with an entry in the long-form map

- A keyword eligible as an implicit [traverse-link](#h4-traverse-link)

If the traversal function specification is itself a function, it will

be applied directly.

If the traversal function specification is a keyword, and the t-comp

map has a matching entry for that keyword, it will look for and

interpret a map with the parameters described in the next section.

If the spec is a keyword without an entry in the long-form map, it is

assumed to be a candidate for an implicit traverse-link, i.e. a graph

element in 'p' position in _g_.

##### traversal specification parameters

- :fn - a traversal function

- :doc (optional) - a docstring

- :into (optional) - a container to which the output should be coerced

  (default [])

- :local-context-fn (optional) - a function [global-context] -> `local

  context` producing the context for this stage of the traversal.

- :update-global-context (optional) - a function [global-context

  local-context] -> `global-context'`, capturing whatever aspects of

  the current stage of traversal may be of interest to subsequent

  stages.



### Using traversal functions as a `p` argument to `invoke`

Recall that implementations of IGraph should provide `invoke`

functions with 0-3 arguments.

Two of these functions involve specification of a _p_ parameter:

```

(g s p) -> #{...}

(g s p o) -> truthy.

```

This is informed by a multimethod dispatched on whether _p_ is a

function.

- `(match-or-traverse g s p)` -> #{...}  

- `(match-or-traverse g s p o)` -> truthy

A typical declaration for an IGraph implementation will contain

these two method declarations:

```

  #?(:clj clojure.lang.IFn

     :cljs cljs.core/IFn)

  ...

  (invoke [g s p] (igraph/match-or-traverse g s p))

  (invoke [g s p o] (igraph/match-or-traverse g s p o))

  ...

```

If the _p_ argument is a function, then _p_ will be expected to match

the signature of a traversal function, and the output of the method

will be the value of its traversal, starting with queue [_s_].

If _p_ is not a function it will be matched directly against elements

of the graph.

So given the traversal functions in the examples above:

```

> (eg-with-types :beef subClassOf*)

#{:consumable :beef :meat :food :thing}

>

> (eg-with-types :beef subClassOf* :food)

:food

>

> (eg-with-types :john (igraph/t-comp [:likes subClassOf*]))

#{:consumable :beef :meat :food :thing} ;; classes of stuff John likes

>

```
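The dispatch itself can be sketched in a few lines. This version assumes the test is `fn?` and, to stay self-contained, simplifies a traversal-style `p` to a function of the normal-form map and the subject; both are assumptions of the sketch, and `sketch-match-or-traverse`, `john-nf` and `likes-or-isa` are hypothetical names:

```clojure
(def john-nf
  {:john {:isa #{:person}, :likes #{:beef}}})

(defn sketch-match-or-traverse
  "If `p` is a function, call it to compute the objects; otherwise look `p`
   up directly in the graph."
  ([nf s p]
   (if (fn? p)
     (p nf s)
     (get-in nf [s p])))
  ([nf s p o]
   (get (sketch-match-or-traverse nf s p) o)))

(defn likes-or-isa
  "A function-valued `p`: everything :john either likes or is."
  [nf s]
  (into #{} (mapcat #(get-in nf [s %])) [:likes :isa]))

(sketch-match-or-traverse john-nf :john :likes)
(sketch-match-or-traverse john-nf :john likes-or-isa)
(sketch-match-or-traverse john-nf :john likes-or-isa :person)
```

One detail worth knowing: keywords are callable in Clojure, so `(ifn? :likes)` is true, but `(fn? :likes)` is false. A test based on `fn?` therefore still treats keyword properties as ordinary graph elements.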



## Cardinality-1 utilities

Requiring normal form to provide a set as its 3rd-tier representation

has the advantage of ensuring that the normal form is as simple and

regular as possible, and makes it easy to think about set operations

over graphs. However, it can be a bit unwieldy when dealing with the

many cases where the descriptive map's keys reliably map to a single

scalar value.

The following utilities are provided to help with this:

- `(unique [x]) -> x` - translates a singleton sequence to its only

  value

- `(flatten-description (g s))` - automatically translates the p-o
  description into simple k-v mappings wherever only a single _v_
  exists.

- `(normalize-flat-description m)` is the inverse of

  `flatten-description`.

- `(assert-unique g s p o)` - replaces one singleton object with

  another.

  - there are analogous functions `assert-unique!` and `claim-unique`

    for Mutable and Accumulate-only protocols respectively



### `unique`

The `unique` function takes a sequence and an optional `on-ambiguity`

argument. By default, `on-ambiguity` throws an `ex-info` of type

`::igraph/Non-unique`.

```

> (eg-with-types :john :isa)

#{:person}

>

> (igraph/unique (eg-with-types :john :isa))

:person

>

> (igraph/unique (eg-with-types :beef subClassOf*))

Execution error (ExceptionInfo) at ont-app.igraph.core/unique$fn (core.cljc:640).

Unique called on non-unique collection

>

> (igraph/unique (eg-with-types :beef subClassOf*)

                 first) ;; arbitrary disambiguation

:consumable

```

Sometimes defining `the` as an alias for `unique` reads better, and is

easier to type:

```

> (def the igraph/unique)

> (the (eg-with-types :john :isa))

:person

>

```
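
The behavior shown above can be approximated in a few lines of plain Clojure. This is a sketch under assumptions, not the library's source: `unique-sketch` is a hypothetical name, and the `:type` value in the exception data is a placeholder rather than the library's actual keyword.

```clojure
(defn unique-sketch
  "Returns the only member of `coll`. When `coll` has more than one
  member, defers to `on-ambiguity`, which by default throws."
  ([coll]
   (unique-sketch coll
                  (fn [c]
                    (throw (ex-info "Unique called on non-unique collection"
                                    {:type ::non-unique, :coll c})))))
  ([coll on-ambiguity]
   (if (> (count coll) 1)
     (on-ambiguity coll)
     (first coll))))

(unique-sketch #{:person})                 ;; => :person
(unique-sketch #{:b :a} (comp first sort)) ;; => :a
(unique-sketch [])                         ;; => nil
```

Passing a deterministic `on-ambiguity` such as `(comp first sort)` avoids depending on the arbitrary ordering of a set, which is what plain `first` would do.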



### `flatten-description`

```

> (igraph/flatten-description (eg-with-types :john))

{:isa :person, :likes :beef}

>

> (let [g (igraph/add 

             eg 

            [:john :likes :beer :has-vector [1 2 3]])

        ]

    (igraph/flatten-description (g :john)))

{:isa :person, :likes #{:beef :beer}, :has-vector [1 2 3]}

>

```
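
As a rough illustration of the rule at work here, the following sketch flattens a p-o description held as a plain map. `flatten-description-sketch` is a hypothetical name and this is only an assumed approximation of the library function: singleton sets are unwrapped, and anything else is left alone.

```clojure
(defn flatten-description-sketch
  "Replaces each singleton object set in `p-o` with its only member."
  [p-o]
  (reduce-kv (fn [acc p os]
               (assoc acc p (if (= 1 (count os)) (first os) os)))
             {}
             p-o))

(flatten-description-sketch
 {:isa #{:person}, :likes #{:beef :beer}, :has-vector #{[1 2 3]}})
;; => {:isa :person, :likes #{:beef :beer}, :has-vector [1 2 3]}
```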



### `normalize-flat-description`

This is the inverse of `flatten-description`:

```

> (igraph/normalize-flat-description 

    {:isa :person, :likes #{:beef :beer}, :has-vector [1 2 3]})

{:isa #{:person}, :likes #{:beef :beer}, :has-vector #{[1 2 3]}}

>

> (let [g (igraph/add 

             eg 

             {:john (igraph/normalize-flat-description {:likes :beer})})

        ]

    (g :john))

{:isa #{:person}, :likes #{:beef :beer}}

>

```
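
The inverse direction can be sketched the same way. `normalize-flat-description-sketch` is a hypothetical name, and the rule it applies (wrap every non-set value in a set) is an assumption inferred from the examples above rather than taken from the library source.

```clojure
(defn normalize-flat-description-sketch
  "Wraps every non-set value of the flat map `m` in a singleton set."
  [m]
  (reduce-kv (fn [acc p v]
               (assoc acc p (if (set? v) v #{v})))
             {}
             m))

(normalize-flat-description-sketch
 {:isa :person, :likes #{:beef :beer}, :has-vector [1 2 3]})
;; => {:isa #{:person}, :likes #{:beef :beer}, :has-vector #{[1 2 3]}}
```

One consequence of this rule is that a single object which is itself a set cannot be distinguished from a set of objects, so such values would not survive a round trip through the flat form.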



### `assert-unique`

We can replace one singleton value with another using `(assert-unique

g s p o) -> g'`:

```

> (let [g (igraph/assert-unique eg :john :isa :man)]

    (g :john))

{:likes #{:beef}, :isa #{:man}}

>

```
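
On a plain Normal Form map, the same replacement amounts to overwriting the object set for _s_ and _p_. The sketch below illustrates that idea only; `assert-unique-sketch` and `eg-map` are hypothetical names, and the real function operates on IGraph instances rather than bare maps.

```clojure
(def eg-map
  {:john {:isa #{:person}, :likes #{:beef}}})

(defn assert-unique-sketch
  "Replaces whatever objects `s` has for `p` with the single object `o`."
  [g s p o]
  (assoc-in g [s p] #{o}))

(assert-unique-sketch eg-map :john :isa :man)
;; => {:john {:isa #{:man}, :likes #{:beef}}}
```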



## I/O

In general writing the normal form of a graph to a stream and applying

the reader to it on the other end should be fairly

straightforward. Any predicates bearing reader-choking objects will of

course need to be filtered out.

At this point, only the :clj platform is directly supported with a

pair of functions to read/write to the file system.



### `write-to-file`

`(write-to-file [path g] ...) -> path`

Will write an edn file with the normal form contents of _g_.



### `read-from-file`

`(read-from-file [g path] ...) -> g'`

Will read the normal form contents of _path_ into _g_.
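
The general idea of round-tripping Normal Form through the reader can be sketched with nothing but `clojure.core` and `clojure.edn`. This does not use `write-to-file` or `read-from-file`; the names `nf`, `path` and `round-tripped` are illustrative, and the temporary file location is an assumption made to keep the example self-contained on the :clj platform.

```clojure
(require '[clojure.edn :as edn])

;; Normal Form contents as a plain map.
(def nf
  {:john {:isa #{:person}, :likes #{:beef}}
   :mary {:isa #{:person}, :likes #{:chicken}}})

;; Write the printed form to a temporary edn file.
(def path (str (java.io.File/createTempFile "igraph-sketch" ".edn")))
(spit path (pr-str nf))

;; Read it back with the edn reader.
(def round-tripped (edn/read-string (slurp path)))

(= nf round-tripped) ;; => true
```

Objects such as functions print in a form the edn reader cannot parse, which is why they would need to be filtered out before writing.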



## Other utilities



### `reduce-spo`

- `(reduce-spo f acc g)` -> `acc'`, such that _f_ is called on each

triple in _g_.  Where _f_ := `(fn [acc s p o]...) ->

acc'`. Cf. [reduce-kv](https://clojuredocs.org/clojure.core/reduce-kv).

```

> (defn tally-triples [tally s p o]

    (inc tally))

> (igraph/reduce-spo tally-triples 0 eg)

4

```
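
For intuition, a reducer of this shape can be written over a plain Normal Form map by nesting `reduce-kv` calls. `reduce-spo-sketch` and `eg-map` are hypothetical names, and this is an assumed approximation, not the library's implementation.

```clojure
(def eg-map
  {:john {:isa #{:person}, :likes #{:beef}}
   :mary {:isa #{:person}, :likes #{:chicken}}})

(defn reduce-spo-sketch
  "Calls (f acc s p o) once for each triple in the Normal Form map `nf`."
  [f acc nf]
  (reduce-kv
   (fn [acc s p-o]
     (reduce-kv
      (fn [acc p os]
        (reduce (fn [acc o] (f acc s p o)) acc os))
      acc
      p-o))
   acc
   nf))

;; Count the triples.
(reduce-spo-sketch (fn [tally _s _p _o] (inc tally)) 0 eg-map)
;; => 4

;; Collect the triples as a set of [s p o] vectors.
(reduce-spo-sketch (fn [acc s p o] (conj acc [s p o])) #{} eg-map)
;; => #{[:john :isa :person] [:john :likes :beef]
;;      [:mary :isa :person] [:mary :likes :chicken]}
```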



## Implementations

The `ont-app.igraph.graph` module makes one implementation of IGraph

available without any additional dependencies, and so far there are

four other libraries in the ont-app project which implement this

protocol.

Other implementations are planned, and I'd be interested to learn of

any implementations published by other parties.



### `ont-app.igraph.graph/Graph`

The IGraph library comes with `ont-app.igraph.graph`, whose `Graph`

deftype is a very lightweight implementation of IGraph.

Its native representation is just Normal Form. Any hashable object can

technically be provided for any _s_, _p_, or _o_, but be advised that

other IGraph implementations often expect to keep non-identifiers at

the "o" level.

```

(require '[ont-app.igraph.graph :as g])

```



#### Graph creation

Use `make-graph` to create a new graph, with an optional `:contents`

argument.

```

> (def eg (g/make-graph))

eg

> (eg)

{}

>

> (def eg

    (g/make-graph 

      :contents {:john {:isa #{:person}, :likes #{:beef}},

                 :mary {:isa #{:person}, :likes #{:chicken}}}))

eg

> (eg)

{:john {:isa #{:person}, :likes #{:beef}},

 :mary {:isa #{:person}, :likes #{:chicken}}}

>

```

The `:contents` argument must be in Normal Form.



#### Querying

Querying is done with a very simple vector-of-triples graph pattern

using keywords starting with ":?" to serve as variables. It returns an

unordered set of binding maps. This is very minimalistic. Any

selecting, ordering, grouping or aggregation needs to be done

downstream from the call.

```

> (igraph/query eg [[:?liker :likes :?liked]])

#{{:?liker :john, :?liked :beef} 

  {:?liker :mary, :?liked :chicken}}

>

```
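
Since the result is just a set of maps, downstream selecting, ordering and aggregation can be done with ordinary sequence functions. The sketch below starts from a literal copy of the bindings shown above instead of calling `igraph/query`; the names `bindings`, `ordered`, `likers` and `liked-counts` are illustrative.

```clojure
;; A literal copy of the binding maps returned by the query above.
(def bindings
  #{{:?liker :john, :?liked :beef}
    {:?liker :mary, :?liked :chicken}})

;; Ordering: sort the binding maps by the value bound to :?liker.
(def ordered (sort-by :?liker bindings))
;; => ({:?liker :john, :?liked :beef} {:?liker :mary, :?liked :chicken})

;; Selecting: keep only the values bound to one variable.
(def likers (set (map :?liker bindings)))
;; => #{:john :mary}

;; Aggregating: count how many bindings mention each liked thing.
(def liked-counts (frequencies (map :?liked bindings)))
;; => {:beef 1, :chicken 1}
```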

Traversal functions can be specified in _p_ position:

```

> (igraph/query eg-with-types [[:?liker :likes :?liked]

                               [:?liked subClassOf* :?liked-class]])

#{{:?liked :beef, :?liked-class :consumable, :?liker :john}

  {:?liked :beef, :?liked-class :beef, :?liker :john}

  {:?liked :chicken, :?liked-class :food, :?liker :mary}

  {:?liked :chicken, :?liked-class :chicken, :?liker :mary}

  {:?liked :chicken, :?liked-class :consumable, :?liker :mary}

  {:?liked :beef, :?liked-class :meat, :?liker :john}

  {:?liked :beef, :?liked-class :food, :?liker :john}

  {:?liked :beef, :?liked-class :thing, :?liker :john}

  {:?liked :chicken, :?liked-class :thing, :?liker :mary}

  {:?liked :chicken, :?liked-class :meat, :?liker :mary}}

>

```



### sparql-client

Implements a mutable IGraph for a [SPARQL

endpoint](https://www.wikidata.org/wiki/Q26261192). Initialization

requires configuring query and update endpoints, and the query

language is [SPARQL](https://www.wikidata.org/wiki/Q54871).

Keyword identifiers are expected to be namespaced, and rely on the [ont-app/vocabulary](https://github.com/ont-app/vocabulary) library, which uses namespace metadata to intercede between Clojure namespaces and RDF namespaces.

Set operations are not supported.



### igraph-jena

Implements a mutable IGraph for [Apache Jena](https://jena.apache.org/). 

Keyword identifiers are expected to be namespaced, and rely on the [ont-app/vocabulary](https://github.com/ont-app/vocabulary) library, which uses namespace metadata to intercede between Clojure namespaces and RDF namespaces.

Set operations are supported.

Currently supports Jena version 4.



### datascript-graph

This implements IGraph for a

[datascript](https://github.com/tonsky/datascript) native

representation, and as such may need to be initialized with some

schema declarations. Query language is datalog. Immutable, with set

operations.



### datomic-client

https://github.com/ont-app/datomic-client

This implements IGraph for the [Datomic Client API](https://docs.datomic.com/cloud/client/client-api.html). The query language is datalog. Mutability model is Accumulate Only. Set operations are not supported.

### igraph-grafter

https://github.com/ont-app/igraph-grafter

A port of the IGraph protocols to

[Grafter](https://github.com/Swirrl/grafter).

## Testing support

The `ont-app.igraph.test-support` module provides utilities to

developers of downstream implementations to confirm compliance with

the various protocols defined here.

The `graph_test.cljc` file should serve as an example.

It starts with an initialized instance of `ont_app.igraph.graph.Graph`

intended to hold a report containing the results of a battery of

tests. This report should be initialized with a triple to declare a

function `[data] -> test-graph`, which should take various bodies of

canonical test data and return an instance of the graph under

examination....

```clj

(require '[ont-app.igraph.test-support :as ts])

(defn make-test-graph

  "Creates an instance of the graph I want to test"

  ^ont_app.igraph.graph.Graph [data]

  (g/make-graph :contents data))

(defn make-standard-report

  "Creates a configured report graph."

  []

  (-> 

   (g/make-graph)

   (igraph/add [::ts/StandardIGraphImplementationReport

                ::ts/makeGraphFn make-test-graph])))

```

Where:

- the `::ts/makeGraphFn` property ranges over `fn [data] -> graph`

- `data` is an appropriate argument to one of the add methods

- `graph` is an instance of the IGraph implementation to be tested

Then we can apply a battery of standard tests, passing in the report

and collecting test results.

```clj

(deftest standard-implementation-tests

  "Standard tests against examples in the IGraph README for immutable set-enabled graphs"

  (let [report (-> (make-standard-report)

                   (ts/test-readme-eg-access)

                   (ts/test-readme-eg-mutation)

                   (ts/test-readme-eg-set-operations)

                   (ts/test-readme-eg-traversal)

                   (ts/test-cardinality-1)

                   ;; Watch this space for bug-fix tests

                   )

     

        ]

    ;; You could also run (ts/run-standard-implementation-tests))

    ;; `report` will be a graph of test results, some of which might be of type Failed...

    ...

```

Then we can query for errors in the report...

```clj

    (is (empty? (ts/query-for-failures report)))))

```

If it's not empty, the bindings returned will contain descriptions of

the failures with hopefully helpful information.

### When your graph requires a schema

Some graph representations such as datascript and datomic require that

you provide a schema which may be part of your graph content.

In such cases you should provide a triple like this:

```clj

(add report [::ts/StandardIGraphImplementationReport ::ts/schemaGraph schema-graph])

```

Where `schema-graph` contains a graph with just the schema

content. Various tests will use it to filter out schema content when

comparing test results to canonical data.

## Developer notes

The Makefile has targets for most of the usual stuff.

It presumes that your `~/.clojure/deps.edn` has aliases as follows:

```

{

 :aliases {

           ;; Borkdude's Kondo linter

           ;; typical usage:

           ;; clojure -M:kondo --lint src

           ;; For help: clojure -M:kondo  --help

           :kondo

           {:extra-deps {clj-kondo/clj-kondo {:mvn/version "RELEASE"}}

            :main-opts  ["-m" "clj-kondo.main"]

            }

           ;; Document generator

           ;; Call with clojure -X:codox

           :codox

           {

            :extra-deps {codox/codox {:mvn/version "0.10.8"}}

            :exec-fn codox.main/generate-docs

            :exec-args {:output-path "doc"}

            }

           ;; outdated

           ;; c.f. Lein ancient

           ;; clojure -M:outdated --help for help

           ;; Typical usage: clojure -M:outdated

           :outdated

           {

            ;; Note that it is `:deps`, not `:extra-deps`

            :deps {com.github.liquidz/antq {:mvn/version "RELEASE"}}

            :main-opts ["-m" "antq.core"]

            }

           } ;; /end aliases

}

```

The `nvd` target presumes installation of https://github.com/rm-hull/nvd-clojure



## Future work

- Ports to loom, ubergraph, and other graph-oriented libraries 

- There will be a regime for providing annotations for reified triples

  (for weights and such).

- Ports to table-based representations

- `igraph.graph` will have query planning and indexing

- Some kind of a scheme to bring all the various query formats under a

  single tent



## Acknowledgements

Thanks to [Ram Krishnan](https://github.com/kriyative) for his

feedback and advice.



## License

Copyright © 2019-22 Eric D. Scott

Distributed under the Eclipse Public License either version 1.0 or (at

your option) any later version.

 

Natural Lexicon logo - Copyright © 2020 Eric D. Scott. Artwork by Athena M. Scott.

Released under Creative Commons Attribution-ShareAlike 4.0 International license. Under the terms of this license, if you display this logo or derivatives thereof, you must include an attribution to the original source, with a link to https://github.com/ont-app, or http://ericdscott.com.