Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/mdub/arboreal

Efficient tree structures for ActiveRecord
https://github.com/mdub/arboreal

Last synced: about 2 months ago
JSON representation

Efficient tree structures for ActiveRecord

Awesome Lists containing this project

README

        

= Arboreal

Arboreal is yet another extension to ActiveRecord to support tree-shaped
data structures.

Arboreal surfaces relationships within the tree like +children+,
+ancestors+, +descendants+, and +siblings+ as scopes, so that additional
filtering/pagination can be performed.

It delegates as much work as possible to the underlying DBMS, making it efficient to:

* fetch all ancestors, descendants or siblings of a node
* move nodes (or subtrees) around
* prevent loops
* rebuild the hierarchy

== Getting started

First, install the "arboreal" gem, and add it to your Rails project's config/environment.rb.

Next, you'll need a migration to add +parent_id+ and +ancestry_string+ columns, and indices:

class MakeThingsArboreal < ActiveRecord::Migration

def self.up
add_column "things", "parent_id", :integer
add_index "things", ["parent_id"]
add_column "things", "ancestry_string", :string
add_index "things", ["ancestry_string"]
end

def self.down
remove_index "things", ["ancestry_string"]
remove_column "things", "ancestry_string"
remove_index "things", ["parent_id"]
remove_column "things", "parent_id"
end

end

Finally, you can declare your model arboreal:

class Thing < ActiveRecord::Base

acts_arboreal

# .. etc etc ...

end

== Navigating the tree

Arboreal adds the basic relationships you'd expect:

* parent
* children

In addition, it provides the following handy methods on each tree-node:

* ancestors
* descendants
* subtree (the node itself, plus descendants)
* siblings
* root (the topmost ancestor)

The first four return scopes, to which additional filtering, ordering or limits may be applied.

At the class-level:

* roots is a named-scope returning all the nodes without parents
* rebuild_ancestry rebuilds the ancestry cache, as described below

== Rebuilding the ancestry cache

Internally, Arboreal uses the +ancestry_string+ column to cache the path down the tree to each node (or more correctly, it's parent. This technique - a variant of "path enumeration" or "materialized paths" - allows efficient retrieval of both ancestors and descendants.

It's conceivable that the computed ancestry-string values may get out of whack, particularly if changes are made directly to the database. If you suspect corruption, you can restore sanity using rebuild_ancestry, e.g

Thing.rebuild_ancestry

The ancestry rebuild is implemented in SQL to leverage the underlying DBMS, and so is pretty efficient, even on large trees.