https://github.com/pfrazee/crdt_talk
- Host: GitHub
- URL: https://github.com/pfrazee/crdt_talk
- Owner: pfrazee
- Created: 2014-11-18T00:28:53.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2015-01-22T15:21:42.000Z (almost 10 years ago)
- Last Synced: 2024-10-17T17:35:09.301Z (2 months ago)
- Language: JavaScript
- Size: 148 KB
- Stars: 8
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.txt
README
CRDTs 2 part 2
-
(of n)
-
you're building a dist sys
either because of scale or arch...
nodes are not able to sync in real time
-
nodes act on their own
then reconcile state afterwards
this is called "eventual consistency"
-
the sign of EC: are changes allowed...
without the nodes talking to each other first?
"concurrent changes"
if so, you need...
-
"convergence"
nodes must come to the same state
regardless of the order they receive information
-
so
two nodes make concurrent changes
how do we reconcile them?
-
what if...
we made data structures
that could NEVER CONFLICT
-
like idempotence
idempotent values can't have conflicting changes
because you can only change them once
and, if it's already changed, it stays changed
that's why we like them
-
so can we get more stuff like idempotence?
-
yep: CRDTs
-
let's explore some properties
-
monotonic
in math: a function on an ordered set which preserves order
monotonically increasing:
if x <= y
then f(x) <= f(y)
-
example graphs
*please refer to whiteboard*
-
a monotonic data type
type's values are ordered
all of the type's methods are monotonic
- an int where `++` is the only op
- an idempotent value (aka a bool where `set(true)` is the only op)
- the levels in super mario
-
join-semilattice
a partially ordered set
has a join operation called LUB
"least upper bound"
join( x, y ) =
- the least element of the semilattice
- which is >= both x and y
-
example
given A,B,C,D, where:
A < B
A < C
B < D
C < D
(use whiteboard)
join( A, B ) = B
join( A, C ) = C
join( B, C ) = D
-
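a minimal sketch of that example in JavaScript, assuming the order is stored as "directly above" edges. the names `above`, `upperSet` and `leq` are made up for illustration; only `join` and the A,B,C,D diamond come from the slide.

```javascript
// the partial order from the slide: for each element, what sits directly above it
const above = { A: ['B', 'C'], B: ['D'], C: ['D'], D: [] };

// every element >= x: x itself plus everything reachable upward
function upperSet(x) {
  const out = new Set([x]);
  for (const y of above[x]) for (const z of upperSet(y)) out.add(z);
  return out;
}

// x <= y when y is in the upper set of x
const leq = (x, y) => upperSet(x).has(y);

// least upper bound: the common upper bound that is <= every other common upper bound
function join(x, y) {
  const ux = upperSet(x);
  const common = [...upperSet(y)].filter((z) => ux.has(z));
  return common.find((z) => common.every((w) => leq(z, w)));
}

console.log(join('A', 'B'), join('A', 'C'), join('B', 'C'));
```

the argument order does not matter: `join('B', 'C')` and `join('C', 'B')` give the same element.
-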
CRDTs
monotonic semilattice data types
"conflict free" - always merge deterministically
-
monotonic?
you know that the other nodes are never going to "back-track"
if you have a counter, and the node Bob says...
"4, 5, 6, 7, 4"
you say, "7 then 4? No way, that's old"
"value is still 7"
because it's monotonic
-
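the Bob example, sketched in JavaScript. `receive` is a hypothetical helper name; the idea is just "keep the largest value seen".

```javascript
// a monotonic counter never decreases, so a smaller incoming value must be stale
function receive(current, incoming) {
  return Math.max(current, incoming);
}

// Bob's messages arrive as 4, 5, 6, 7, 4
const seen = [4, 5, 6, 7, 4].reduce((acc, n) => receive(acc, n), 0);
console.log(seen); // 7
```
-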
semilattice?
if you have concurrent changes...
and, thus, divergent state...
you know how to reconcile it
you use the least upper bound
-
this is great for EC...
because it's order independent!
thus intent is preserved
-
let's look at some CRDTs
-
the growset CRDT
a set where state only grows
join( S1, S2 ) = union( S1, S2 )
join( [a,b], [c] ) = [a,b,c]
join( [a,b,c], [c] ) = [a,b,c]
one op: add
add( [a,b], c ) = [a,b,c]
add( [a,b,c], c ) = [a,b,c]
-
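a small JavaScript sketch of the growset. `join` and `add` follow the slide; holding the state in a built-in `Set` is an assumption.

```javascript
// join is set union: it is commutative, associative and idempotent
const join = (s1, s2) => new Set([...s1, ...s2]);

// the only op: adding an element, which can only grow the state
const add = (s, x) => join(s, [x]);

const ab = add(add(new Set(), 'a'), 'b');
console.log([...join(ab, new Set(['c']))]);
console.log([...add(join(ab, new Set(['c'])), 'c')]);
```

both ops return a new `Set` rather than mutating, so replicas can merge in any order without stepping on each other.
-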
the 2P-set CRDT
a set where elements can be removed...
...but never re-added
uses two growsets: elements and tombstones
join( S1, S2 ) = [
join( S1.elements, S2.elements ),
join( S1.tombstones, S2.tombstones )
]
two ops: add, remove
add( S, x ) = add( S.elements, x )
remove( S, x ) = add( S.tombstones, x )
value = S.elements - S.tombstones
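the same thing as a JavaScript sketch. `union` and `empty` are helper names invented here; the state shape (an object with `elements` and `tombstones`) is an assumption standing in for the pair on the slide.

```javascript
const union = (a, b) => new Set([...a, ...b]);
const empty = () => ({ elements: new Set(), tombstones: new Set() });

// join each growset independently
const join = (s1, s2) => ({
  elements: union(s1.elements, s2.elements),
  tombstones: union(s1.tombstones, s2.tombstones),
});

const add = (s, x) => ({ ...s, elements: union(s.elements, [x]) });
const remove = (s, x) => ({ ...s, tombstones: union(s.tombstones, [x]) });

// value = elements minus tombstones
const value = (s) => [...s.elements].filter((x) => !s.tombstones.has(x));
```

once `x` is in the tombstones, `add( S, x )` has no visible effect, which is the "never re-added" part.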
-
the observed-remove set CRDT
a set where elements can be freely added and removed
each element is tagged with a unique id
join(S1, S2) = [
join( S1.element-tags, S2.element-tags ),
join( S1.tombstone-tags, S2.tombstone-tags )
]
two ops: add, remove
add( S, x ) = add( S.element-tags, x ++ gen_tag() )
remove( S, x ) = add( S.tombstone-tags, findTagsOf( x ) )
value = unique(S.element-tags - S.tombstone-tags)
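a JavaScript sketch of the OR-set, under a few assumptions: element-tags are kept as a `Map` from tag to element, tombstone-tags as a `Set` of tags, and `genTag` (node id plus a local counter) is a made-up stand-in for `gen_tag()`.

```javascript
let nextTag = 0;
// hypothetical tag generator: node id + counter, assumed globally unique
const genTag = (nodeId) => `${nodeId}:${nextTag++}`;

const empty = () => ({ elementTags: new Map(), tombstoneTags: new Set() });

const join = (s1, s2) => ({
  elementTags: new Map([...s1.elementTags, ...s2.elementTags]),
  tombstoneTags: new Set([...s1.tombstoneTags, ...s2.tombstoneTags]),
});

// every add gets a fresh tag, so re-adding after a remove works
function add(s, x, nodeId) {
  const elementTags = new Map(s.elementTags).set(genTag(nodeId), x);
  return { ...s, elementTags };
}

// tombstone only the tags this replica has observed for x
function remove(s, x) {
  const tombstoneTags = new Set(s.tombstoneTags);
  for (const [tag, el] of s.elementTags) if (el === x) tombstoneTags.add(tag);
  return { ...s, tombstoneTags };
}

// elements that still have at least one live tag, de-duplicated
function value(s) {
  const live = [...s.elementTags].filter(([tag]) => !s.tombstoneTags.has(tag));
  return [...new Set(live.map(([, el]) => el))];
}
```

a remove only covers tags it has seen, so an add made concurrently on another node survives the merge.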
-
what else is there?
-
register CRDT
"last writer wins (LWW)"
not very fun
you use a clock
(eg lamport clock)
and take the last write
bob sets X to "foo" at seq:5
alice sets X to "bar" at seq:6
sync...
X = "bar"
-
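a sketch of the LWW join in JavaScript. the `{ value, seq, node }` shape is an assumption, and so is breaking ties on the node name when two writes carry the same seq (the slide does not say what to do there).

```javascript
// pick the write with the higher sequence number
function join(r1, r2) {
  if (r1.seq !== r2.seq) return r1.seq > r2.seq ? r1 : r2;
  // assumed tie-breaker so that both sides still pick the same winner
  return r1.node > r2.node ? r1 : r2;
}

const bob = { value: 'foo', seq: 5, node: 'bob' };
const alice = { value: 'bar', seq: 6, node: 'alice' };
console.log(join(bob, alice).value); // "bar"
```
-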
register CRDT
"multi value (MV)"
use a vector clock
when neither vector-stamp dominates
use both values
bob sets X to "foo" at [5,6]
alice sets X to "bar" at [6,5]
sync...
x = "foo" & "bar"
(couch db)
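a JavaScript sketch of the multi-value join. assumptions: the register state is a list of `{ value, stamp }` writes, stamps are equal-length arrays, and two writes with the same stamp are the same write. `dominates` is a helper name made up here.

```javascript
// true when vector stamp a is >= b in every dimension
const dominates = (a, b) => a.every((n, i) => n >= b[i]);

function join(r1, r2) {
  // de-duplicate by stamp so that join( R, R ) = R
  const byStamp = new Map([...r1, ...r2].map((w) => [w.stamp.join(','), w]));
  const all = [...byStamp.values()];
  // keep only the writes that no other write dominates
  return all.filter((x) => !all.some((y) => y !== x && dominates(y.stamp, x.stamp)));
}

const bob = [{ value: 'foo', stamp: [5, 6] }];
const alice = [{ value: 'bar', stamp: [6, 5] }];
console.log(join(bob, alice).map((w) => w.value));
```

a later write stamped `[7,7]` dominates both siblings, so joining it in collapses the register back to one value.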
-
map CRDT
kind of like sets + registers...
each element in the set is a tuple
(key, value)
if you have two valid values for a given key, either do
- last-writer wins, or
- multivalue
-
counter CRDT
lots of options, here's one:
a set of counters
one for each node
value() = sum( set )
-
counter CRDT w/decrementing
two counter CRDTs
"inc" counter and "dec" counter
value() = value( inc ) - value( dec )
-
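both counters, sketched in JavaScript. the state is assumed to be a plain object of `{ nodeId: count }`. the per-node max in `join` is not on the slides; it is the usual merge for this kind of counter. `pnJoin` and `pnValue` are names invented here.

```javascript
// each node only ever increments its own entry
const inc = (c, node, n = 1) => ({ ...c, [node]: (c[node] || 0) + n });

// join takes the per-node maximum
function join(c1, c2) {
  const out = { ...c1 };
  for (const [node, n] of Object.entries(c2)) out[node] = Math.max(out[node] || 0, n);
  return out;
}

// value() = sum( set )
const value = (c) => Object.values(c).reduce((a, b) => a + b, 0);

// with decrementing: one counter for incs, one for decs
const pnJoin = (p1, p2) => ({ inc: join(p1.inc, p2.inc), dec: join(p1.dec, p2.dec) });
const pnValue = (p) => value(p.inc) - value(p.dec);
```

taking the max per node (rather than adding) is what keeps a repeated merge from double-counting.
-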
let's talk overhead
-
so far, we've been shipping state
and the state can get fat
think of the OR-Set
...all those tombstones!
-
alternative: state-delta shipping
rather than ship the whole state...
...ship state deltas which also merge
and ship the whole state occasionally too
-
example: delta-shipping a counter
deltas only ship the node's own dimension in the vector
a,b,c,d
full state: [5,6,5,2]
delta: b=6
delta: b=7
delta: b=8
delta: b=9
full state: [5,9,5,2]
-
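the example above as a JavaScript sketch, assuming a delta is just a partial state (`{ b: 7 }`) merged with the same per-node max as a full state.

```javascript
// works for full states and deltas alike: a delta is a state with fewer entries
function join(state, delta) {
  const out = { ...state };
  for (const [node, n] of Object.entries(delta)) out[node] = Math.max(out[node] || 0, n);
  return out;
}

const full = { a: 5, b: 6, c: 5, d: 2 };
// deltas from node b, arriving out of order and with a duplicate
const deltas = [{ b: 8 }, { b: 6 }, { b: 9 }, { b: 7 }, { b: 9 }];
const merged = deltas.reduce((acc, d) => join(acc, d), full);
console.log(merged);
```

because deltas merge like states, losing or repeating one is harmless as long as a later delta or full state arrives.
-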
so that's state-based
what else is there?
-
operation-based CRDTs
rather than ship state, ship the ops
requirement:
- exactly-once delivery
- guaranteed causal delivery
-
an op-based counter
every node emits "inc" and "dec"
and receiving nodes just... do it
because we have exactly-once delivery
this is safe
an inc or dec will never get double-counted
-
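a JavaScript sketch of the receiving side. the slides assume the transport gives exactly-once delivery; here that is approximated by giving every op a unique `id` and ignoring repeats, which is an assumption of this sketch, as are the names `makeCounter` and `receive`.

```javascript
function makeCounter() {
  const seen = new Set(); // ids of ops already applied
  let total = 0;
  return {
    receive(op) {
      if (seen.has(op.id)) return; // duplicate delivery: ignore it
      seen.add(op.id);
      total += op.type === 'inc' ? 1 : -1;
    },
    value: () => total,
  };
}

const counter = makeCounter();
counter.receive({ id: 'bob:1', type: 'inc' });
counter.receive({ id: 'bob:1', type: 'inc' }); // redelivered, not counted again
counter.receive({ id: 'alice:1', type: 'dec' });
console.log(counter.value()); // 0
```

inc and dec commute, so the order of arrival does not change the result.
-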
an op-based OR set
rather than ship entire element & tombstone set
we ship add(element, tag) and remove(tags)
commutative:
add( x )+add( y ) = add( y )+add( x )
rem( x )+rem( y ) = rem( y )+rem( x )
rem( x )+rem( x ) = rem( x )
add( x )+rem( x ) = noop()
rem( x )+add( x ) = noop()
add( x )+add( x ) = --can't happen--
(remember, x and y are actually globally-unique tags)
-
exactly once guarantees idempotence
but op-based also requires "guaranteed causal delivery"
what's that about?
-
causal ordering
guarantees:
an operation always arrives
after any operations it causally depends on
eg:
removes show up after adds
-
causal ordering
is weaker than total ordering
but achievable in an EC system
-
how is it done?
sender's responsibility
if the wire guarantees order
just make sure you emit dependencies first
-
how is it done?
receiver's responsibility
"Tagged Reliable Causal Broadcast"
each message has a unique tag
each message lists its dependencies' tags
on receive, if dependencies are not met
the message is buffered locally
-
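a JavaScript sketch of the receiver-side buffering. the message shape `{ tag, deps }` and the names `makeReceiver` and `deliver` are assumptions for illustration, not a real library API.

```javascript
function makeReceiver(deliver) {
  const delivered = new Set(); // tags of messages already handed to the app
  let buffer = [];             // messages still waiting on dependencies
  const ready = (m) => m.deps.every((t) => delivered.has(t));

  return function receive(msg) {
    if (delivered.has(msg.tag)) return; // already delivered: drop the duplicate
    buffer.push(msg);
    // keep delivering until nothing in the buffer is ready
    let progress = true;
    while (progress) {
      progress = false;
      for (const m of buffer) {
        if (ready(m)) {
          delivered.add(m.tag);
          deliver(m);
          buffer = buffer.filter((b) => b.tag !== m.tag);
          progress = true;
          break; // buffer changed, so restart the scan
        }
      }
    }
  };
}

const log = [];
const receive = makeReceiver((m) => log.push(m.tag));
receive({ tag: 'remove-1', deps: ['add-1'] }); // buffered: add-1 not seen yet
receive({ tag: 'add-1', deps: [] });           // delivers add-1, then remove-1
console.log(log);
```

the buffer grows while dependencies are missing, so a real system would also need a way to re-request lost messages.
-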
now we can reduce our tombstone-set significantly
add(tagA, 'a') // tagA added to elements
remove(tagA) // tagA removed from elements
remove(tagB) // tagB added to tombstones
add(tagB, 'a') // tagB removed from tombstones
add(tagC, 'a') // tagC added to elements
remove(tagC) // tagC removed from elements
remove(tagC) // tagC added to tombstones (edge-case accumulation)
-
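the trace above can be sketched in JavaScript like this. assumptions: elements live in a `Map` from tag to element, tombstones in a `Set` of tags, and the names `makeSet`, `value` and `tombstoneCount` are made up for the example.

```javascript
function makeSet() {
  const elements = new Map();   // tag -> element
  const tombstones = new Set(); // removed tags whose add has not been applied
  return {
    add(tag, element) {
      // a remove got here first: the pair cancels out, nothing is stored
      if (tombstones.delete(tag)) return;
      elements.set(tag, element);
    },
    remove(tag) {
      // normal case: drop the element; otherwise remember the tag
      if (!elements.delete(tag)) tombstones.add(tag);
    },
    value: () => [...new Set(elements.values())],
    tombstoneCount: () => tombstones.size,
  };
}

const s = makeSet();
s.add('tagA', 'a');
s.remove('tagA');   // tagA gone, no tombstone kept
s.remove('tagB');   // tombstone for tagB
s.add('tagB', 'a'); // tombstone for tagB cleared
console.log(s.value(), s.tombstoneCount());
```

tombstones now only exist between a remove and its matching add, except for the repeated-remove edge case, where one lingers.
-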
great! getting pretty efficient
what if we want total order?
-
ok, so let's say we:
keep doing ops-shipping
keep doing causal ordering
and create a totally-ordered ID-space which is infinitely divisible?
-
logoot: an ordered list
uses totally-ordered positions
in a continuous space
meaning...
for two positions, A and B,
we can always find a C,
which is between them
A < C < B
-
how?
lists of integers
if
A = 0
B = 1
then we choose
C = 0.5
-
and we just keep doing that
between( 0.5, 1 ) = 0.5.5
0.5 < 0.5.5 < 0.6
0.6 < 0.6.5 < 0.7
0 < 0.5
0.5 < 0.5.5
0.5.5 < 0.5.5.5
0.5.5.5 < 0.6
-
there are infinite implicit zeroes
0.5 == 0.5.0.0.0.0.0.0.0.0.0.....
-
so
0.5 < 0.5.0.0.0.0.0.0.0.0.1
-
how to insert?
insertBetween( A, B, value ):
1. generate a position between A and B
2. append the node id to avoid conflicts
3. write value to list at that generated position
-
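the three steps, sketched in JavaScript. this is a simplified illustration, not the published logoot algorithm: a position is assumed to be `{ digits, node }`, the node id is compared only as a tie-breaker, and `between` always picks a fixed middle (5) where a real implementation would choose more carefully. it also gives up when A and B have the same digits.

```javascript
// compare two digit lists, treating missing entries as implicit zeroes
function compareDigits(a, b) {
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const d = (a[i] || 0) - (b[i] || 0);
    if (d !== 0) return d;
  }
  return 0;
}

// total order: digits first, node id breaks ties
function compare(p, q) {
  return compareDigits(p.digits, q.digits) ||
    (p.node < q.node ? -1 : p.node > q.node ? 1 : 0);
}

// 1. generate digits strictly between a and b
function between(a, b) {
  const len = Math.max(a.length, b.length);
  let i = 0;
  while (i < len && (a[i] || 0) === (b[i] || 0)) i++;
  const ai = a[i] || 0;
  const bi = b[i] || 0;
  if (ai >= bi) throw new Error('between() needs a < b');
  const prefix = Array.from({ length: i }, (_, j) => a[j] || 0);
  // room at this level: take a middle integer
  if (bi - ai > 1) return [...prefix, ai + Math.floor((bi - ai) / 2)];
  // neighbours at this level: go one level deeper under a
  const base = a.length > i ? a : [...prefix, ai];
  return [...base, 5];
}

function insertBetween(list, A, B, value, node) {
  // 2. attach the node id so concurrent inserts never collide
  const pos = { digits: between(A.digits, B.digits), node };
  // 3. write the value at that position, keeping the list sorted
  list.push({ pos, value });
  list.sort((x, y) => compare(x.pos, y.pos));
  return pos;
}
```

with `[0]` for 0 and `[1]` for 1, `between([0], [1])` gives `[0, 5]`, the 0.5 from the earlier slide.
-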
now you have total order
-
and that's all for CRDTs
-
(for now)
-
questions?