Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bretthoerner/timak
Timelines (activity streams) backed by Riak
- Host: GitHub
- URL: https://github.com/bretthoerner/timak
- Owner: bretthoerner
- License: other
- Created: 2011-08-08T15:36:14.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2011-09-14T20:37:44.000Z (about 13 years ago)
- Last Synced: 2024-10-13T21:43:54.767Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 116 KB
- Stars: 55
- Watchers: 5
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
=====
timak
=====

timak is a Python library for storing timelines (activity streams) in Riak. It is very alpha and rough around the edges.
It is loosely based on my understanding of Yammer's `Streamie`_.
Example
-------

Timelines are unique sets of objects (unique by the ID you provide), ordered by a datetime (that you also provide). They are bounded, so items fall off the end when a (user-defined) capacity is reached.
>>> from datetime import datetime
>>> import riak
>>> from timak.timelines import Timeline
>>> conn = riak.RiakClient()
>>> tl = Timeline(connection=conn, max_items=3)
>>> # tl.add("key", "unique_id", "score")
>>> tl.add("brett:tweets", 1, datetime(2011, 1, 1))
[1]
>>> tl.add("brett:tweets", 2, datetime(2011, 1, 2))
[2, 1]
>>> tl.add("brett:tweets", 3, datetime(2011, 1, 3))
[3, 2, 1]
>>> tl.add("brett:tweets", 4, datetime(2011, 1, 4))
[4, 3, 2]
>>> tl.delete("brett:tweets", 2, datetime(2011, 1, 2))
[4, 3]

If you provide a ``datetime.datetime`` value as the score, Timak will automatically convert it to a sortable score value.
As you can see, the default order is descending by the date you provide, and object IDs are returned by default. You can also provide an ``obj_data`` argument (it must be JSON-serializable), which will be returned instead.
>>> tl.add("brett:tweets", 5, datetime(2011, 1, 5), obj_data={'body': 'Hello world, this is my first tweet'})
[{'body': 'Hello world, this is my first tweet'}, 4, 3]

Why?
----

I needed *highly available*, *linearly scalable* timelines where readers and writers *don't block* one another. Because Riak is a Dynamo-based system, multiple writers can update a single value and I can merge the conflicts on a later read. I can also add a machine to the cluster for more throughput, and since it's simply fetching denormalized timelines by key it should be incredibly performant.
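To make that concrete, here is a minimal sketch of sibling resolution. It assumes each conflicting Riak value is a newest-first list of ``(score, obj_id)`` pairs and that a merge keeps the newest score per ID before re-truncating; this illustrates the idea, not timak's actual merge code::

    def merge_siblings(siblings, max_items=3):
        # Keep the newest score seen for each object ID across all
        # conflicting writes.
        best = {}
        for timeline in siblings:
            for score, obj_id in timeline:
                if obj_id not in best or score > best[obj_id]:
                    best[obj_id] = score
        # Re-sort newest-first and enforce the capacity bound.
        merged = sorted(((s, o) for o, s in best.items()), reverse=True)
        return merged[:max_items]

    a = [("2011-01-03", 3), ("2011-01-02", 2)]  # one writer's view
    b = [("2011-01-04", 4), ("2011-01-02", 2)]  # a concurrent writer's view
    merge_siblings([a, b])
    # [('2011-01-04', 4), ('2011-01-03', 3), ('2011-01-02', 2)]

A real merge also has to remember deletions (tombstones) so that an item removed by one writer isn't resurrected by a sibling that still contains it.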
So what? I could write this in...
---------------------------------

PostgreSQL or MySQL
```````````````````

This would be a very simple table in an RDBMS. It could even be boundless (though without some PL/SQL hackery, large ``OFFSET`` values are very expensive). You'd be hitting large indexes instead of fetching values directly by key. The biggest problem is that it all has to fit on a single system, unless you manually shard the data (and re-shard if you ever outgrow that size). Plus you'd have to deal with availability using read slaves and failover.
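For comparison, a minimal sketch of that naive table using the stdlib ``sqlite3`` module (the schema and names are illustrative, not anything timak uses)::

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE timeline (
            key    TEXT NOT NULL,
            obj_id INTEGER NOT NULL,
            score  TEXT NOT NULL,  -- ISO-8601 datetimes sort correctly as text
            PRIMARY KEY (key, obj_id)
        )
    """)
    conn.execute("CREATE INDEX timeline_key_score ON timeline (key, score DESC)")
    conn.execute("INSERT INTO timeline VALUES (?, ?, ?)",
                 ("brett:tweets", 1, "2011-01-01T00:00:00"))

    # Deep pagination is where this hurts: the database walks the index
    # past every skipped row before it reaches the page you asked for.
    page, per_page = 100, 20
    rows = conn.execute(
        "SELECT obj_id FROM timeline WHERE key = ?"
        " ORDER BY score DESC LIMIT ? OFFSET ?",
        ("brett:tweets", per_page, page * per_page)).fetchall()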
MongoDB
```````

The only possible difference I see from the RDBMSs above is that you could use Mongo's "auto-sharding." If that's your thing, and you trust it, then I wish you the best of luck. You may want to `read this`_.
Redis
`````

You can fake timelines in Redis using a list or a sorted set. As with an RDBMS, you have to handle all of the sharding yourself, re-shard on growth, and use slaves and failover for availability. On top of that, and even more critical for my use case: all of your timelines would have to fit in RAM. If you have this problem and that kind of money, please send me some.
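For reference, the sorted-set version is only a few lines, assuming the ``redis-py`` client (the key name and bound are illustrative)::

    from datetime import datetime, timezone

    import redis

    r = redis.Redis()
    MAX_ITEMS = 3

    def add(key, obj_id, when):
        # Score by timestamp, then drop everything below the newest
        # MAX_ITEMS; ranks run lowest-score-first, so the negative index
        # keeps the top of the set.
        r.zadd(key, {obj_id: when.timestamp()})
        r.zremrangebyrank(key, 0, -(MAX_ITEMS + 1))

    add("brett:tweets", "1", datetime(2011, 1, 1, tzinfo=timezone.utc))
    r.zrevrange("brett:tweets", 0, -1)  # newest first

Note the two commands aren't atomic as written; in practice you'd wrap them in a ``MULTI``/``EXEC`` pipeline.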
Cassandra
`````````

Probably another great fit. You could even store much longer timelines, though I'm not sure what the equivalent of a ``SELECT`` with ``OFFSET`` costs on the columns in a Cassandra row.
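With today's CQL and the DataStax ``cassandra-driver`` that layout might look like the sketch below (keyspace, table, and names are illustrative)::

    from cassandra.cluster import Cluster

    session = Cluster(["127.0.0.1"]).connect()
    session.execute("CREATE KEYSPACE IF NOT EXISTS timelines WITH replication"
                    " = {'class': 'SimpleStrategy', 'replication_factor': 1}")
    session.execute("""
        CREATE TABLE IF NOT EXISTS timelines.timeline (
            key    text,
            score  timestamp,
            obj_id int,
            PRIMARY KEY (key, score)
        ) WITH CLUSTERING ORDER BY (score DESC)
    """)

    # Reading the head of a timeline is a cheap clustered slice...
    rows = session.execute(
        "SELECT obj_id FROM timelines.timeline WHERE key = %s LIMIT 20",
        ("brett:tweets",))
    # ...but there is no OFFSET: deep pages mean paging from the last
    # score you saw, i.e. cursors rather than random access.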
TODO
----

1. Add a better API with cursors (last seen ``obj_date``?) for pagination; a rough sketch follows this list.
2. Built-in Django support for updates on ``post_save`` and ``post_delete``.
3. Compress values.
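A hypothetical shape for item 1, where the cursor is simply the last score the client saw (``get_page`` and its arguments are invented for illustration and are not timak API)::

    def get_page(items, per_page=20, before=None):
        # `items` is a newest-first list of (score, obj_id) pairs, as
        # stored in the timeline value.
        if before is not None:
            items = [pair for pair in items if pair[0] < before]
        page = items[:per_page]
        next_cursor = page[-1][0] if len(page) == per_page else None
        return page, next_cursor

    timeline = [("2011-01-04", 4), ("2011-01-03", 3), ("2011-01-02", 2)]
    first, cursor = get_page(timeline, per_page=2)
    # first == [('2011-01-04', 4), ('2011-01-03', 3)], cursor == '2011-01-03'
    rest, _ = get_page(timeline, per_page=2, before=cursor)
    # rest == [('2011-01-02', 2)]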