Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/atl/thrifty-p2p

A very simple python peer-to-peer implementation using consistent hashing (and an example distributed key-value store layered atop it) using the Thrift protocol and its basic RPC capabilities.
https://github.com/atl/thrifty-p2p

Last synced: about 1 month ago
JSON representation

A very simple python peer-to-peer implementation using consistent hashing (and an example distributed key-value store layered atop it) using the Thrift protocol and its basic RPC capabilities.

Awesome Lists containing this project

README

        

# thrifty-p2p #

A very simple peer-to-peer implementation using consistent hashing, the
[Thrift][] protocol and its basic RPC capabilities. It is not expected to be
generically usable by arbitrary projects: thrifty-p2p implements a flat
network, which is justifiable for the author's purposes (distributing
processing amongst dozens of data-rich nodes), but will encounter trouble
scaling.

[Thrift]: http://incubator.apache.org/thrift/

## Basic usage ##

python location.py [-h peer_node] [-p port_num] [--help]

Initiates and/or joins a simple peer-to-peer network. Default port\_num
is 9900. Absent a peer\_node (which is the peer initially contacted for
joining the network), the command initiates a network.

Example usage, in different terminal windows:

python location.py

python location.py --host localhost:9900 --port 9901

python location.py -h localhost:9901 -p 9902

... etc. ...

## Dependencies & Requirements ##

The basic server uses consistent hashing via [hash_ring.py][], developed by
[Amir Salihefendic][]. For better and for worse, I modified the code to
allow for easier node set manipulations (as if it were a list) and
different node resolution behavior so that the internal keys are more
directly related to md5 hashes.

The library requires the [Apache Thrift][] [Python libraries][] to be
installed, but -- because the thrift compiler has been run already -- does
not currently require the Thrift compiler or any other language libraries.

[hash_ring.py]: http://pypi.python.org/pypi/hash_ring/
[Amir Salihefendic]: http://amix.dk/blog/viewEntry/19367
[Apache Thrift]: http://incubator.apache.org/thrift/
[python libraries]: http://pypi.python.org/pypi/thrift/1.0

## Design priorities ##

I'm using [Thrift][] as much for its simple RPC underpinnings as the message
format. I wanted to consistently distribute some processing and bother
as little as possible with node lookup. This is the simplest thing that
I came up with that sort-of works: a flat peer-to-peer network with each
node keeping up with the state of all the nodes as well as it can. This
necessarily limits the ring's ability to scale beyond dozens of nodes.

# diststore #

Diststore is an example of an application that can be layered atop the
thrifty-p2p location service. It's a toy version of a distributed key-value
store of the sort that all the cool kids are doing these days. The underlying
peer-to-peer implementation makes no attempt to optimize for large scales, so
don't get too excited about deploying this in production. Rather, it's
provided for instruction and testing.

Although diststore.thrift uses locator.thrift for a lot of its interfaces,
the implementation overloads some of the base methods as an illustration
of how this simple library might be used in other projects.

Example usage, run this in a couple terminal windows:

python storeserver.py

Note how the second and following servers auto-discover the other servers
you've initiated, join them as peers on the network, and find an open port.
Then in another window start populating the store:

python storeput.py a apple
python storeput.py b banana
python storeput.py c crayon
python storeput.py d dinosaur
python storeprimer.py

You can then query the store:

python storeget.py c
python storetest.py

or start a new server, and see how a minimum of the existing keys
redistribute themselves and an example of consistent hashing:

python storeserver.py

or even stop a server with ctrl-C or a `kill -INT` signal and see
how the server hands its items off gracefully. However, the system
is not robust to any more aggressive termination: keys will go missing.

What's happening here? Because every node has a full model of the network, it
knows which node to forward a `get()` request to, or where to hand off its
items when it leaves the network. Key methods here are overridden from
location.LocatorHandler: `add()` and `cleanup()`.

## Programming usage ##

There are two primary thrift interfaces exposed in locator.thrift:
`locator.Base` and `locator.Locator`. `Locator.Base` supports primal methods
such as `ping()`, `service\_type()`, `service\_types()`, and `die()`, that
don't rely on any of the locator services. `Locator.Locator` implements basic
node location and management: propagating and dropping services on the
network. When creating your own thrift definition, your service should import
locator.thrift and extend one of the two service classes.

When creating a thrift handler implementation, the python class should
inherit from location.BaseHandler or location.LocatorHandler and the
Iface stub from your own class generated from your thrift file.