{"id":21526986,"url":"https://github.com/lambdaclass/riak_core_tutorial","last_synced_at":"2026-03-04T15:32:17.505Z","repository":{"id":37580339,"uuid":"114264805","full_name":"lambdaclass/riak_core_tutorial","owner":"lambdaclass","description":"An up to date riak_core tutorial, using basho's riak_core, Erlang/OTP 23-24-25 and rebar3.","archived":false,"fork":false,"pushed_at":"2022-06-14T19:40:20.000Z","size":2994,"stargazers_count":150,"open_issues_count":5,"forks_count":10,"subscribers_count":13,"default_branch":"master","last_synced_at":"2025-12-20T18:58:43.643Z","etag":null,"topics":["distributed","distributed-systems","elixir","erlang","functional-programming","riak","riak-kv"],"latest_commit_sha":null,"homepage":"","language":"Erlang","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lambdaclass.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-14T15:13:14.000Z","updated_at":"2025-01-16T14:17:05.000Z","dependencies_parsed_at":"2022-08-29T09:42:09.320Z","dependency_job_id":null,"html_url":"https://github.com/lambdaclass/riak_core_tutorial","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lambdaclass/riak_core_tutorial","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lambdaclass%2Friak_core_tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lambdaclass%2Friak_core_tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lambdaclass%2Friak_core_tutorial/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lambdaclass%2Friak_c
ore_tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lambdaclass","download_url":"https://codeload.github.com/lambdaclass/riak_core_tutorial/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lambdaclass%2Friak_core_tutorial/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30084971,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-04T13:22:36.021Z","status":"ssl_error","status_checked_at":"2026-03-04T13:20:45.750Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed","distributed-systems","elixir","erlang","functional-programming","riak","riak-kv"],"created_at":"2024-11-24T01:47:25.335Z","updated_at":"2026-03-04T15:32:17.478Z","avatar_url":"https://github.com/lambdaclass.png","language":"Erlang","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Riak Core Tutorial [![Build Status](https://travis-ci.org/lambdaclass/riak_core_tutorial.svg?branch=master)](https://travis-ci.org/lambdaclass/riak_core_tutorial)\n\nThis repository contains an example riak_core application using the most\nrecent version of [riak_core](https://hex.pm/packages/riak_core)\nand running on Erlang/OTP 25 with rebar3.\n\nBelow is a detailed [tutorial](/#riak-core-tutorial) that explains the step-by-step process to\nproduce the same code base from 
scratch.\n\nThe project and tutorial structure were largely based on the\n[Little Riak Core Book](https://marianoguerra.github.io/little-riak-core-book/)\nand the\n[Create a riak_core application in Elixir](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-1-41354c1f26c3)\nseries.\n\n## Contents\n\n  * [Example application usage](#example-application-usage)\n  * [Riak Core Tutorial](#riak-core-tutorial)\n     * [When to use Riak Core](#when-to-use-riak-core)\n     * [About this tutorial](#about-this-tutorial)\n     * [Useful links](#useful-links)\n     * [0. Riak Core overview](#0-riak-core-overview)\n     * [1. Setup](#1-setup)\n     * [2. The vnode](#2-the-vnode)\n        * [The riak_vnode behavior](#the-riak_vnode-behavior)\n        * [Application and supervisor setup](#application-and-supervisor-setup)\n        * [Sending commands to the vnode](#sending-commands-to-the-vnode)\n     * [3. Setting up the cluster](#3-setting-up-the-cluster)\n     * [4. Building a distributed Key/Value store](#4-building-a-distributed-keyvalue-store)\n     * [5. Testing](#5-testing)\n        * [Test implementations](#test-implementations)\n        * [ct_slave magic](#ct_slave-magic)\n     * [6. Coverage commands](#6-coverage-commands)\n        * [Handle coverage commands in the vnode](#handle-coverage-commands-in-the-vnode)\n        * [The coverage FSM](#the-coverage-fsm)\n        * [Coverage FSM Supervision](#coverage-fsm-supervision)\n        * [Putting it all together](#putting-it-all-together)\n        * [Coverage test](#coverage-test)\n     * [7. Redundancy and fault-tolerance](#7-redundancy-and-fault-tolerance)\n     * [8. 
Handoff](#8-handoff)\n        * [When does handoff occur?](#when-does-handoff-occur)\n        * [Vnode implementation](#vnode-implementation)\n        * [Ownership handoff example](#ownership-handoff-example)\n        * [Hinted handoff example](#hinted-handoff-example)\n\n\n## Example application usage\nRun on three separate terminals:\n\n``` shell\nmake dev1\nmake dev2\nmake dev3\n```\n\nJoin the nodes and ping:\n\n``` erlang\n(rc_example2@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n(rc_example2@127.0.0.1)2\u003e rc_example:ping().\n```\n\n``` erlang\n(rc_example3@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n(rc_example3@127.0.0.1)2\u003e rc_example:ping().\n```\n\nCheck the ring status:\n\n``` erlang\n(rc_example3@127.0.0.1)3\u003e rc_example:ring_status().\n```\n\nTry the key/value commands:\n\n``` erlang\n(rc_example1@127.0.0.1)1\u003e rc_example:put(k1, v1).\nok\n(rc_example1@127.0.0.1)2\u003e rc_example:put(k2, v2).\nok\n(rc_example2@127.0.0.1)1\u003e rc_example:get(k2).\nv2\n```\n\n## Riak Core Tutorial\n\n[Riak Core](https://github.com/basho/riak_core) is the distributed\nsystems framework used by the [Riak data store](https://github.com/basho/riak)\nto distribute data and scale. More generally, it can be thought of as\na toolkit for building distributed, scalable, fault-tolerant\napplications. 
In practical terms, Riak Core is an Erlang/OTP application, and most\nof the user-defined work is done in the `riak_core_vnode` behavior.\n\n### When to use Riak Core\n\nWhat makes Riak Core so interesting and useful is that it implements\nthe ideas of the\n[Amazon Dynamo](https://en.wikipedia.org/wiki/Dynamo_(storage_system))\narchitecture and exposes its infrastructure as\na reusable library, making it easy to apply those ideas in any context that\ncan benefit from decentralized distribution of work (including but not\nlimited to data stores).\n\nAs you will see, it provides the basic building blocks for distributed services: consistent hashing, routing, support for sharding and replication, distributed queries, etc. They need not all be used. For example, a game server which handles requests from players could partition players to spread the load, and make sure that each player's requests are always handled on the same vnode to preserve data locality.\n\nA distributed batch job handling system could also use consistent hashing and routing to ensure jobs from the same batch are always handled by the same node, or distribute the jobs across several partitions and then use the distributed map-reduce queries to gather results.\n\n### About this tutorial\n\nWe're using Basho's [riak_core](https://github.com/basho/riak_core)\nfor this tutorial, as it seems to be maintained at the time of writing.\n\nAs part of our interest in this technology and our intention to use it\nin new projects we had to struggle a bit with scarce and outdated\ndocumentation, stale dependencies, etc. The intention is thus to\nprovide a tutorial on how to use Riak Core today, on an Erlang/OTP 25\nand rebar3 project, with minimal dependencies and operational\nsugar. 
You'll notice the structure borrows heavily from\nthe\n[Little Riak Core Book](https://marianoguerra.github.io/little-riak-core-book/)\nand the\n[riak_core in Elixir](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-1-41354c1f26c3)\nseries, which were our main references.\n\n### Useful links\n* [Introducing Riak Core](http://basho.com/posts/business/introducing-riak-core/)\n* [Riak Core Wiki](https://github.com/basho/riak_core/wiki)\n* [Masterless Distributed Computing with Riak Core](http://www.erlang-factory.com/upload/presentations/294/MasterlessDistributedComputingwithRiakCore-RKlophaus.pdf)\n* Ryan Zezeski's \"working\" blog:\n  [First, multinode](https://github.com/rzezeski/try-try-try/tree/master/2011/riak-core-first-multinode) and\n  [The vnode](https://github.com/rzezeski/try-try-try/tree/master/2011/riak-core-the-vnode)\n* [Little Riak Core Book](https://marianoguerra.github.io/little-riak-core-book/)\n* riak_core in Elixir:\n  [Part I](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-1-41354c1f26c3),\n  [Part II](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-2-88bdec73f368),\n  [Part III](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-3-8bac36632be0),\n  [Part IV](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-4-728512ece224) and\n  [Part V](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-5-86cd9d2c6b92)\n* [A Gentle Introduction to Riak Core](http://efcasado.github.io/riak-core_intro)\n* Understanding Riak Core:\n  [Handoff](http://basho.com/posts/technical/understanding-riak_core-handoff/),\n  [Building Handoff](http://basho.com/posts/technical/understanding-riak_core-building-handoff/)\n  and\n  [The visit fun](http://basho.com/posts/technical/understanding-riak_core-visitfun/)\n* [udon_ng](https://github.com/mrallen1/udon_ng) example application.\n\n\n### 0. 
Riak Core overview\n\nRiak Core is based on the Dynamo architecture, meaning it\nscales and distributes the work in a decentralized manner, using\n[Consistent Hashing](https://en.wikipedia.org/wiki/Consistent_hashing).\n\nMost operations are applied to an object which is identified by some\ndata value. In the context of a Key/Value store, for example, the\nidentifier is the Key used in get, put and delete operations.\n\nBefore performing the operation, a hashing function is applied to\nthe key. The key hash will be used to decide which node in the\ncluster should be responsible for executing the operation. The range of\npossible values the key hash can take (the keyspace, usually\ndepicted as a ring) is partitioned into equally sized buckets, which\nare assigned to virtual nodes, also known as vnodes.\n\n![The Ring](ring.png)\n\nThe number of vnodes is fixed at cluster creation and a given hash value will\nalways belong to the same partition (i.e. the same vnode). The vnodes in\nturn are evenly distributed across all available physical nodes.\nNote that, unlike the keyspace partitioning, this distribution isn't\nfixed: the vnode distribution can change if a physical node is added\nto the cluster or goes down.\n\nYou can find a more detailed demonstration of consistent hashing [here](http://blog.carlosgaldino.com/consistent-hashing.html).\n\nThis architecture enables several desirable properties in our system: high\navailability, incremental scalability and decentralization, with a low operational\ncost. You can find a detailed discussion of these properties in the [Dynamo paper](https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf).\n\n### 1. Setup\n\nIn this tutorial we'll build an in-memory, distributed key/value\nstore. 
Let's start by creating a new project with rebar3:\n\n``` shell\n$ rebar3 new app rc_example\n===\u003e Writing rc_example/src/rc_example_app.erl\n===\u003e Writing rc_example/src/rc_example_sup.erl\n===\u003e Writing rc_example/src/rc_example.app.src\n===\u003e Writing rc_example/rebar.config\n===\u003e Writing rc_example/.gitignore\n===\u003e Writing rc_example/LICENSE\n===\u003e Writing rc_example/README.md\n```\n\nNote there's a\n[rebar3 template for riak core](https://github.com/marianoguerra/rebar3_template_riak_core). The\nreason we don't use\nit here is that it's outdated and it generates a lot of operational\ncode that would take a lot of effort to figure out and fix. Instead,\nwe'll start with an empty project and build our way up, although the\ncode generated by the template can serve as a good reference along the\nway.\n\nNext up we'll fill up some of the rebar.config file. We'll add the\nriak_core dependency and lager, which we'll use for logging:\n\n``` erlang\n{erl_opts, [debug_info]}.\n{deps, [{riak_core, {git, \"https://github.com/basho/riak_core\", {branch, \"develop\"}}}]}.\n```\n\nAt this point you should be able to compile your project running `rebar3 compile`.\n\nNow that the project compiles, let's try to build and run a\nrelease. First we need to add lager and riak_core to\n`src/rc_example.app.src`, so they're started along with our\napplication. 
We also need to add compiler and cuttlefish, which is a system riak uses\nfor its internal configuration:\n\n``` erlang\n  {applications,\n   [kernel,\n    stdlib,\n    lager,\n    compiler,\n    cuttlefish,\n    riak_core\n   ]}\n```\n\nThen, add the [release configuration](https://www.rebar3.org/docs/releases) for\ndevelopment in `rebar.config`:\n\n``` erlang\n{relx, [{release, {rc_example, \"0.1.0\"}, [rc_example]},\n        {dev_mode, true},\n        {include_erts, false},\n        {sys_config, \"conf/sys.config\"},\n        {vm_args, \"conf/vm.args\"},\n        {extended_start_script, false}]}.\n```\n\nNote we won't be using the `rebar3 shell` command, which doesn't play\nalong nicely with riak_core; we need a proper release instead (although we can\nuse dev_mode). Thus, we can build and run the release with:\n\n    $ rebar3 release \u0026\u0026 _build/default/rel/rc_example/bin/rc_example\n\nIf you go ahead and run that you'll see an error like `Failed to load\nring file: \"no such file or directory\"`. We need to add some configuration to\n`conf/sys.config` and `conf/vm.args` to properly start riak_core:\n\n``` erlang\n%% vm.args\n-name rc_example@127.0.0.1\n\n%% conf/sys.config\n[{riak_core,\n  [{ring_state_dir, \"./data/ring\"},\n   {web_port, 8098},\n   {handoff_port, 8099},\n   {schema_dirs, [\"lib/rc_example-0.1.0/priv\"]}\n  ]}].\n```\n\n`vm.args` just sets the node name; in `sys.config` we set a data\ndirectory for riak core (`ring_state_dir`) and a\ncouple of ports; we also need to point riak to its schema (by setting\n`schema_dirs`). For this to work we have to copy\n[this file](https://github.com/Kyorai/riak_core/blob/fifo-merge/priv/riak_core.schema)\nto `priv/riak_core.schema`.\n\nAt this point we should have a runnable release (if you see errors,\ntry removing the _build directory):\n\n    $ rebar3 release \u0026\u0026 _build/default/rel/rc_example/bin/rc_example\n\n### 2. 
The vnode\n\nSo far we've got a single Erlang node running a release with riak_core\nin it, but we didn't really write any code to test it. So, before\ngetting into the distributed aspects of riak_core, let's add the\nsimplest possible functionality: a ping command.\n\nRecall from the [overview](/#0-riak-core-overview), that the keyspace (the range of all possible\nresults of hashing a key) is partitioned, and each partition is assigned to a\nvirtual node. The vnode is a worker process which\nhandles incoming requests known as commands and is implemented as\nan OTP behavior. In our initial example\nwe'll create an empty vnode that only knows how to handle a ping\ncommand. A detailed explanation of vnodes can be found [here](https://github.com/rzezeski/try-try-try/tree/master/2011/riak-core-the-vnode).\n\n#### The riak_vnode behavior\n\nLet's add a `src/rc_example_vnode.erl` module that will implement the\n`riak_core_vnode` behavior:\n\n``` erlang\n-module(rc_example_vnode).\n-behaviour(riak_core_vnode).\n\n-export([start_vnode/1,\n         init/1,\n         terminate/2,\n         handle_command/3,\n         is_empty/1,\n         delete/1,\n         handle_handoff_command/3,\n         handoff_starting/2,\n         handoff_cancelled/1,\n         handoff_finished/2,\n         handle_handoff_data/2,\n         encode_handoff_item/2,\n         handle_coverage/4,\n         handle_exit/3]).\n\nstart_vnode(I) -\u003e\n    riak_core_vnode_master:get_vnode_pid(I, ?MODULE).\n\ninit([Partition]) -\u003e\n    {ok, #{partition =\u003e Partition}}.\n\nhandle_command(ping, _Sender, State = #{partition := Partition}) -\u003e\n  log(\"Received ping command ~p\", [Partition], State),\n  {reply, {pong, Partition}, State};\n\nhandle_command(Message, _Sender, State) -\u003e\n    log(\"unhandled_command ~p\", [Message], State),\n    {noreply, State}.\n\n```\n\nFirst off, the `start_vnode` function. 
This is not a riak_vnode\nbehavior callback, but it's nevertheless required for the vnode to work.\nThis function isn't documented, and to my knowledge it will always\nhave the same implementation: `riak_core_vnode_master:get_vnode_pid(I,\n?MODULE).`, so it could probably be handled internally by\nriak\\_core. Since it isn't, we copy paste that line every time ¯\\\\\\_(ツ)_/¯\n\nThe `init` callback initializes the state of the vnode, much like in a\ngen_server. In the code above we initialize a state map that only\ncontains the id of the partition assigned to the vnode.\n\nThe next interesting callback is `handle_command`, which, as you may\nexpect, handles the requests that are assigned to the vnode. The nature\nof the command will be defined by the Message parameter. In the case of\nour simple ping command, we add a new `handle_command` clause that\nlogs the command and replies with `pong` and the partition id of the vnode.\n\nThat's all we need to get started; the rest of the `riak_vnode`\ncallbacks will have dummy implementations. 
We'll get back at those in the\nfollowing sections.\n\n``` erlang\nhandle_handoff_command(_Message, _Sender, State) -\u003e\n    {noreply, State}.\n\nhandoff_starting(_TargetNode, State) -\u003e\n    {true, State}.\n\nhandoff_cancelled(State) -\u003e\n    {ok, State}.\n\nhandoff_finished(_TargetNode, State) -\u003e\n    {ok, State}.\n\nhandle_handoff_data(_Data, State) -\u003e\n    {reply, ok, State}.\n\nencode_handoff_item(_ObjectName, _ObjectValue) -\u003e\n    \u003c\u003c\u003e\u003e.\n\nis_empty(State) -\u003e\n    {true, State}.\n\ndelete(State) -\u003e\n    {ok, State}.\n\nhandle_coverage(_Req, _KeySpaces, _Sender, State) -\u003e\n    {stop, not_implemented, State}.\n\nhandle_exit(_Pid, _Reason, State) -\u003e\n    {noreply, State}.\n\nterminate(_Reason, _State) -\u003e\n    ok.\n\n%% internal\n\n%% same as logger:info but prepends the partition\nlog(String, State) -\u003e\n  log(String, [], State).\n\nlog(String, Args, #{partition := Partition}) -\u003e\n  String2 = \"[~.36B] \" ++ String,\n  Args2 = [Partition | Args],\n  logger:info(String2, Args2),\n  ok.\n```\n\nWe also added a small `log` helper that prepends the partition to all\nthe vnode logs.\n\n#### Application and supervisor setup\nBefore moving on we need to add some boilerplate code for riak_core to\nfind and manage our example vnode. Update the `start` callback in\n`src/rc_example_app.erl`:\n\n``` erlang\nstart(_StartType, _StartArgs) -\u003e\n  ok = riak_core:register([{vnode_module, rc_example_vnode}]),\n  ok = riak_core_node_watcher:service_up(rc_example, self()),\n\n  rc_example_sup:start_link().\n```\n\nThe first line initialises the ring telling riak_core to use\n`rc_example_vnode` as a vnode module. 
The second one starts the\nnode_watcher, a process responsible for tracking the status of nodes within a riak_core cluster.\n\nWe also need to update the supervisor in `src/rc_example_sup.erl`, to\nstart the vnode_master, the process that coordinates the\ndistribution of work within the physical node: it starts all the\nworker vnodes, receives all the requests on that particular physical\nnode and routes each of them to the vnode that should handle it.\n\n``` erlang\ninit([]) -\u003e\n  VMaster = {rc_example_vnode_master,\n             {riak_core_vnode_master, start_link, [rc_example_vnode]},\n             permanent, 5000, worker, [riak_core_vnode_master]},\n\n  {ok, {{one_for_one, 5, 10}, [VMaster]}}.\n```\n\n#### Sending commands to the vnode\n\nSo far we have a vnode that knows how to respond to an incoming ping\nrequest, but we still need an API to be able to send that\nrequest. We'll add a `src/rc_example.erl` file that will contain the\npublic interface to our application:\n\n``` erlang\n-module(rc_example).\n\n-export([ping/0]).\n\nping()-\u003e\n  Key = os:timestamp(),\n  DocIdx = hash_key(Key),\n  PrefList = riak_core_apl:get_apl(DocIdx, 1, rc_example),\n  [IndexNode] = PrefList,\n  Command = ping,\n  riak_core_vnode_master:sync_spawn_command(IndexNode, Command, rc_example_vnode_master).\n\n%% internal\n\nhash_key(Key) -\u003e\n  riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(Key)}).\n```\n\nLet's go over the `ping()` implementation line by line. As stated\nbefore, most operations will be performed over a single object (with the\nexception of aggregation operations, like listing all available keys\nin a key/value store). That object is usually identified by some key,\nwhich will be hashed to decide what partition (that is what vnode at\nwhat physical node) should receive\nthe request. In the case of `ping`, there isn't any actual object\ninvolved, and thus no key, but we make a random one by using\n`os:timestamp()`. 
The hashing algorithm\ndistributes values uniformly over the ring, so each new timestamp\nshould be assigned to a random partition of the ring.\n\nThe `hash_key` helper calls `riak_core_util:chash_key` to produce the\nhash of the key. Note `chash_key` receives a tuple of two binaries;\nthe first element is called the bucket, a value\nriak_core will use to namespace your keys; you can choose to have a\nsingle one per application, or many, according to your needs.\n\nThe result of the hash is passed to `riak_core_apl:get_apl`,\nwhich returns an Active Preference List (APL) for the given key: a\nlist of active vnodes that can handle that request. The number of\nvnodes returned is determined by the second argument of the\nfunction. We can try these functions in the release shell to get a\nbetter sense of how they work:\n\n``` erlang\n(rc_example@127.0.0.1)1\u003e riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(os:timestamp())}).\n\u003c\u003c233,235,224,243,192,63,109,102,255,125,189,206,164,247,\n  117,34,94,199,14,184\u003e\u003e\n(rc_example@127.0.0.1)2\u003e K1 = riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(os:timestamp())}).\n\u003c\u003c190,175,151,200,144,123,229,205,94,16,209,140,252,108,\n  247,20,238,31,6,82\u003e\u003e\n(rc_example@127.0.0.1)3\u003e riak_core_apl:get_apl(K1, 1, rc_example).\n[{1096126227998177188652763624537212264741949407232,\n  'rc_example@127.0.0.1'}]\n(rc_example@127.0.0.1)4\u003e K2 = riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(os:timestamp())}).\n\u003c\u003c113,53,13,80,4,131,62,95,63,164,211,74,145,83,189,77,\n  254,224,190,198\u003e\u003e\n(rc_example@127.0.0.1)5\u003e riak_core_apl:get_apl(K2, 1, rc_example).\n[{662242929415565384811044689824565743281594433536,\n  'rc_example@127.0.0.1'}]\n(rc_example@127.0.0.1)6\u003e riak_core_apl:get_apl(K2, 3, 
rc_example).\n[{662242929415565384811044689824565743281594433536,\n  'rc_example@127.0.0.1'},\n {685078892498860742907977265335757665463718379520,\n  'rc_example@127.0.0.1'},\n {707914855582156101004909840846949587645842325504,\n  'rc_example@127.0.0.1'}]\n```\n\nWe get different partitions every time, always on the same physical\nnode (because we're still running a single one).\n\nThe last line of `ping/0` sends the `ping` command to the selected\nvnode through the `riak_core_vnode_master`. The function used to do so\nis `sync_spawn_command`, which acts a bit like a `gen_server:call` in\nthe sense that it blocks the calling process waiting for the\nresponse. There are other functions to send commands to a vnode:\n`riak_core_vnode_master:command/3` (which works asynchronously like\n`gen_server:cast`) and `riak_core_vnode_master:sync_command/3` (which\nis like `sync_spawn_command` but blocks the vnode_master process).\n\nYou can find more details of the functions used in this\nsection [here](http://efcasado.github.io/riak-core_intro). To wrap up,\nlet's run our `ping` function from the shell:\n\n``` erlang\n(rc_example@127.0.0.1)1\u003e rc_example:ping().\n10:19:00.903 [info] Received ping command 479555224749202520035584085735030365824602865664\n{pong,479555224749202520035584085735030365824602865664}\n(rc_example@127.0.0.1)2\u003e rc_example:ping().\n10:19:01.503 [info] Received ping command 502391187832497878132516661246222288006726811648\n{pong,502391187832497878132516661246222288006726811648}\n```\n\n### 3. Setting up the cluster\n\nAt this point we can execute a simple command, but none of the previous\neffort would make any sense if we keep running stuff on a single\nnode. The whole point of riak_core is to distribute work in a\nfault-tolerant and decentralized manner. In this section we'll update\nour configuration so we can run our\nproject in a three-node Erlang cluster. 
For practical reasons all of\nthe nodes will reside on our local machine, but moving them to separate\nservers should be fairly simple.\n\nIf you review our codebase you'll note that the one spot that\nhas fixed node configuration is `conf/vm.args`, where we set the\nnode name to `rc_example@127.0.0.1`. We want to have `rc_example1`,\n`rc_example2` and `rc_example3` instead. We'll be running our\nnodes on the same machine so we also need to use different ports for\nriak_core in each node (the `web_port` and `handoff_port` tuples in `conf/sys.config`).\n\nSince we'll have an almost identical configuration in all of the nodes,\nwe'll use the overlays feature that rebar3 inherits from relx. You can\nread about\nit [here](https://www.rebar3.org/docs/deployment/releases/#overlays-build-time-configuration),\nalthough it's not strictly necessary for the\npurposes of this tutorial. First we tell rebar3 that `conf/sys.config`\nand `conf/vm.args` should be treated as templates by adding an\n`overlay` tuple in the `relx` configuration:\n\n``` erlang\n{overlay, [{template, \"conf/sys.config\", \"releases/{{release_version}}/sys.config\"},\n           {template, \"conf/vm.args\", \"releases/{{release_version}}/vm.args\"}]}\n```\n\nIf you're having problems with the templates, check your rebar3 version\nand [this GitHub issue](https://github.com/erlang/rebar3/issues/2710).\n\nThe template variables' values will be taken from `overlay_vars` files. 
We will\ndefine three\ndifferent [rebar profiles](https://www.rebar3.org/docs/configuration/profiles/) in\n`rebar.config`, each pointing to a different `overlay_vars` file:\n\n``` erlang\n{profiles, [{dev1, [{relx, [{overlay_vars, \"conf/vars_dev1.config\"}]}]},\n            {dev2, [{relx, [{overlay_vars, \"conf/vars_dev2.config\"}]}]},\n            {dev3, [{relx, [{overlay_vars, \"conf/vars_dev3.config\"}]}]}]}.\n```\n\nNow create\n`vars_dev1.config`, `vars_dev2.config` and `vars_dev3.config` in the\n`conf` directory as follows:\n\n``` erlang\n%% conf/vars_dev1.config\n{node, \"rc_example1@127.0.0.1\"}.\n\n{web_port,          8198}.\n{handoff_port,      8199}.\n\n%% conf/vars_dev2.config\n{node, \"rc_example2@127.0.0.1\"}.\n\n{web_port,          8298}.\n{handoff_port,      8299}.\n\n%% conf/vars_dev3.config\n{node, \"rc_example3@127.0.0.1\"}.\n\n{web_port,          8398}.\n{handoff_port,      8399}.\n```\n\nLastly, update `sys.config` and `vm.args` to refer to template\nvariables instead of concrete values:\n\n``` erlang\n%% conf/sys.config\n[{riak_core,\n  [{ring_state_dir, \"./data/ring\"},\n   {web_port, {{web_port}}},\n   {handoff_port, {{handoff_port}}},\n   {schema_dirs, [\"lib/rc_example-0.1.0/priv\"]}]}].\n\n%% conf/vm.args\n-name {{node}}\n```\n\nTo run the release we need to tell rebar3 which profile to use, for\nexample:\n\n``` shell\nrebar3 as dev1 release \u0026\u0026 _build/dev1/rel/rc_example/bin/rc_example\n```\n\nLet's add a Makefile to easily run any of the nodes:\n\n``` makefile\n.PHONY: dev1 dev2 dev3 clean_data\n\ndev1:\n\t./rebar3 as dev1 release \u0026\u0026 _build/dev1/rel/rc_example/bin/rc_example\n\ndev2:\n\t./rebar3 as dev2 release \u0026\u0026 _build/dev2/rel/rc_example/bin/rc_example\n\ndev3:\n\t./rebar3 as dev3 release \u0026\u0026 _build/dev3/rel/rc_example/bin/rc_example\n\nclean_data:\n\trm -rf _build/dev1/rel/rc_example/data* ; rm -rf _build/dev2/rel/rc_example/data* ; rm -rf _build/dev3/rel/rc_example/data*\n```\n\nCurrently, the 
latest rebar3 release is not built with \nWe also include a `clean_data` target, for the cases when we want to start\nwith a fresh cluster (riak_core persists cluster\ninformation between runs, so you may need to remove it when you make\nchanges to your configuration).\n\nBefore testing our cluster, let's add a function to inspect its status\nin `src/rc_example.erl`:\n\n``` erlang\n-export([ping/0,\n         ring_status/0]).\n\nring_status() -\u003e\n  {ok, Ring} = riak_core_ring_manager:get_my_ring(),\n  riak_core_ring:pretty_print(Ring, [legend]).\n```\n\nNow open three terminals and run one of these commands on each:\n\n``` shell\n$ make dev1\n$ make dev2\n$ make dev3\n```\n\nIf you try the `ring_status` function, you'll see something like:\n\n```erlang\n(rc_example1@127.0.0.1)1\u003e rc_example:ring_status().\n==================================== Nodes ====================================\nNode a: 64 (100.0%) rc_example1@127.0.0.1\n==================================== Ring =====================================\naaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|aaaa|\nok\n```\n\nEach node only knows about itself. We can fix that by making node 2 and\n3 join node 1. 
`riak_core:join` is used for a single node to join a\ncluster:\n\n``` erlang\n%% node 2\n(rc_example2@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n18:46:45.409 [info] 'rc_example2@127.0.0.1' changed from 'joining' to 'valid'\n\n%% node 3\n(rc_example3@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n18:46:47.120 [info] 'rc_example3@127.0.0.1' changed from 'joining' to 'valid'\n```\n\nNow `ring_status()` should show the three nodes with a third of the\nkeyspace each (it may take some seconds for the percentages to settle):\n\n``` erlang\n(rc_example1@127.0.0.1)2\u003e rc_example:ring_status().\n==================================== Nodes ====================================\nNode a: 21 ( 32.8%) rc_example1@127.0.0.1\nNode b: 22 ( 34.4%) rc_example2@127.0.0.1\nNode c: 21 ( 32.8%) rc_example3@127.0.0.1\n==================================== Ring =====================================\nabcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abbc|abba|abba|abba|abba|abba|\nok\n```\n\nIf you call `rc_example:ping()` a couple of times, you should see that\nthe log output (`received ping command`) is printed in a different\nterminal every time, because vnodes from any of the physical nodes can\nreceive the command.\n\n### 4. Building a distributed Key/Value store\n\nNow that we have the project layout and distribution setup we can\nstart working on our in-memory Key/Value store. As you may imagine,\nthis means modifying our worker vnode to support a new set of\ncommands: `put`, `get` and `delete`. 
Here are the relevant parts:\n\n``` erlang\ninit([Partition]) -\u003e\n  {ok, #{partition =\u003e Partition, data =\u003e #{}}}.\n\nhandle_command({put, Key, Value}, _Sender, State = #{data := Data}) -\u003e\n  log(\"PUT ~p:~p\", [Key, Value], State),\n  NewData = Data#{Key =\u003e Value},\n  {reply, ok, State#{data =\u003e NewData}};\n\nhandle_command({get, Key}, _Sender, State = #{data := Data}) -\u003e\n  log(\"GET ~p\", [Key], State),\n  {reply, maps:get(Key, Data, not_found), State};\n\nhandle_command({delete, Key}, _Sender, State = #{data := Data}) -\u003e\n  log(\"DELETE ~p\", [Key], State),\n  NewData = maps:remove(Key, Data),\n  {reply, maps:get(Key, Data, not_found), State#{data =\u003e NewData}};\n```\n\nIn `init`, we update our state map to include a `data` map; we'll use\nit as our humble data store. Then we\nadd a new `handle_command` clause for each operation: `put`, `get` and\n`delete`. Each command is received as a tagged tuple and the result is returned in\na `reply`, like in a gen_server. Note that `delete` replies with the value\nit removed, or `not_found` if the key was not present.\n\nJust like we did with `ping`, we'll create public functions in\n`src/rc_example.erl` to execute the new commands:\n\n``` erlang\n-module(rc_example).\n\n-export([ping/0,\n         ring_status/0,\n         put/2,\n         get/1,\n         delete/1]).\n\nping() -\u003e\n  sync_command(os:timestamp(), ping).\n\nring_status() -\u003e\n  {ok, Ring} = riak_core_ring_manager:get_my_ring(),\n  riak_core_ring:pretty_print(Ring, [legend]).\n\nput(Key, Value) -\u003e\n  sync_command(Key, {put, Key, Value}).\n\nget(Key) -\u003e\n  sync_command(Key, {get, Key}).\n\ndelete(Key) -\u003e\n  sync_command(Key, {delete, Key}).\n\n%% internal\nhash_key(Key) -\u003e\n  riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(Key)}).\n\nsync_command(Key, Command) -\u003e\n  DocIdx = hash_key(Key),\n  PrefList = riak_core_apl:get_apl(DocIdx, 1, rc_example),\n  [IndexNode] = PrefList,\n  riak_core_vnode_master:sync_spawn_command(IndexNode, Command, 
rc_example_vnode_master).\n```\n\nThe hashing, vnode selection and command execution is the same\nin all cases, so it was extracted into its own `sync_command` helper\nfunction.\n\nLet's test the new commands. Stop the three nodes if you are still\nrunning them, and run `make dev1`, `make dev2` and `make dev3` to refresh\nthe code of the releases (you'll note that the nodes join the cluster\nwithout the need to call `riak_core:join` again). In any of the\nshells you can try our Key/Value store:\n\n``` erlang\n(rc_example1@127.0.0.1)1\u003e rc_example:put(key1, value1).\n13:12:53.291 [info] PUT key1:value1\nok\n(rc_example1@127.0.0.1)2\u003e rc_example:put(key2, value2).\n13:12:59.602 [info] PUT key2:value2\nok\n(rc_example1@127.0.0.1)3\u003e rc_example:get(key1).\n13:13:30.160 [info] GET key1\nvalue1\n(rc_example1@127.0.0.1)4\u003e rc_example:delete(key1).\n13:13:37.984 [info] DELETE key1\nvalue1\n(rc_example1@127.0.0.1)5\u003e rc_example:get(key1).\nnot_found\n13:13:43.392 [info] GET key1\n(rc_example1@127.0.0.1)6\u003e rc_example:put(key3453, value3453).\n(rc_example1@127.0.0.1)7\u003e rc_example:get(key3453).\nvalue3453\n```\n\nIn the example above, keys `key1` and `key2` are stored on vnodes that\nreside in the first node (and thus we get the log output in the\nshell of rc_example1), while `key3453` is on another one.\n\nAs you can see, even if the initial setup can be a little burdensome,\nyou can get distribution of work and fault-tolerance with little\neffort by just handling your application-specific logic in a vnode module.\n\n### 5. Testing\n\nOur application already has some basic functionality so we should\nstart thinking about how to test it. 
This is a\ndistributed system that requires multiple nodes to work, and manual tests\nwill become more difficult as it grows. Moreover, since most of the\ncomplexity resides in the interaction of its components,\nwe won't benefit much from isolated unit tests. Instead, we should\nwrite an integration suite that provides end-to-end verification of each\nfeature. To accomplish that we will use\n[Common Test](http://erlang.org/doc/apps/common_test/basics_chapter.html),\nand a combination of\n[ct_slave](http://erlang.org/doc/man/ct_slave.html)\nand [rpc](http://erlang.org/doc/man/rpc.html) to start multiple nodes\nand interact with them.\n\n#### Test implementations\n\nTo start off, let's add a new make target that runs the tests with\nrebar3 (remember to add it to the .PHONY targets):\n\n``` makefile\n.PHONY: dev1 dev2 dev3 dev4 clean_data test\n\ntest:\n\t./rebar3 ct --name test@127.0.0.1\n```\n\nNow create a test directory with a single module\n`test/key_value_SUITE.erl`:\n\n``` erlang\n-module(key_value_SUITE).\n\n-include_lib(\"common_test/include/ct.hrl\").\n\n-compile(export_all).\n\nall() -\u003e\n  [ping_test,\n   key_value_test].\n\ninit_per_suite(Config) -\u003e\n  Node1 = 'node1@127.0.0.1',\n  Node2 = 'node2@127.0.0.1',\n  Node3 = 'node3@127.0.0.1',\n  start_node(Node1, 8198, 8199),\n  start_node(Node2, 8298, 8299),\n  start_node(Node3, 8398, 8399),\n\n  build_cluster(Node1, Node2, Node3),\n\n  [{node1, Node1},\n   {node2, Node2},\n   {node3, Node3} | Config].\n\nend_per_suite(Config) -\u003e\n  Node1 = ?config(node1, Config),\n  Node2 = ?config(node2, Config),\n  Node3 = ?config(node3, Config),\n  stop_node(Node1),\n  stop_node(Node2),\n  stop_node(Node3),\n  ok.\n```\n\nWe include the ct header file and declare two tests in the `all()`\ncallback, which we'll define shortly. 
In `init_per_suite`\nwe create three nodes using the `start_node` helper and make them\njoin a cluster with `build_cluster`. We keep the node names in the\ntest configuration so we can later use them to remotely execute\nfunctions on those nodes. Finally, we stop the nodes in\n`end_per_suite` using another helper. We'll leave the helper\nimplementations, which contain most of the ct_slave magic, for the end\nof this section. Now let's focus on the tests:\n\n```erlang\nping_test(Config) -\u003e\n  Node1 = ?config(node1, Config),\n  Node2 = ?config(node2, Config),\n  Node3 = ?config(node3, Config),\n\n  {pong, _Partition1} = rc_command(Node1, ping),\n  {pong, _Partition2} = rc_command(Node2, ping),\n  {pong, _Partition3} = rc_command(Node3, ping),\n\n  ok.\n\nkey_value_test(Config) -\u003e\n  Node1 = ?config(node1, Config),\n  Node2 = ?config(node2, Config),\n  Node3 = ?config(node3, Config),\n\n  ok = rc_command(Node1, put, [k1, v1]),\n  ok = rc_command(Node1, put, [k2, v2]),\n  ok = rc_command(Node1, put, [k3, v3]),\n\n  %% get from any of the nodes\n  v1 = rc_command(Node1, get, [k1]),\n  v2 = rc_command(Node1, get, [k2]),\n  v3 = rc_command(Node1, get, [k3]),\n  not_found = rc_command(Node1, get, [k10]),\n\n  v1 = rc_command(Node2, get, [k1]),\n  v2 = rc_command(Node2, get, [k2]),\n  v3 = rc_command(Node2, get, [k3]),\n  not_found = rc_command(Node2, get, [k10]),\n\n  v1 = rc_command(Node3, get, [k1]),\n  v2 = rc_command(Node3, get, [k2]),\n  v3 = rc_command(Node3, get, [k3]),\n  not_found = rc_command(Node3, get, [k10]),\n\n  %% test reset and delete\n  ok = rc_command(Node1, put, [k1, v_new]),\n  v_new = rc_command(Node1, get, [k1]),\n\n  v_new = rc_command(Node1, delete, [k1]),\n  not_found = rc_command(Node1, get, [k1]),\n\n  ok = rc_command(Node1, put, [k1, v_new]),\n  v_new = rc_command(Node1, get, [k1]),\n\n  ok.\n```\n\nThe `ping_test` just sends a `ping` command to each of the nodes, and\nmakes sure it gets a `pong` response every time. 
Note we use an\n`rc_command` helper, which executes a command of our rc_example\napplication on the given node. The `key_value_test` puts some keys in\nthe store through the first node, then makes sure those keys can be\nretrieved from any of the nodes (regardless of where they are actually\nstored). It then tests the delete command and makes sure the store\ngenerally works as expected.\n\nThere's nothing special about these tests when we abstract away the\ndetails of setting up the nodes and the riak_core cluster.\n\n#### ct_slave magic\n##### Note for OTP 25:\nThe `ct_slave` module is deprecated as of OTP 25 and, at the time of writing,\nwas scheduled for removal in OTP 27. In its place you can use the `peer` module,\nwhich is covered in a section below. So, if you're\nusing OTP 25 or above, skip this section; otherwise, keep reading.\nIf you want to know more about this, check out\n[this thread](https://erlangforums.com/t/how-do-i-replace-ct-slave-start-with-ct-peer-or-the-peer-module/1494)\nfrom the Erlang Forums.\n\nLet's look at the implementation of the different helpers we used in the\nprevious section. 
We need the `start_node` helper to\ncreate a new Erlang node with a given name, and we need it to start our\nrc_example application, much like what happens when we run our\ndevelopment releases; for this to work we should also set up the\nrequired riak_core application environment:\n\n\n``` erlang\nstart_node(NodeName, WebPort, HandoffPort) -\u003e\n  %% need to set the code path so the same modules are available in the peer\n  CodePath = code:get_path(),\n  PathFlag = \"-pa \" ++ lists:concat(lists:join(\" \", CodePath)),\n  {ok, _} = ct_slave:start(NodeName, [{erl_flags, PathFlag}]),\n\n  %% set the required environment for riak core\n  DataDir = \"./data/\" ++ atom_to_list(NodeName),\n  rpc:call(NodeName, application, load, [riak_core]),\n  rpc:call(NodeName, application, set_env, [riak_core, ring_state_dir, DataDir]),\n  rpc:call(NodeName, application, set_env, [riak_core, platform_data_dir, DataDir]),\n  rpc:call(NodeName, application, set_env, [riak_core, web_port, WebPort]),\n  rpc:call(NodeName, application, set_env, [riak_core, handoff_port, HandoffPort]),\n  rpc:call(NodeName, application, set_env, [riak_core, schema_dirs, [\"../../lib/rc_example/priv\"]]),\n\n  %% start the rc_example app\n  {ok, _} = rpc:call(NodeName, application, ensure_all_started, [rc_example]),\n\n  ok.\n\nstop_node(NodeName) -\u003e\n  ct_slave:stop(NodeName).\n```\n\nct_slave makes it pretty simple to manage erlang nodes with the\n[`ct_slave:start`](http://erlang.org/doc/man/ct_slave.html#start-2)\nand [`ct_slave:stop`](http://erlang.org/doc/man/ct_slave.html#stop-1)\nfunctions. The gotcha is that when we start a new node we need to\npoint the code path to Erlang, in order for the node to know where to\nlook for code dependencies. The best way I've found to do it, based on\n[this thread](http://erlang.org/pipermail/erlang-questions/2016-March/088428.html),\nis to get the path from the master node that runs the test, and pass\nit to Erlang with the `-pa` flag. 
There is probably\na more succinct way to do this, for example using `code:set_path`, but\nI couldn't make it work.\n\n#### Start Node implementation using Peer.\nFirst, change the `init_per_suite` function like this:\n\n```erlang\ninit_per_suite(Config) -\u003e\n    Host = \"127.0.0.1\",\n    Node1 = start_node('node1', Host, 8198, 8199),\n    Node2 = start_node('node2', Host, 8298, 8299),\n    Node3 = start_node('node3', Host, 8398, 8399),\n\n    build_cluster(Node1, Node2, Node3),\n\n    [{node1, Node1},\n     {node2, Node2},\n     {node3, Node3} | Config].\n\n```\n\nThe arguments passed to `start_node` change\nbecause we will be using the `?CT_PEER` macro, which takes a map of options\nlike the one described [here](https://www.erlang.org/doc/man/peer.html#type-start_options), and\nbehaves as if we were calling `peer:start`, but adapted to Common Test.\n\nThen, you should change `start_node` to this:\n```erlang\nstart_node(Name, Host, WebPort, HandoffPort) -\u003e\n    %% Need to set the code path so the same modules are available in the peer\n    CodePath = code:get_path(),\n    %% Arguments to set up the node\n    NodeArgs = #{name =\u003e Name, host =\u003e Host, args =\u003e [\"-pa\" | CodePath]},\n    %% Since OTP 25, ct_slave nodes are deprecated\n    %% (and to be removed in OTP 27), so we're\n    %% using peer nodes instead, with the CT_PEER macro.\n    {ok, Peer, Node} = ?CT_PEER(NodeArgs),\n    unlink(Peer),\n    DataDir = \"./data/\" ++ atom_to_list(Name),\n\n    %% set the required environment for riak core\n    ok = rpc:call(Node, application, load, [riak_core]),\n    ok = rpc:call(Node, application, set_env, [riak_core, ring_state_dir, DataDir]),\n    ok = rpc:call(Node, application, set_env, [riak_core, platform_data_dir, DataDir]),\n    ok = rpc:call(Node, application, set_env, [riak_core, web_port, WebPort]),\n    ok = rpc:call(Node, application, set_env, [riak_core, handoff_port, HandoffPort]),\n    ok = rpc:call(Node, application, set_env, [riak_core, 
schema_dirs, [\"../../lib/rc_example/priv\"]]),\n\n    %% start the rc_example app\n    {ok, _} = rpc:call(Node, application, ensure_all_started, [rc_example]),\n\n    Node.\n```\nOne important thing to note is that we unlink the peer. This is due to\n`start_node` being executed with a process that dies before the test\nactually runs. \n\nA convenient thing about the `?CT_PEER` macro is that it kills the \nNode when the test ends, so we don't need to manually kill the nodes\nanymore. So, go ahead and delete the `stop_node` function and redefine\nend_per_suite as: \n```erlang\nend_per_suite(_) -\u003e ok.\n```\n### Node Communication\nOnce the node is up, we can start running functions on it with\n[`rpc:call`](http://erlang.org/doc/man/rpc.html#call-4). In order for\nriak_core to work, we need to load the\napplication and fill its environment with `application:set_env`; we\nset the same variables as we did in `conf/sys.config`, with the addition of\n`platform_data_dir` (this is a directory that riak_core uses to store\nmetadata; we need to set it explicitly here because otherwise the\nthree nodes would conflict trying to write in the same default directory). With\nthe configuration in place, we can start the rc_example app remotely\ncalling `application:ensure_all_started`. Lastly, the `stop_node`\nhelper just needs to call `ct_slave:stop`.\n\nWhen our three nodes are up with the application running, we need to\nconnect them to build the cluster, like we did from the shell:\n\n``` erlang\nbuild_cluster(Node1, Node2, Node3) -\u003e\n  rpc:call(Node2, riak_core, join, [Node1]),\n  rpc:call(Node3, riak_core, join, [Node1]),\n  ok.\n```\n\nThe last helper, `rc_command`, is a very simple one, it just remotely\ncalls one of the functions in the `rc_example` module:\n\n``` erlang\nrc_command(Node, Command) -\u003e\n  rc_command(Node, Command, []).\nrc_command(Node, Command, Arguments) -\u003e\n  rpc:call(Node, rc_example, Command, Arguments).\n```\n\n### 6. 
Coverage commands\n\nSo far we've been working with commands that operate over a single\nobject, like a single Key in a Key/Value store. In those cases the Key\nwas hashed and the key hash determined the vnode responsible for\nhandling the operation. In the case of the ping command, as discussed,\nwe didn't have a key but we faked one by using the current timestamp.\n\nThere is another kind of command, one that involves all the vnodes in\nthe ring. What happens, for example, if we want to list all the\nKeys in our Key/Value store? Each vnode contains a subset of the Keys,\nso to get the full list we need to ask all the vnodes and join the\nresults. This is what coverage commands consist of: riak_core sends a\ncommand to all of the vnodes, then processes the results as they arrive.\n\nIn this section we're going to implement two new commands: `keys` and\n`values`, which as you may guess return the list of keys and values\ncurrently present in the datastore.\n\n#### Handle coverage commands in the vnode\n\nThe vnode is the easy part. Each vnode just needs to return the list\nof Keys or Values it contains in the `data` field of its state. This\nis done in the `handle_coverage` callback:\n\n``` erlang\nhandle_coverage(keys, _KeySpaces, {_, ReqId, _}, State = #{data := Data}) -\u003e\n  log(\"Received keys coverage\", State),\n  Keys = maps:keys(Data),\n  {reply, {ReqId, Keys}, State};\n\nhandle_coverage(values, _KeySpaces, {_, ReqId, _}, State = #{data := Data}) -\u003e\n  log(\"Received values coverage\", State),\n  Values = maps:values(Data),\n  {reply, {ReqId, Values}, State}.\n```\n\n#### The coverage FSM\nWe need to introduce a new component, the one that will be in charge\nof managing the coverage command, that is, of starting it and gathering\nthe results sent from each of the vnodes. riak_core provides the\n`riak_core_coverage_fsm` behavior for this purpose (a finite state\nmachine). 
Let's create a `src/rc_example_coverage_fsm.erl` module implementing\nthat behavior and go over each of its functions:\n\n``` erlang\n-module(rc_example_coverage_fsm).\n\n-behaviour(riak_core_coverage_fsm).\n\n-export([start_link/4,\n         init/2,\n         process_results/2,\n         finish/2]).\n\nstart_link(ReqId, ClientPid, Request, Timeout) -\u003e\n  riak_core_coverage_fsm:start_link(?MODULE, {pid, ReqId, ClientPid},\n  [Request, Timeout]).\n```\n\nSo far, nothing very special: the start_link will be called by a\nsupervisor to start the process (see next section) and the\nparameters are more or less forwarded to\n`riak_core_coverage_fsm:start_link`.\n\n``` erlang\ninit({pid, ReqId, ClientPid}, [Request, Timeout]) -\u003e\n  logger:info(\"Starting coverage request ~p ~p\", [ReqId, Request]),\n\n  State = #{req_id =\u003e ReqId,\n            from =\u003e ClientPid,\n            request =\u003e Request,\n            accum =\u003e []},\n\n  {Request, allup, 1, 1, rc_example, rc_example_vnode_master, Timeout, State}.\n```\n\nIn `init`, we initialize the process state as usual. We create a state\nmap where we put request metadata, the client process id (so we can\nlater reply to it with the result of the command) and an accumulator list\nthat we will update with the results coming from each vnode.\n\n`init` returns a big tuple with a bunch of parameters that control how\nthe coverage command should work. Let's briefly explain each of\nthem (mostly taken from\n[here](https://github.com/basho/riak_core/blob/762ec81ae9af9a278e853f1feca418b9dcf748a3/src/riak_core_coverage_fsm.erl#L45-L63);\nyou'll have to dig around for more details):\n\n* Request: an opaque data structure representing the command to be\n  handled by the vnodes. 
In our case it will be either of the `keys` or\n  `values` atoms.\n* VNodeSelector: an atom that specifies whether we want to run the\n  command in all vnodes (`all`) or only in those reachable (`allup`).\n* ReplicationFactor: used to accurately create a minimal covering set\n  of vnodes.\n* PrimaryVNodeCoverage: The number of primary VNodes from the\n  preference list to use in creating the coverage plan. Typically just\n  1.\n* NodeCheckService: the service used to check for available\n  nodes. This is the same as the atom passed to the node_watcher\n  at application startup.\n* VNodeMaster: The atom to use to reach the vnode master module (`rc_example_vnode_master`).\n* Timeout: timeout of the coverage request.\n* State: the initial state for the module.\n\n\n``` erlang\nprocess_results({{_ReqId, {_Partition, _Node}}, []}, State ) -\u003e\n  {done, State};\n\nprocess_results({{_ReqId, {Partition, Node}}, Data},\n                State = #{accum := Accum}) -\u003e\n  NewAccum = [{Partition, Node, Data} | Accum],\n  {done, State#{accum =\u003e NewAccum}}.\n```\n\nThe `process_results` callback gets called when the coverage module\nreceives a set of results from a vnode. For our `keys` and `values`\ncommands we store the partition\nand node identifiers along with the data, so we can see where each\npiece came from in the final result. Since in our\ntests most of the vnodes will be empty, we filter them out\nby handling the empty list case in a separate `process_results` clause\nthat leaves the accumulator unchanged.\n\n``` erlang\nfinish(clean, State = #{req_id := ReqId, from := From, accum := Accum}) -\u003e\n  logger:info(\"Finished coverage request ~p\", [ReqId]),\n\n  %% send the result back to the caller\n  From ! {ReqId, {ok, Accum}},\n  {stop, normal, State};\n\nfinish({error, Reason}, State = #{req_id := ReqId, from := From, accum := Accum}) -\u003e\n  logger:warning(\"Coverage query failed! Reason: ~p\", [Reason]),\n  From ! 
{ReqId, {partial, Reason, Accum}},\n  {stop, normal, State}.\n```\n\nFinally, the `finish` function will be called when the coverage\ncommand is done. If it goes well, the first argument will be `clean`;\nin that case we reply the accumulated data to the caller Pid (stored\nin `from`). If there's an error we handle it in the second `finish` clause.\n\n#### Coverage FSM Supervision\n\nWe need to supervise our `rc_example_coverage_fsm` processes. Since\nthese are created on demand, one per each command that needs\nto be executed, we are going to use the `simple_one_for_one`\n[supervisor strategy](http://erlang.org/doc/man/supervisor.html). Create a\n`src/rc_example_coverage_fsm_sup.erl` module:\n\n``` erlang\n-module(rc_example_coverage_fsm_sup).\n\n-behavior(supervisor).\n\n-export([start_link/0,\n         start_fsm/1,\n         init/1]).\n\nstart_link() -\u003e\n  supervisor:start_link({local, ?MODULE}, ?MODULE, []).\n\ninit([]) -\u003e\n  CoverageFSM = {undefined,\n                 {rc_example_coverage_fsm, start_link, []},\n                 temporary, 5000, worker, [rc_example_coverage_fsm]},\n\n  {ok, {{simple_one_for_one, 10, 10}, [CoverageFSM]}}.\n\nstart_fsm(Args) -\u003e\n  supervisor:start_child(?MODULE, Args).\n```\n\nWhen a coverage command needs to be executed, `start_fsm` is\ncalled to create a new child of this supervisor.\n\nWe also need to add `rc_example_coverage_fsm_sup` to the main\napplication supervisor in `src/rc_example_sup.erl`:\n\n``` erlang\ninit([]) -\u003e\n  VMaster = {rc_example_vnode_master,\n             {riak_core_vnode_master, start_link, [rc_example_vnode]},\n             permanent, 5000, worker, [riak_core_vnode_master]},\n\n  CoverageFSM = {rc_example_coverage_fsm_sup,\n                 {rc_example_coverage_fsm_sup, start_link, []},\n                 permanent, infinity, supervisor, [rc_example_coverage_fsm_sup]},\n\n  {ok, {{one_for_one, 5, 10}, [VMaster, CoverageFSM]}}.\n```\n\n#### Putting it all together\n\nNow that we have 
all the components in place, let's add the `keys` and\n`values` functions to `src/rc_example.erl`:\n\n``` erlang\nkeys() -\u003e\n  coverage_command(keys).\n\nvalues() -\u003e\n  coverage_command(values).\n\n%% internal\n\ncoverage_command(Command) -\u003e\n  Timeout = 5000,\n  ReqId = erlang:phash2(erlang:monotonic_time()),\n  {ok, _} = rc_example_coverage_fsm_sup:start_fsm([ReqId, self(), Command, Timeout]),\n\n  receive\n    {ReqId, Val} -\u003e Val\n  end.\n```\n\nWe create a ReqId to identify the request and call\n`rc_example_coverage_fsm_sup:start_fsm` to create a new child for the\nsupervisor, passing all the parameters the `rc_example_coverage_fsm`\nneeds to execute the command. Finally, we receive the result value,\nidentified by the ReqId.\n\nRestart your releases, fill the store with some values and try the new\ncommands:\n\n``` erlang\n(rc_example1@127.0.0.1)1\u003e rc_example:put(k1, v1).\nok\n(rc_example1@127.0.0.1)2\u003e rc_example:put(k100, v100).\n12:17:25.001 [info] [RKFE4TBE76PR5N05XGK31C8TPLJGJY8] PUT k100:v100\nok\n(rc_example1@127.0.0.1)3\u003e rc_example:put(k101, v101).\nok\n(rc_example1@127.0.0.1)4\u003e rc_example:put(k444, v444).\nok\n(rc_example1@127.0.0.1)5\u003e rc_example:put(k4445, v4445).\nok\n(rc_example1@127.0.0.1)6\u003e rc_example:keys().\n12:17:45.916 [info] Starting coverage request 63138856 keys\n12:17:45.921 [info] [RKFE4TBE76PR5N05XGK31C8TPLJGJY8] Received keys coverage\n12:17:45.921 [info] [TFPL6YYQR76578P6XWBOLKEAZ9KS1S0] Received keys coverage\n12:17:45.921 [info] [0] Received keys coverage\n12:17:45.921 [info] [S18XWCQ8C6TUO1FF6KHZFEA710JSFEO] Received keys coverage\n12:17:45.921 [info] [IOTYLKHHK4JWG0YA4DNZM9ISOOD6Y9S] Received keys coverage\n(...)\n{ok,[{707914855582156101004909840846949587645842325504,\n      'rc_example3@127.0.0.1',\n      [k101]},\n     {1141798154164767904846628775559596109106197299200,\n      'rc_example2@127.0.0.1',\n      [k1]},\n     {890602560248518965780370444936484965102833893376,\n      
'rc_example3@127.0.0.1',\n      [k444]},\n     {981946412581700398168100746981252653831329677312,\n      'rc_example3@127.0.0.1',\n      [k4445]},\n     {1347321821914426127719021955160323408745312813056,\n      'rc_example1@127.0.0.1',\n      [k100]}]}\n```\n\nAs expected, the result contains all the inserted keys and which vnode\nand physical node they come from. We also see the `Received keys\ncoverage` output from every vnode as it receives the command.\n\n#### Coverage test\n\nWe need to test our new functionality. Before writing an\nintegration test, let's quickly add a `clear` coverage command to\nempty our database, which will come in handy next. Add a new\n`handle_coverage` clause in `src/rc_example_vnode.erl`:\n\n``` erlang\nhandle_coverage(clear, _KeySpaces, {_, ReqId, _}, State) -\u003e\n  log(\"Received clear coverage\", State),\n  NewState = State#{data =\u003e #{}},\n  {reply, {ReqId, []}, NewState}.\n```\n\nWhen the `clear` command is received, the vnode will empty the data\nmap in its internal state. Note we return an empty list because our\ncoverage fsm expects to get a list from each vnode. If the new command\nrequired a different result manipulation we could consider adding another\n`process_results` clause in the fsm or even an entirely separate fsm\nmodule. In this case we don't really care about results processing,\njust the side effect of clearing the vnode data.\n\nWe will also add the `clear` function to the public API in\n`src/rc_example.erl`. 
It will just call `coverage_command` and return `ok`.\n\n``` erlang\n-export([clear/0]).\n\nclear() -\u003e\n  {ok, []} = coverage_command(clear),\n  ok.\n```\n\nWith that in place let's test our coverage commands in\n`test/key_value_SUITE.erl`:\n\n``` erlang\nall() -\u003e\n  [ping_test,\n   key_value_test,\n   coverage_test].\n\n%% ...\n\ncoverage_test(Config) -\u003e\n  Node1 = ?config(node1, Config),\n  Node2 = ?config(node2, Config),\n\n  %% clear, should contain no keys and no values\n  ok = rc_command(Node1, clear),\n  [] = rc_coverage(Node1, keys),\n  [] = rc_coverage(Node1, values),\n\n  ToKey = fun (N) -\u003e \"key\" ++ integer_to_list(N) end,\n  ToValue = fun (N) -\u003e \"value\" ++ integer_to_list(N) end,\n  Range = lists:seq(1, 100),\n  lists:foreach(fun(N) -\u003e\n                    ok = rc_command(Node1, put, [ToKey(N), ToValue(N)])\n                end, Range),\n\n  ActualKeys = rc_coverage(Node2, keys),\n  ActualValues = rc_coverage(Node2, values),\n\n  100 = length(ActualKeys),\n  100 = length(ActualValues),\n\n  true = have_same_elements(ActualKeys, lists:map(ToKey, Range)),\n  true = have_same_elements(ActualValues, lists:map(ToValue, Range)),\n\n  %% store should be empty after a new clear\n  ok = rc_command(Node1, clear),\n  [] = rc_coverage(Node1, keys),\n  [] = rc_coverage(Node1, values),\n\n  ok.\n\n%% internal\nrc_coverage(Node, Command) -\u003e\n  {ok, List} = rc_command(Node, Command),\n  %% convert the coverage result to a plain list\n  lists:foldl(fun({_Partition, _Node, Values}, Accum) -\u003e\n                  lists:append(Accum, Values)\n              end, [], List).\n\nhave_same_elements(List1, List2) -\u003e\n  S1 = sets:from_list(List1),\n  S2 = sets:from_list(List2),\n  sets:is_subset(S1, S2) andalso sets:is_subset(S2, S1).\n```\n\nThe `coverage_test` first calls `clear` to empty the store, and makes\nsure `keys` and `values` return an empty result. 
Note that our\nintegration suite is not ideal in the sense that the tests are not\nisolated from each other: they share the same data store and they\ncouldn't, for example, be run concurrently. In a real world\nproject we could consider creating new nodes for each test\n(although this could be slow) or more likely introduce some sort of\nnamespacing in our data store (perhaps through the use of buckets). For\nthe purposes of this tutorial, though, it's enough to clear the store\nin this particular test.\n\nThe test continues by storing a range of 100 keys and values in our\ndatabase and calling the `keys` and `values` commands. We assert that\nthe results contain 100 elements each and we use some sets logic to check that\nthe elements are the same that we originally inserted. Finally, we\nclear the store again and check `keys` and `values` come back\nempty. The `rc_coverage` helper just calls a command and cleans the\nresult by removing the partition and node annotations.\n\n### 7. Redundancy and fault-tolerance\n\nIn the non-ideal world of distributed systems we need to account for\nthe fact that software and hardware can fail and that networks are\nunreliable. In other words, we need to build our distributed system such that it\nkeeps functioning when one or more of the nodes becomes\nunavailable. riak_core provides some useful building blocks to achieve\nit: it will monitor the cluster, redistribute partitions when\nnodes go down and even expose a mechanism to move data\naround when they come back online (see [handoff](#8-handoff)). But\nsome of the work will be implementation specific.\n\nIf your system works as a distributed database, that is if your\nvnodes hold state that should survive node outages, then you'll have\nto replicate each piece of data to multiple vnodes, so a fallback vnode\ncan take over when the primary is not available. 
In our Key/Value\nstore example, this means that `put` commands should\nbe sent to multiple vnodes to replicate the data, and `delete` commands\nshould be sent to all the replicas. This introduces room for a lot\nof design decisions, each with its own tradeoffs:\n* How many physical nodes should a cluster consist of?\n* How many replicas of each key should be stored?\n* How many successful responses are required for a write operation to succeed?\n* How many successful responses are required for a read?\n* How to handle write conflicts between replicas?\n* etc.\n\n\nThese are more related to database design and tuning than to\nriak_core itself; riak_core is about distribution mechanics, so we\nwon't go too far into the specifics here. For the sake of\ncompleteness, let's briefly mention how the riak_core\nAPI allows us to introduce redundancy. If you review the\n`src/rc_example.erl` module, you'll recall that we use\n`riak_core_apl:get_apl` to obtain a list of vnodes that can handle a\ngiven command. Let's say we want to replicate our data to three vnodes;\nwe can then request that many:\n\n``` erlang\n(rc_example1@127.0.0.1)2\u003e K = riak_core_util:chash_key({\u003c\u003c\"rc_example\"\u003e\u003e, term_to_binary(os:timestamp())}).\n(rc_example1@127.0.0.1)3\u003e riak_core_apl:get_apl(K, 3, rc_example).\n[{1073290264914881830555831049026020342559825461248,\n  'rc_example1@127.0.0.1'},\n {1096126227998177188652763624537212264741949407232,\n  'rc_example1@127.0.0.1'},\n {1118962191081472546749696200048404186924073353216,\n  'rc_example2@127.0.0.1'}]\n```\n\nThen to actually send a command, instead of using\n`riak_core_vnode_master:sync_spawn_command`, we turn to the more\ngeneric `riak_core_vnode_master:command`, which takes a Preference List\ninstead of a single target vnode:\n\n``` erlang\nreplicated_command(Key, Command) -\u003e\n  DocIdx = hash_key(Key),\n  PrefList = riak_core_apl:get_apl(DocIdx, 3, rc_example),\n\n  ReqId = erlang:phash2(erlang:monotonic_time()),\n  Sender = {raw, 
ReqId, self()},\n  riak_core_vnode_master:command(PrefList, Command, Sender, rc_example_vnode_master),\n  receive\n    {ReqId, Reply} -\u003e Reply\n  end.\n```\n\nNote we need to create a request id and pass the current process in\nthe `Sender` argument so riak_core knows where to send the reply.\nIn this case, for demonstration purposes, we just do a blocking\n`receive` and return the first message that arrives; a more serious\nimplementation could use a gen_server or an fsm to gather the responses\nand achieve some sort of quorum. If you are interested in the\ntopic, you can review the\n[Little Riak Core book](https://marianoguerra.github.io/little-riak-core-book/tolerating-node-failures.html#quorum-based-writes-and-deletes) and\nthe\n[Elixir series](https://medium.com/@GPad/create-a-riak-core-application-in-elixir-part-5-86cd9d2c6b92),\nboth of which implement solutions to this problem.\n\n### 8. Handoff\n\nPart of the strength of the Dynamo architecture (and thus, of Riak Core) is\nhow it enables scalability with small operational effort. Because\nthe keyspace is designed as a ring of virtual nodes, adding\nphysical nodes to a cluster or removing them means changing the distribution\nof the vnodes across the physical nodes: a vnode will always handle\nthe same segment of the keyspace (the same chunk of the key hashes),\nbut where that vnode physically resides can change.\n\nFor example: if we have a one-node cluster, it will necessarily\ncontain the entire ring, that is, all of the vnodes. If we start a second\nphysical node and join that cluster, half of the vnodes will be \"handed\nover\" to the new physical node, so the keyspace is kept evenly\ndistributed across the cluster.\n\nRiak Core provides the necessary infrastructure to decide where and\nwhen a vnode needs to be moved. We only need to fill in the\nspecifics of how to iterate over our particular vnode's state, encode it on the\nsending end, and decode it in the receiving vnode. 
This process is\ncalled handoff. We'll go over all the required steps to support this scenario in our\napplication. For a walkthrough of how handoff is implemented internally,\ncheck\n[the riak_core wiki](https://github.com/basho/riak_core/wiki/Handoffs).\n\nNote that if your vnodes are \"stateless\", for example if you just use\nriak_core as a mechanism to distribute work and don't need to keep\ninternal state, you don't need to worry about handoff and can just\nleave the related callbacks empty.\n\n#### When does handoff occur?\n* An `ownership` handoff happens when a physical node joins or leaves\nthe cluster. In this scenario, riak_core reassigns the physical nodes\nresponsible for each vnode and executes the handoff to move the\nvnode data from its old home to its new home.\n* `hinted` handoffs can occur if there's vnode redundancy (see the\n  previous section). When the primary vnode for a particular part of\n  the ring is offline, riak_core still accepts operations on it and\n  routes those to a secondary vnode. When the primary vnode comes back\n  online, riak_core uses handoff to sync the current vnode state from\n  the secondary to the primary. Once the primary is synchronized,\n  operations are routed to it once again.\n\nThere are also `repair` and `resize` related handoffs, which are advanced\ntopics that we won't cover. You can read about\nthem\n[here](http://basho.com/posts/technical/understanding-riak_core-handoff/),\n[here](https://github.com/rzezeski/try-try-try/tree/master/2011/riak-core-conflict-resolution) and [here](https://github.com/basho/riak_core/commit/036e409eb83903315dd43a37c7a93c9256863807).\n\n#### Vnode implementation\nIf you check our vnode implementation, you'll notice that half of the\ncallbacks deal with handoff. 
Let's go over their\nimplementation, in the order in which they are called.\n\nFirst, we need to include the `riak_core_vnode` header file, because\nwe will refer to a macro defined there:\n\n``` erlang\n-include_lib(\"riak_core/include/riak_core_vnode.hrl\").\n```\n\n`handoff_starting` is called on the sending vnode before the handoff\nbegins. If the function returns true, the handoff will proceed through\nthe normal path. If it returns false, the handoff will be\ncancelled. We don't need any special action here, so we just log and\nmove forward:\n\n``` erlang\nhandoff_starting(_TargetNode, State) -\u003e\n  log(\"starting handoff\", State),\n  {true, State}.\n```\n\n`handoff_cancelled` is called on the sending vnode in case the process\nis cancelled (usually explicitly by an admin tool). Again, we just log:\n\n``` erlang\nhandoff_cancelled(State) -\u003e\n  log(\"handoff cancelled\", State),\n  {ok, State}.\n```\n\n`is_empty` should return a boolean indicating whether the vnode has no data to\nmigrate; if it is empty, handoff finishes right away (calling `handoff_finished`).\n\n``` erlang\nis_empty(State = #{data := Data}) -\u003e\n  IsEmpty = maps:size(Data) == 0,\n  {IsEmpty, State}.\n```\n\nThe bulk of the work is done in the `handle_handoff_command`\ncallback. This function can be a bit confusing, because it serves\ntwo different purposes depending on its calling arguments: to handle the\nrequest to fold over the vnode's data that needs to be transferred,\nand to handle regular vnode commands (e.g. `ping`, `put`, etc.) that arrive\nduring handoff (and would otherwise be passed to `handle_command`).\n\nLet's focus on the first of those cases. 
riak_core knows what it needs\nto do with each piece of data the vnode holds (encode it, transfer it\nover the network to the new vnode and decode it there), but not what that data\nlooks like or how it's stored (in our case Key/Value pairs in a map),\nso it gives us a function that encapsulates the processing and we need\nto apply it to our data:\n\n``` erlang\nhandle_handoff_command(?FOLD_REQ{foldfun=FoldFun, acc0=Acc0}, _Sender,\n                       State = #{data := Data}) -\u003e\n  log(\"Received fold request for handoff\", State),\n  Result = maps:fold(FoldFun, Acc0, Data),\n  {reply, Result, State};\n```\n\nNever mind the weird macro wrapper: `?FOLD_REQ` is just a record and we\nonly care to extract the fold function (FoldFun) and the initial\naccumulator (Acc0). When a command with this shape arrives, we\niterate over our vnode's data, applying the given fold function. Note\nthat this function expects to be passed three arguments: key, value,\nand accumulator. This means that if your data structure doesn't\nalready support this form of fold function you'll have to wrap it; in\nour case we just need to call `maps:fold/3` since our data is a\nmap. The result of the fold is included in the `reply` tuple.\n\nFoldFun is synchronous and in our case the result\nof the command is replied right away, but there's also the option to\nreturn an `async` tuple; you can check\n[riak_kv](https://github.com/basho/riak_kv/blob/develop/src/riak_kv_vnode.erl#L1997-L2011) and\n[riak_search](https://github.com/basho/riak_search/blob/develop/src/riak_search_vnode.erl#L178-L194) implementations\nfor reference. Note that if you go down this route, you may need to handle incoming commands that\nmodify your vnode's data while you are iterating over it.\n\nThe second situation in which `handle_handoff_command` can be called\nis when a regular command arrives during handoff. 
If you check the\n[callback specification](https://github.com/Kyorai/riak_core/blob/3.0.9/src/riak_core_vnode.erl#L104-L110) you'll\nsee that the result can be the same as in `handle_command`, with two\nadditional return types: `forward` and `drop`. The forward reply will\nsend the request to the target node, while the drop reply\nsignifies that you won't even attempt to fulfill it. Which one to use\ndepends on your application and the nature of the command.\n\nLet's reason about the possible situations in the case of our\nKey/Value store. When `handle_handoff_command` is called we can't tell\nif handoff has just started or is about to finish; we can't tell if\nthe value associated with the command's key has been migrated to the\nreceiving vnode already or the only copy is in this one. So the strategy we can\ntake to stay consistent and avoid unnecessary effort is: when the\ncommand is a write (a `put` or a `delete`), we change our local copy\nof the data _and_ we forward it to the receiving vnode (that way, if it\nwas already migrated, the change is applied in that copy too); if the\ncommand is a read (a `get`), we reply with our local copy of the data\n(we know it's up to date because we applied all the writes\nlocally). Let's see how this looks in the code:\n\n``` erlang\nhandle_handoff_command({get, Key}, Sender, State) -\u003e\n  log(\"GET during handoff, handling locally ~p\", [Key], State),\n  handle_command({get, Key}, Sender, State);\n\nhandle_handoff_command(Message, Sender, State) -\u003e\n  {reply, _Result, NewState} = handle_command(Message, Sender, State),\n  {forward, NewState}.\n```\n\nWe added extra `handle_handoff_command` clauses for each of those\ncases. The first one handles `get`, a read operation; the\nimplementation just calls `handle_command` since we\nwant to reply with the local copy of the data, as usual.\n\nThe second clause catches the rest of the commands, `put` and\n`delete`, which are write operations. 
In these cases we call\n`handle_command` as well, to modify our local copy of the data, but\ninstead of using the result, we return `forward`, so the command is\nsent to the receiving vnode as well.\n\nThat's it for `handle_handoff_command`. For a deeper understanding of\nthe different scenarios, we suggest checking [this](https://github.com/Kyorai/riak_core/blob/faf04f4820aff5bc876f79609fa838e1c86c0fb0/src/riak_core_vnode.erl#L312-L339) and [this](https://github.com/basho/riak_kv/blob/d5cfe62d8f0ff36ead2019bde7a08cdd33fd3764/src/riak_kv_vnode.erl#L974-L984) comment, along with the\nrelevant code.\n\nMoving on to the remaining callbacks. `encode_handoff_item` is called\non the sending vnode, each time a Key/Value pair is about to be sent\nover the wire; we use `term_to_binary` to encode it. On the other end,\n`handle_handoff_data` will be called on the receiving vnode to decode\nthe Key and Value; we use `binary_to_term` and update the data\nmap with the new pair:\n\n``` erlang\nencode_handoff_item(Key, Value) -\u003e\n  erlang:term_to_binary({Key, Value}).\n\nhandle_handoff_data(BinData, State = #{data := Data}) -\u003e\n  {Key, Value} = erlang:binary_to_term(BinData),\n  log(\"received handoff data ~p\", [{Key, Value}], State),\n  NewData = Data#{Key =\u003e Value},\n  {reply, ok, State#{data =\u003e NewData}}.\n```\n\nFinally, when handoff is done `handoff_finished` is called. After\nthat, the sending vnode should be deleted; any necessary cleanup can\nbe done in the `delete` callback. We don't do any special work in\nthese two callbacks, just log and return:\n\n``` erlang\nhandoff_finished(_TargetNode, State) -\u003e\n  log(\"finished handoff\", State),\n  {ok, State}.\n\ndelete(State) -\u003e\n  log(\"deleting the vnode\", State),\n  {ok, State#{data =\u003e #{}}}.\n```\n\n#### Ownership handoff example\nHandoff is a slow process, so it would be inconvenient to test it as\npart of our integration suite. 
Instead, let's do some shell\nexperiments to see it in action. Clean the cluster and start three\nnodes:\n\n``` shell\n# terminal 1\n$ make clean_data\n$ make dev1\n\n# terminal 2\n$ make dev2\n\n# terminal 3\n$ make dev3\n```\n\nWhen the nodes are running, join the cluster as we did before:\n\n``` erlang\n%% node 2\n(rc_example2@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n18:46:45.409 [info] 'rc_example2@127.0.0.1' changed from 'joining' to 'valid'\n\n%% node 3\n(rc_example3@127.0.0.1)1\u003e riak_core:join('rc_example1@127.0.0.1').\n18:46:47.120 [info] 'rc_example3@127.0.0.1' changed from 'joining' to 'valid'\n```\n\nYou may see a bunch of handoff messages now. Eventually the cluster\nwill settle, with the ring evenly distributed across nodes:\n\n``` erlang\n(rc_example1@127.0.0.1)3\u003e rc_example:ring_status().\n==================================== Nodes ====================================\nNode a: 21 ( 32.8%) rc_example1@127.0.0.1\nNode b: 22 ( 34.4%) rc_example2@127.0.0.1\nNode c: 21 ( 32.8%) rc_example3@127.0.0.1\n==================================== Ring =====================================\nabcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abcc|abbc|abba|abba|abba|abba|abba|\nok\n```\n\nLet's add a key, and use the `keys` function to find out which node\nit ends up on:\n\n\n``` erlang\n(rc_example1@127.0.0.1)4\u003e rc_example:put(k1, hello).\nok\n(rc_example1@127.0.0.1)5\u003e rc_example:keys().\n{ok,[{1141798154164767904846628775559596109106197299200,\n      'rc_example2@127.0.0.1', [k1]}]}\n```\n\nIn my case, `k1` is routed to the second node,\n`'rc_example2@127.0.0.1'` (it can be a different one on your\nmachine). Let's see what happens if we make that node (the one that\nholds the Key/Value pair) leave the cluster:\n\n``` erlang\n(rc_example2@127.0.0.1)2\u003e riak_core:leave().\nok\n```\n\nYou'll get a bunch of handoff logs again on your screen. 
During this\nperiod, `rc_example:ring_status()` will show the\npercentage of the ring assigned to the node decreasing until it\nreaches zero. After this the node will be shut down, and from any of\nthe remaining nodes you'll see something like this:\n\n``` erlang\n14:53:24.322 [info] 'rc_example2@127.0.0.1' changed from 'leaving' to 'exiting'\n14:53:24.351 [info] 'rc_example2@127.0.0.1' removed from cluster\n(previously: 'exiting')\n(rc_example1@127.0.0.1)6\u003e rc_example:ring_status().\n==================================== Nodes ====================================\nNode a: 32 ( 50.0%) rc_example1@127.0.0.1\nNode b: 32 ( 50.0%) rc_example3@127.0.0.1\n==================================== Ring =====================================\nabab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|abab|\nok\n```\n\nNow the entire ring is distributed among the two remaining nodes. If\nwe now query for `k1`, we'll confirm that another node took ownership\nof that key:\n\n``` erlang\n(rc_example1@127.0.0.1)7\u003e rc_example:get(k1).\nhello\n(rc_example1@127.0.0.1)8\u003e rc_example:keys().\n{ok,[{1141798154164767904846628775559596109106197299200,\n      'rc_example1@127.0.0.1', [k1]}]}\n```\n\nIn my case, it is `'rc_example1@127.0.0.1'` that took over that part\nof the ring.\n\n#### Hinted handoff example\n\nThe previous section demonstrated what happens when we intentionally\nchange the cluster by removing a node. Now let's see what happens when\nthere's a failure and a node becomes unexpectedly unavailable. We\ndidn't add replication as discussed in\nthe [fault-tolerance section](#7-redundancy-and-fault-tolerance), so\nwe can't expect to preserve data from the failing node, but we can see\nthe hinted handoff mechanics anyway: the failing node won't lose\nownership of its partitions, but the commands that arrive while it's down will\nhave to be temporarily routed to available nodes. 
When the\nfailing node comes back online, it will receive handoffs with the data\ncreated while it was down.\n\nRepeat the steps from the previous section to clean the data, restart the\nnodes and join the cluster. Set a key again, and check which node\nit resides on:\n\n``` erlang\n(rc_example1@127.0.0.1)3\u003e rc_example:put(k1, hello).\nok\n(rc_example1@127.0.0.1)4\u003e rc_example:keys().\n{ok,[{1141798154164767904846628775559596109106197299200,\n      'rc_example2@127.0.0.1', [k1]}]}\n```\n\nIn this case, `k1` physically resides on\n`'rc_example2@127.0.0.1'`. Kill that node with `ctrl-g q` or a similar\ncommand. At this point the `k1` key and its value will be lost,\nbecause we don't have any kind of data replication; but if you put the\nkey again, you'll notice it will be saved in one of the live nodes:\n\n``` erlang\n(rc_example1@127.0.0.1)6\u003e rc_example:get(k1).\nnot_found\n(rc_example1@127.0.0.1)7\u003e rc_example:put(k1, newvalue).\nok\n(rc_example1@127.0.0.1)8\u003e rc_example:get(k1).\nnewvalue\n```\n\nNow start the killed node again, and try to retrieve `k1`:\n\n``` erlang\n(rc_example1@127.0.0.1)9\u003e rc_example:get(k1).\nnot_found\n```\n\nThe second node recovered ownership of the partition to which `k1`\nbelongs, but doesn't (yet) have any value for it. 
If you wait a while\n(around a minute on my laptop), you should see something along these lines:\n\n```\n16:20:56.845 [info] [ND1G8YLUY5OVK16UNH2ZITWHUUGHOU8] starting handoff\n16:20:56.860 [info] Starting hinted transfer of rc_example_vnode from 'rc_example1@127.0.0.1' 1141798154164767904846628775559596109106197299200 to 'rc_example2@127.0.0.1' 1141798154164767904846628775559596109106197299200\n16:20:56.860 [info] [ND1G8YLUY5OVK16UNH2ZITWHUUGHOU8] Received fold request for handoff\n16:20:56.862 [info] hinted transfer of rc_example_vnode from 'rc_example1@127.0.0.1' 1141798154164767904846628775559596109106197299200 to 'rc_example2@127.0.0.1' 1141798154164767904846628775559596109106197299200 completed: sent 32.00 B bytes in 1 of 1 objects in 0.00 seconds (63.78 KB/second)\n16:20:56.862 [info] [ND1G8YLUY5OVK16UNH2ZITWHUUGHOU8] finished handoff\n```\n\nThe fallback node that temporarily held `k1` handed its data back\nto the vnode in rc_example2. If you get the key again, you should see\nthe new value, this time coming from rc_example2:\n\n``` erlang\n(rc_example1@127.0.0.1)10\u003e rc_example:get(k1).\nnewvalue\n(rc_example1@127.0.0.1)11\u003e rc_example:keys().\n{ok,[{1141798154164767904846628775559596109106197299200,\n      'rc_example2@127.0.0.1', [k1]}]}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flambdaclass%2Friak_core_tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flambdaclass%2Friak_core_tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flambdaclass%2Friak_core_tutorial/lists"}