{"id":15208889,"url":"https://github.com/miguno/wirbelsturm","last_synced_at":"2025-10-03T01:31:08.018Z","repository":{"id":15109146,"uuid":"17836008","full_name":"miguno/wirbelsturm","owner":"miguno","description":"[PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a focus on big data tech like Kafka.","archived":true,"fork":false,"pushed_at":"2022-02-21T09:38:50.000Z","size":316,"stargazers_count":328,"open_issues_count":0,"forks_count":72,"subscribers_count":36,"default_branch":"master","last_synced_at":"2025-01-19T19:57:42.638Z","etag":null,"topics":["apache-kafka","apache-spark","apache-storm","kafka","puppet","spark","storm","vagrant"],"latest_commit_sha":null,"homepage":"http://www.michael-noll.com/blog/2014/03/17/wirbelsturm-one-click-deploy-storm-kafka-clusters-with-vagrant-puppet/","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/miguno.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-03-17T17:02:51.000Z","updated_at":"2025-01-03T21:43:59.000Z","dependencies_parsed_at":"2022-08-26T20:30:48.774Z","dependency_job_id":null,"html_url":"https://github.com/miguno/wirbelsturm","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miguno%2Fwirbelsturm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miguno%2Fwirbelsturm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/miguno%2Fwirbelsturm/releases","manifests_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/miguno%2Fwirbelsturm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/miguno","download_url":"https://codeload.github.com/miguno/wirbelsturm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":235059234,"owners_count":18929279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-kafka","apache-spark","apache-storm","kafka","puppet","spark","storm","vagrant"],"created_at":"2024-09-28T07:03:12.538Z","updated_at":"2025-10-03T01:31:07.724Z","avatar_url":"https://github.com/miguno.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# THIS PROJECT IS NO LONGER MAINTAINED\n\n# Wirbelsturm\n\nWirbelsturm is a [Vagrant](http://www.vagrantup.com/) and [Puppet](https://puppetlabs.com/) based tool to perform\n1-click local and remote deployments, with a focus on big data related infrastructure.\n\n**Wirbelsturm's goal is to make tasks such as \"I want to deploy a multi-node Storm cluster\" _simple_, _easy_, and**\n**_fun_**.\n\nIt has been called the \"Cluster Vagrant\" and \"Big Data Vagrant\" by some of its users, albeit in our opinion that makes\nWirbelsturm appear to be more than it really is, and it doesn't give enough [credit](#credits) to the tools on which it\nis based.\n\nIts direct value proposition is two-fold:\n\n1. 
***Provide a working integration of [Vagrant](http://www.vagrantup.com/) and [Puppet](https://puppetlabs.com/).***\n   Vagrant is used to create and manage machines, Puppet is used for provisioning those machines (e.g. to install and\n   configure software packages).  Because Wirbelsturm uses Vagrant you can basically deploy to any target platform\n   that Vagrant supports -- local VMs, AWS, OpenStack, etc. -- although Wirbelsturm does not support all of those out\n   of the box yet.  While Wirbelsturm's Puppet setup is slightly opinionated with its preference for\n   [Hiera](http://docs.puppetlabs.com/hiera/1/) and with its notion of _environments_ and _roles_, these conventions\n   should help to jumpstart new users and, of course, you can change this behavior if needed.\n2. ***Add a thin wrapper layer around Vagrant to simplify deploying multiple machines of the same kind.***\n   This is very helpful when deploying software such as [Storm](http://storm.apache.org/),\n   [Kafka](http://kafka.apache.org/) and [Hadoop](http://hadoop.apache.org/) clusters, where most of the machines look\n   the same.  In native Vagrant you would be required to (say) manually maintain 30 configuration sections in\n   `Vagrantfile` for deploying 30 Storm slave nodes, even though only their hostnames and IP addresses would change from\n   one to the next.\n\nThere is also an indirect, third value proposition:\n\n* Because we happen to maintain Wirbelsturm-compatible Puppet modules such as\n  [puppet-kafka](https://github.com/miguno/puppet-kafka), [puppet-graphite](https://github.com/miguno/puppet-graphite),\n  and [puppet-storm](https://github.com/miguno/puppet-storm), you can benefit from Wirbelsturm's ease of use to\n  conveniently deploy those software packages.  
As you may have noticed most of these Puppet modules are related to\n  large-scale data processing infrastructure and to DevOps tools for operating and monitoring such infrastructures, all\n  of which are based on free and open source software.  See [Supported Puppet modules](#supported-puppet-modules) for\n  details.\n\nWe hope you find Wirbelsturm [as useful as we do](#is-it-for-me).  And most importantly, **have fun!**\n\n---\n\nTable of Contents\n\n* \u003ca href=\"#quick-start\"\u003eQuick start\u003c/a\u003e\n* \u003ca href=\"#features\"\u003eFeatures\u003c/a\u003e\n* \u003ca href=\"#is-it-for-me\"\u003eIs Wirbelsturm for me?\u003c/a\u003e\n* \u003ca href=\"#default-configuration\"\u003eDefault configuration\u003c/a\u003e\n* \u003ca href=\"#getting-started\"\u003eGetting started\u003c/a\u003e\n    * \u003ca href=\"#install-prerequisites\"\u003eInstall prerequisites\u003c/a\u003e\n    * \u003ca href=\"#install-wirbelsturm\"\u003eInstall Wirbelsturm\u003c/a\u003e\n* \u003ca href=\"#usage\"\u003eUsage\u003c/a\u003e\n    * \u003ca href=\"#deploying\"\u003ePerforming a deployment\u003c/a\u003e\n    * \u003ca href=\"#access\"\u003eAccessing deployed machines\u003c/a\u003e\n    * \u003ca href=\"#shutdown\"\u003eShutting down the deployed environment\u003c/a\u003e\n    * \u003ca href=\"#ansible\"\u003eAnsible support\u003c/a\u003e\n* \u003ca href=\"#configuration\"\u003eConfiguration\u003c/a\u003e\n    * \u003ca href=\"#cfg-big-picture\"\u003eThe big picture\u003c/a\u003e\n    * \u003ca href=\"#cfg-machine-creation\"\u003eDefining which machines will be created\u003c/a\u003e\n    * \u003ca href=\"#cfg-provisioning\"\u003eProvisioning\u003c/a\u003e\n        * \u003ca href=\"#cfg-puppet-modules\"\u003eDefining which Puppet modules you require\u003c/a\u003e\n        * \u003ca href=\"#cfg-hiera\"\u003eDefining configuration data for Puppet via Hiera\u003c/a\u003e\n* \u003ca href=\"#supported-deployment-platforms\"\u003eSupported deployment platforms\u003c/a\u003e\n    * 
\u003ca href=\"#platform-support-overview\"\u003ePlatform support overview\u003c/a\u003e\n    * \u003ca href=\"#deploying-locally\"\u003eLocal deployment (VMs)\u003c/a\u003e\n    * \u003ca href=\"#wirbelsturm-less-deployment\"\u003eWirbelsturm-less deployment\u003c/a\u003e\n    * \u003ca href=\"#aws\"\u003eAmazon AWS/EC2 (Beta)\u003c/a\u003e\n    * \u003ca href=\"#openstack\"\u003eOpenStack\u003c/a\u003e\n* \u003ca href=\"#supported-puppet-modules\"\u003eSupported Puppet modules\u003c/a\u003e\n    * \u003ca href=\"#puppet-wirbelsturm-compatibility\"\u003eWhen is a Puppet module compatible with Wirbelsturm?\u003c/a\u003e\n    * \u003ca href=\"#available-puppet-modules\"\u003eAvailable Puppet modules\u003c/a\u003e\n* \u003ca href=\"#known-issues\"\u003eKnown issues and limitations\u003c/a\u003e\n* \u003ca href=\"#faq\"\u003eFAQ\u003c/a\u003e\n* \u003ca href=\"#how-it-works\"\u003eHow it works\u003c/a\u003e\n* \u003ca href=\"#wishlist\"\u003eWishlist\u003c/a\u003e\n* \u003ca href=\"#appendix\"\u003eAppendix\u003c/a\u003e\n    * \u003ca href=\"#appendix-storm-topology\"\u003eSubmitting an example Storm topology\u003c/a\u003e\n* \u003ca href=\"#changelog\"\u003eChange log\u003c/a\u003e\n* \u003ca href=\"#contributing\"\u003eContributing to Wirbelsturm\u003c/a\u003e\n* \u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\n* \u003ca href=\"#credits\"\u003eCredits\u003c/a\u003e\n\n---\n\n\n\u003ca name=\"quick-start\"\u003e\u003c/a\u003e\n\n# Quick start (local Storm cluster)\n\nAssuming you are using a reasonably powerful computer and have already installed [Vagrant](http://www.vagrantup.com/)\n(1.7.2+) and [VirtualBox](https://www.virtualbox.org/) you can launch a multi-node\n[Apache Storm](http://storm.apache.org/) cluster on your local machine with the following commands.  This\nStorm cluster is the default configuration example that ships with Wirbelsturm.  
Note that the `bootstrap` command\nneeds to be run only once, after a fresh checkout.\n\n```shell\n$ git clone https://github.com/miguno/wirbelsturm.git\n$ cd wirbelsturm\n$ ./bootstrap     # \u003c\u003c\u003c May take a while depending on how fast your Internet connection is.\n$ vagrant up      # \u003c\u003c\u003c ...and this step also depends on how powerful your computer is.\n```\n\nDone -- you now have a fully functioning Storm cluster up and running on your computer!  The deployment should have\ntaken you less time and effort than brewing yourself an espresso. :-)\n\n\u003e Tip: If you run into networking related issues (e.g. \"unknown host\" errors), try to deploy the cluster via our\n\u003e `./deploy` script instead of running `vagrant up`.  The only additional prerequisite for `./deploy` is the\n\u003e installation of the GNU `parallel` tool -- see section [Install Prerequisites](#install-prerequisites) for details.\n\nLet's take a look at which virtual machines back this cluster behind the scenes:\n\n```\n$ vagrant status\nCurrent machine states:\n\nzookeeper1                running (virtualbox)\nnimbus1                   running (virtualbox)\nsupervisor1               running (virtualbox)\nsupervisor2               running (virtualbox)\n```\n\nStorm also ships with a web UI that shows you the cluster's state, e.g. how many nodes it has, whether any processing\njobs (topologies) are being executed, etc.  
Wait 20-30 seconds after the deployment is done and then open the Storm UI\nat [http://localhost:28080/](http://localhost:28080/).\n\nWhat's more, Wirbelsturm also allows you to use [Ansible](http://www.ansible.com/) to interact with the deployed\nmachines via our [ansible](ansible) wrapper script:\n\n```\n$ ./ansible all -m ping\nzookeeper1 | success \u003e\u003e {\n    \"changed\": false,\n    \"ping\": \"pong\"\n}\n\nsupervisor1 | success \u003e\u003e {\n    \"changed\": false,\n    \"ping\": \"pong\"\n}\n\nnimbus1 | success \u003e\u003e {\n    \"changed\": false,\n    \"ping\": \"pong\"\n}\n\nsupervisor2 | success \u003e\u003e {\n    \"changed\": false,\n    \"ping\": \"pong\"\n}\n```\n\nWant to run more Storm slaves?  As long as your computer has enough horsepower you only need to change a single number\nin `wirbelsturm.yaml`:\n\n```yaml\n# wirbelsturm.yaml\nnodes:\n  ...\n  storm_slave:\n      count: 2     # \u003c\u003c\u003c changing 2 to 4 is all it takes\n  ...\n```\n\nThen run `vagrant up` again and shortly after `supervisor3` and `supervisor4` will be up and running.\n\nWant to run a [Kafka](http://kafka.apache.org/) broker?  Uncomment the `kafka_broker` section in your\n`wirbelsturm.yaml` (only remove the leading `#` characters, do not remove any whitespace) then run `vagrant up kafka1`.\n\nOnce you have finished playing around, you can stop the cluster again by executing `vagrant destroy`.\n\nNote that running a small, local Storm cluster is just the default example.  You can do much more with Wirbelsturm than\nthis.\n\n\n\u003ca name=\"features\"\u003e\u003c/a\u003e\n\n# Features\n\n* **Launching machines:**  Wirbelsturm uses Vagrant to launch the machines that make up your infrastructure\n    as VMs running locally in VirtualBox (default) or remotely in Amazon AWS/EC2 (OpenStack support is in the works).\n* **Provisioning machines:**  Machines are provisioned via Puppet.\n    * Wirbelsturm uses a master-less Puppet setup, i.e. 
provisioning is ultimately performed through `puppet apply`.\n    * Puppet modules are managed via [librarian-puppet](https://github.com/rodjek/librarian-puppet).\n* **(Some) batteries included:**  We maintain a number of standard Puppet modules that work well with Wirbelsturm, some\n  of which are included in the default configuration of Wirbelsturm.  However you can use any Puppet module with\n  Wirbelsturm, of course.  See [Supported Puppet modules](#supported-puppet-modules) for more information.\n* **Ansible support:** The [Ansible](http://www.ansible.com/) aficionados amongst us can use Ansible to interact with\n  machines once deployed through Wirbelsturm and Puppet.\n* **Host operating system support:** Wirbelsturm has been tested with Mac OS X 10.8+ and RHEL/CentOS 6 as host machines.\n  Debian/Ubuntu should work, too.\n* **Guest operating system support:** The target OS version for deployed machines is RHEL/CentOS 6 (64-bit).  Amazon\n  Linux is supported, too.\n    * For local deployments (via VirtualBox) and AWS deployments Wirbelsturm uses a\n      [CentOS 6 box created by PuppetLabs](http://puppet-vagrant-boxes.puppetlabs.com/).\n    * Switching to RHEL 6 only requires specifying a different [Vagrant box](http://docs.vagrantup.com/v2/boxes.html)\n      in [bootstrap](bootstrap) (for VirtualBox) or a different AMI image in `wirbelsturm.yaml` (for Amazon\n      AWS).\n* **When using tools other than Vagrant to launch machines:**  Wirbelsturm-compatible Puppet modules are standard Puppet\n  modules, so of course they can be used standalone, too.  
This way you can deploy against bare metal machines even if\n  you are not able to or do not want to run Wirbelsturm and/or Vagrant directly.\n  See [Wirbelsturm-less deployment](docs/Wirbelsturm-less_deployment.md) documentation for details.\n\n\n\u003ca name=\"is-it-for-me\"\u003e\u003c/a\u003e\n\n# Is Wirbelsturm for me?\n\nHere are some ideas for what you can do with Wirbelsturm:\n\n* Evaluate new technologies such as Kafka and Storm in a temporary environment that you can set up and tear\n  down at will. Without having to spend hours and stay late figuring out how to install those tools.\n  Then tell your boss how hard you worked for it.\n* Provide your teams with a consistent look and feel of infrastructure environments from initial prototyping\n  to development \u0026 testing and all the way to production.  Banish \"But it does work fine on _my_ machine!\" remarks\n  from your daily standups.  Well, hopefully.\n* Save money if (at least some of) these environments run locally instead of in an IAAS cloud or on bare-metal\n  machines that you would need to purchase first.  Make Finance happy for the first time.\n* Create production-like environments for training classes.  Use them to get new hires up to speed.  Or unleash a\n  [Chaos Monkey](http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html) and check how well your\n  applications, DevOps tools, or technical staff can handle the mess.  Bring coke and popcorn.\n* Create sandbox environments to demo your product to customers.  If Sales can run it, so can they.\n* Develop and test-drive your or other people's Puppet modules.  
But see also\n  [beaker](https://github.com/puppetlabs/beaker) and [serverspec](http://serverspec.org/) if your focus is on\n  testing.\n\nLet us know how _you_ are using Wirbelsturm!\n\n\n\u003ca name=\"default-configuration\"\u003e\u003c/a\u003e\n\n# Default configuration\n\n_The default configuration is what you get when you run `vagrant up` or `./deploy` without any config customizations,_\n_e.g. after a fresh checkout._\n\nThe purpose of the default configuration is to provide you with a simple yet non-trivial (local) deployment example that\nwill work out of the box on a reasonably modern computer.  For that purpose we opted to create a default configuration\nwhich will deploy a functional, multi-node Storm cluster that runs as multiple virtual machines locally.\n\nThe default cluster setup defined in [wirbelsturm.yaml.template](wirbelsturm.yaml.template) consists of\nfour virtual machines:\n\n* 1 ZooKeeper server\n* 1 Storm master node running Nimbus and Storm UI daemons\n* 2 Storm slave nodes, each running two Storm Supervisor daemons for a total of 4 \"slots\" aka worker processes\n\nThe machines are aptly named:\n\n    zookeeper1\n    nimbus1\n    supervisor1\n    supervisor2\n\nThe default Java version in Wirbelsturm is OpenJDK 7.  That means, for instance, that you must compile your own (Kafka,\nStorm, Hadoop, ...) code with Java 7, too.  If needed you can change the JDK package via the Puppet class parameter\n`$java_package_name` of [puppet-wirbelsturm_common](https://github.com/miguno/puppet-wirbelsturm_common).  
Here is how\nto do this via Hiera (the example below modifies [common.yaml](puppet/manifests/hieradata/common.yaml)):\n\n```yaml\n---\nclasses:\n  - wirbelsturm_common\n\n# The config value must match the (RPM) package name of the desired JRE/JDK version\nwirbelsturm_common::java_package_name: 'java-1.7.0-sun'\n```\n\n_Important: When deploying Storm in production it is recommended to use Oracle JRE/JDK 7 instead of OpenJDK 7._\n\nThe default configuration sets the Java heap size of the various Storm processes (Nimbus, UI, Supervisor, worker\nprocesses) to 256MB each.  This is enough to play around with Storm but of course not sufficient to perform large-scale\ndata processing.  Make sure you use a powerful host machine and a customized configuration (see below) if you want to do\nmore.\n\nNote: In most cases changes to the configurations of Storm, Kafka, ZooKeeper etc. will automatically trigger a restart\nof the respective processes once you re-deploy.\n\n\n\u003ca name=\"getting-started\"\u003e\u003c/a\u003e\n\n# Getting started\n\nThis section takes you from zero to a running cluster.  Here, we will show how to use the default cluster\nconfiguration of Wirbelsturm to deploy a Storm cluster locally on your host machine (e.g. your laptop).  If you are\ndeploying Wirbelsturm remotely -- such as on Amazon AWS -- the instructions are very similar.  In that case\nyou should first read the respective sections (e.g. on deploying to AWS) further down in this document and then come\nback to this section because Wirbelsturm works 99% the same way no matter where you deploy.\n\n\n\u003ca name=\"install-prerequisites\"\u003e\u003c/a\u003e\n\n## Install prerequisites\n\nWirbelsturm depends on the following software packages _on the host machine from which you run Wirbelsturm_, i.e. the\nmachine on which you execute commands such as `vagrant up`.  So if you are running Wirbelsturm on your laptop, you must\ninstall those packages on that laptop.\n\n1. 
[Vagrant](http://www.vagrantup.com/) 1.7.2+\n2. [VirtualBox](https://www.virtualbox.org/) 4.3.x\n3. Optional but recommended: [GNU parallel](http://www.gnu.org/software/parallel/).\n    * This step is only needed if you want to benefit from parallel provisioning via our [deploy](deploy) script to\n      speed up deployments.  If in doubt, do install `parallel` because `./deploy` is superior to the standard\n      `vagrant up`.\n\nPreferably Mac OS X users should also:\n\n* have a working [Homebrew](http://brew.sh/) or [MacPorts](http://www.macports.org/) setup\n* have [bash](https://www.gnu.org/software/bash/) as their default shell\n\n\n### Install Vagrant\n\n* [Download version 1.7.2 of Vagrant](https://www.vagrantup.com/downloads.html) for your OS and install accordingly.\n\nVerify the installation of Vagrant:\n\n```shell\n$ vagrant -v\nVagrant version 1.7.2\n```\n\n**Note for Mac OS X users:** To uninstall Vagrant run the `uninstall.tool` script that is included in the `.dmg` file.\n\n\n### Install VirtualBox\n\n* [Download a VirtualBox 4.3.x platform package](https://www.virtualbox.org/wiki/Downloads) for your OS and install\n  accordingly.\n\n**Note for Mac OS X users:**  To uninstall VirtualBox run the `VirtualBox_Uninstall.tool` script that is included in the\n`.dmg` file.\n\n\n### Install GNU parallel (optional but recommended)\n\n_You only need to install GNU parallel if you like to start your clusters via [deploy](deploy) to benefit from parallel\nand thus faster provisioning.  If you do not you can safely omit the installation of GNU parallel.  
If in doubt, do_\n_install `parallel` because you will most likely prefer to deploy via the `./deploy` script._\n\nInstall `parallel` on the _host_ machine:\n\n```shell\n# Mac\n# - Homebrew\n$ brew install parallel\n# - MacPorts\n$ sudo port install parallel\n\n# RHEL/CentOS/Fedora\n$ sudo yum install parallel\n$ sudo vi /etc/parallel/config  # and change '--tollef' to '--gnu'\n\n# Debian/Ubuntu\n$ sudo apt-get install parallel # requires Ubuntu 13.04; earlier versions may work, too\n$ sudo vi /etc/parallel/config  # and change '--tollef' to '--gnu'\n```\n\n\n\u003ca name=\"install-wirbelsturm\"\u003e\u003c/a\u003e\n\n## Install Wirbelsturm\n\nClone this repository and then bootstrap Wirbelsturm:\n\n```shell\n$ git clone https://github.com/miguno/wirbelsturm.git\n$ cd wirbelsturm\n$ ./bootstrap     # \u003c\u003c\u003c May take a while depending on how fast your Internet connection is.\n```\n\nThe bootstrapping step will prepare the local environment of your host machine so that it can properly run Wirbelsturm.\nThis includes, for instance, installing a compatible version of Ruby via rvm, required Ruby gems, Vagrant plugins and\nVagrant boxes, as well as any Puppet modules that are included in Wirbelsturm out of the box (see\n[Puppetfile](puppet/Puppetfile)).\n\nAdvanced users also have the option to skip the Ruby-related part of the bootstrapping process, e.g. if you prefer to\nstick to a different Ruby version.  Here, replace the `./bootstrap` command above with a sequence such as:\n\n```shell\n$ bundle install\n$ ./bootstrap --skip-ruby\n```\n\nThe bootstrapping step will also create a `wirbelsturm.yaml` from the included\n[wirbelsturm.yaml.template](wirbelsturm.yaml.template).  
This YAML configuration file controls which machines will be\nlaunched and what their configuration will be.\n\n\n\u003ca name=\"usage\"\u003e\u003c/a\u003e\n\n# Usage\n\n\n\u003ca name=\"deploying\"\u003e\u003c/a\u003e\n\n## Performing a deployment\n\n_This section uses the default configuration of Wirbelsturm as a running example._\n\nTo perform a local deployment on your host machine with the default settings you only need to run one of the\nfollowing two commands:\n\n```shell\n# Option 1 (recommended):\n#           Deploy with parallel provisioning (faster).\n#           Logs are stored under `provisioning-logs/`.\n$ ./deploy\n\n# Option 2: Deploy with sequential provisioning, using native Vagrant\n#           (You must use this if you haven't installed the `parallel` tool)\n$ vagrant up\n```\n\nThe `deploy` script is a simple wrapper for `vagrant up`.  In contrast to the standard `vagrant up` behavior\nit will speed up the deployment by running the _provisioning step_ in parallel.  The script will launch the cluster in\ntwo distinct phases:\n\n1. First, it will boot the virtual machines (but not provision them yet).  When deploying locally via VirtualBox this\n   step will _sequentially_ boot the VMs.  Other providers such as AWS support launching machines in parallel.\n2. Once all VMs are running it will then trigger provisioning (via Puppet) in parallel.\n\nThe script stores per-node provisioning log files under `provisioning-logs/`.  Existing log files are purged when you\nre-run `deploy`.\n\n_Tip: You can also re-run `deploy` if you just want to re-provision the cluster in parallel (e.g. because_\n_you changed a single configuration file) without destroying/recreating the virtual machines from scratch.  
This saves_\n_you a lot of time because recreating the VMs usually takes several minutes per VM._\n\nFeel free to run `vagrant status` while Vagrant is doing its magic to see which virtual machines are already running.\nNote that the \"running\" state of a VM only means that it is booted -- it does not necessarily mean it has already been\nfully provisioned.\n\nYou can also instruct Wirbelsturm/Vagrant to use a file other than the default `wirbelsturm.yaml`.  You only need to\nset the `WIRBELSTURM_CONFIG_FILE` environment variable appropriately.  This is one way of using Wirbelsturm to deploy\nmultiple environments (think: `wirbelsturm-testing.yaml` and `wirbelsturm-production.yaml`).\n\n    # Examples\n    $ WIRBELSTURM_CONFIG_FILE=/path/to/your/custom-wirbelsturm.yaml ./deploy\n    $ WIRBELSTURM_CONFIG_FILE=/path/to/your/custom-wirbelsturm.yaml vagrant status\n\n\n\u003ca name=\"access\"\u003e\u003c/a\u003e\n\n## Accessing deployed machines\n\nOnce the machines are up and running you can `vagrant ssh \u003chostname\u003e` into them.  You can get the list of available\nhostnames via `vagrant status`.\n\nBy default the `vagrant ssh` command will connect as the user `vagrant`.  This user has password-less sudo enabled so\nthat you can run privileged commands, install software, switch user ids, perform manual service restarts, etc.\n\n```shell\n# Example: ssh-connect to the nimbus1 machine\n$ vagrant ssh nimbus1\n```\n\nYou can also configure SSH port forwarding to access services that run on the deployed machines.  The default\nconfiguration of Wirbelsturm allows you to access the Storm UI running on `nimbus1` with your browser.  Note that it\nmight take up to a minute after provisioning is complete (e.g. after `./deploy` finishes) until the Storm UI is ready\nto use.  If in doubt just hit the reload button in your browser until it works. 
:-)\n\n* [http://localhost:28080/](http://localhost:28080/) -- Storm UI\n\nThe UI should provide you with a screen similar to the following.  In this screenshot you can also see that there is\none running topology called \"exclamation-topology\" (which will not be the case after a fresh restart of the cluster).\nIn the section _Submitting an example Storm topology_ I will walk you through submitting this topology to the cluster.\n\n![Storm UI Home Page](images/wirbelsturm_storm-ui-home.png?raw=true)\n\nYou can follow the section [Submitting an example Storm topology](#appendix-storm-topology) in the appendix to run your\nfirst hands-on data analysis with Storm.\n\n\n\u003ca name=\"shutdown\"\u003e\u003c/a\u003e\n\n## Shutting down the deployed environment\n\nTo take down the deployed machines you need to run:\n\n```shell\n# Will ask for confirmation for each machine\n$ vagrant destroy\n\n# Will take down machines without asking for any confirmation\n$ vagrant destroy -f\n```\n\nPlease refer to the [Vagrant documentation](http://docs.vagrantup.com/v2/provisioning/puppet_apply.html) for further\ndetails on how to work with Vagrant, notably its [command-line interface](http://docs.vagrantup.com/v2/cli/index.html)\n`vagrant`.\n\n\n\u003ca name=\"ansible\"\u003e\u003c/a\u003e\n\n## Ansible support\n\n### Ansible and Wirbelsturm\n\nWirbelsturm supports [Ansible](http://www.ansible.com/) to interact with deployed machines.  Note however that\nWirbelsturm uses Puppet -- not Ansible -- for provisioning the machines launched by Vagrant.  Wirbelsturm ships with\nan Ansible wrapper script aptly named [ansible](ansible) that pre-configures several Ansible settings (such as\ngenerating a [dynamic inventory](http://docs.ansible.com/intro_dynamic_inventory.html) of running machines by querying\nVagrant) so that Ansible works out of the box with Wirbelsturm/Vagrant.\n\n\n### Examples\n\nAnsible will only see _running_ machines, i.e. those reported by `vagrant status` as `running`.  
So before trying to\nplay with Ansible make sure that you have at least one machine up and running.\n\nHere are some examples on how to use Ansible with Wirbelsturm's [ansible](ansible) wrapper script.\n\n    # Show all running boxes\n    $ ./ansible all --list-hosts\n\n    # Ping all running boxes\n    $ ./ansible all -m ping\n\n    # Install 'tree' on the nimbus1 box\n    $ ./ansible nimbus1 -m shell -a 'yum install -y tree' --sudo\n\n    # Check the status of processes running under supervisord on all machines\n    $ ./ansible all -m shell -a 'supervisorctl status' --sudo\n\n\n\u003ca name=\"configuration\"\u003e\u003c/a\u003e\n\n# Configuration\n\nNow that you have played with Wirbelsturm and its default configuration you may want to create your own configuration.\nIn this section we will explain how to do just that.\n\nBefore we start let us highlight that most of the \"requirements\" or \"conventions\" discussed below are in fact not\nspecific to Wirbelsturm.  They are simply driven by the way Vagrant and Puppet/Hiera work.  As such the sections below\nare also a kind of quick introduction to the aforementioned tools and their usage.  Lastly, if you are already familiar\nwith Vagrant and/or Puppet there should be nothing in the next sections that will surprise you.\n\n\n\u003ca name=\"cfg-big-picture\"\u003e\u003c/a\u003e\n\n## The big picture\n\nThere are three key places in Wirbelsturm that you need to customize for your own deployments.\n\n1. **Defining which machines will be created**:\n   The file `wirbelsturm.yaml` (see [wirbelsturm.yaml.template](wirbelsturm.yaml.template)) controls the\n   creation of the machines in your deployment environment.  Here, you define the name of your environment, how\n   many machines will be launched, what their hostnames and \"roles\" are, etc.  
This information is subsequently used by\n   Puppet and Hiera to determine which Puppet manifests and Hiera configuration data will be applied to each machine.\n   `wirbelsturm.yaml` is automatically read by Vagrant when you run e.g.  `vagrant status` or `vagrant up`.  Note that\n   `wirbelsturm.yaml` _is not used_ of course when you are not using Wirbelsturm (and thus Vagrant) to launch your\n   machines -- for instance, if you deploy to existing bare-metal machines.\n2. **Defining which Puppet modules you require**:\n   Like many other Puppet-based setups Wirbelsturm uses [librarian-puppet](https://github.com/rodjek/librarian-puppet)\n   to manage the Puppet modules that are used for your deployments (similar to how tools such as Maven or Gradle manage\n   library dependencies in Java).  `librarian-puppet` takes over control of the `puppet/modules/` directory.  So\n   if you need additional Puppet modules for your deployments, need different versions of existing ones, or want to\n   remove modules, you only need to update [puppet/Puppetfile](puppet/Puppetfile) and then tell `librarian-puppet` to update\n   the modules under `puppet/modules/` via commands such as `librarian-puppet update` or\n   `librarian-puppet update \u003cmodule-name\u003e` (you must run `librarian-puppet` in the `puppet/` sub-directory).\n3. **Defining configuration data for Puppet via Hiera**:\n   Wirbelsturm performs the provisioning of machines in your deployment through Puppet.  And a Puppet best practice is\n   to create or use Puppet modules that support configuration through Hiera.  Roughly speaking, this means that a Puppet\n   module must expose all relevant configuration settings through Puppet class parameters.  The Hiera hierarchy of\n   Wirbelsturm is defined in [puppet/manifests/hiera.yaml](puppet/manifests/hiera.yaml), and the actual Hiera\n   configuration data is stored under [puppet/manifests/hieradata/](puppet/manifests/hieradata).  
Please take a look at\n   the existing content in those two places to get started with Hiera in Wirbelsturm.  If you are familiar with Hiera\n   you should notice that Wirbelsturm uses a straight-forward, typical Hiera setup.\n\nIn the next sections we discuss these three key places, and thereby machine creation and provisioning, in further\ndetail.\n\n\n\u003ca name=\"cfg-machine-creation\"\u003e\u003c/a\u003e\n\n## Defining which machines will be created\n\nThe cluster machines are defined in `wirbelsturm.yaml`.  See\n[wirbelsturm.yaml.template](wirbelsturm.yaml.template) for an example.\n\n[Vagrantfile](Vagrantfile) is set up to dynamically read the information in `wirbelsturm.yaml` to configure and launch\nthe virtual machines.\n\nWe want to highlight the following parameters in particular, because they influence the subsequent _provisioning_ of the\nmachines via Puppet:\n\n* The `environment` parameter in `wirbelsturm.yaml`:  The value of this parameter is made available to Puppet and\n  Hiera as the Puppet fact `node_env`.  So if you set `environment: foo`, for instance, then Wirbelsturm will\n  automatically inject the Puppet fact `node_env = 'foo'` into your machines.\n* The `node_role` parameter:  The value of this parameter is made available to Puppet and Hiera as the Puppet fact\n  `node_role`.\n* The `hostname_prefix` and `count` parameters:  These two parameters determine the hostname of a machine.  If, for\n  instance, `hostname_prefix: supervisor` and `count: 3`, then Wirbelsturm will launch three such machines and give them\n  the respective hostnames `supervisor1`, `supervisor2`, and `supervisor3`.  
The hostnames are made available to Puppet\n  as the Puppet fact `hostname`.\n\nThe environment name as well as the hostnames of machines are important parameters because you can use them to determine\nwhich Puppet manifests should be applied to a machine -- see [hiera.yaml](puppet/manifests/hiera.yaml).\n\n\u003ca name=\"cfg-provisioning\"\u003e\u003c/a\u003e\n\n## Provisioning\n\nWirbelsturm relies on Puppet and Hiera for provisioning.  As such the entry points for understanding provisioning are:\n\n* [hiera.yaml](puppet/manifests/hiera.yaml) -- defines the Hiera hierarchy\n* [hieradata/](puppet/manifests/hieradata/) -- the actual Hiera configuration data\n* [site.pp](puppet/manifests/site.pp) -- how we control our use of Hiera in Puppet\n* [Puppetfile](puppet/Puppetfile) -- the collection of Puppet modules used for a deployment;  managed through\n  [librarian-puppet](https://github.com/rodjek/librarian-puppet)\n\n_Machine creation settings must match provisioning settings:_\nWhen using Wirbelsturm/Vagrant for machine creation -- i.e. launching machines and such -- then what is defined in Hiera\nmust match what is defined in `wirbelsturm.yaml`;  otherwise the machines will be launched (via Vagrant) but not\nproperly installed and configured (via Puppet) once up and running.\n\n_Puppet modules should be configurable through Hiera:_\nIn Wirbelsturm it is strongly recommended that all Puppet modules that are used for a deployment (see\n[Puppetfile](puppet/Puppetfile)) expose any relevant configuration settings through Puppet class parameters.  Otherwise\nyou cannot use Hiera to inject configuration data into the Puppet manifests, and instead you must hardcode configuration\ndata into your Puppet manifest code.  We've been there, done that, realized it didn't work well or at all.  
Don't make\nthe same mistake we did.\n\n_Informing Puppet how configuration data is made available to Puppet manifests via Hiera:_\nWirbelsturm currently relies on three Puppet facts to determine which Puppet manifests should be applied to a machine:\n\n* `node_env`: The name of the deployment _environment_ (e.g. `default-environment`).  This Puppet fact can be used to\n  group settings that are shared across a deployment environment.  See also the previous section on machine creation.\n* `node_role`: The _role_ of the machine (e.g. `kafka_broker`).  See also the previous section on machine creation.\n* `hostname`: The hostname of the machine (e.g. `supervisor1`).  This is the hostname of the machine as returned by\n  standard Unix commands such as `hostname`.  See also the previous section on machine creation.\n\nSee [hiera.yaml](puppet/manifests/hiera.yaml) for the exact definition of how those facts are used to determine which\nHiera configuration data and thereby also which Puppet manifests should be applied to a machine.\n\nIn the next sections we cover _environments_ and _roles_ in further detail, including how they can be mixed and\nmatched.\n\n\n\u003ca name=\"cfg-puppet-modules\"\u003e\u003c/a\u003e\n\n### Defining which Puppet modules you require\n\nIn Wirbelsturm you manage the collection of Puppet modules you require for your deployment through the popular Puppet\ntool [librarian-puppet](https://github.com/rodjek/librarian-puppet).  In concrete terms that means you will add\nmodules to, or remove them from, [Puppetfile](puppet/Puppetfile).  
Once you have added, changed, or removed modules,\nyou must tell librarian-puppet to update its configuration.\n\nHere is an example workflow:\n\n    $ cd puppet/               # change to the puppet/ subfolder\n    $ vi Puppetfile            # add/modify/remove modules\n    $ librarian-puppet update  # MUST be run from inside the puppet/ subfolder!\n\nThat's all!\n\nSee [librarian-puppet](https://github.com/rodjek/librarian-puppet) for more information.\n\n\n\u003ca name=\"cfg-hiera\"\u003e\u003c/a\u003e\n\n### Defining configuration data for Puppet via Hiera\n\n#### Environments\n\nWirbelsturm has the notion of \"deployment environments\".  These environments are nothing fancy, they are simply a\nname and used to define settings that are shared across a number of machines in the same physical location or logical\nenvironment.  For instance, \"every machine in our `storm-production-nyc` environment should talk to _this_ ZooKeeper\nquorum\".\n\nAn environment can have multiple machines, but a machine can be assigned to _only one environment_.\n\n* Defining environments:  Environments are defined by creating a Hiera YAML file at\n  `environments/\u003cenvironment-name\u003e.yaml` (cf.  [environments/](puppet/manifests/hieradata/environments/)).\n* Assigning machines to environments:  You assign a machine to an environment by providing the Puppet fact `node_env` to\n  the machine.  In Wirbelsturm this is done by setting the `environment` parameter in `wirbelsturm.yaml`.\n* Resolving environments:  The names of environment Hiera YAML files under `environments/` are matched against the\n  `node_env` Puppet fact.  Vagrant injects this variable as a custom Puppet fact into the machine via\n  `FACTER_node_env=...`.  Unfortunately this custom fact is not persisted to the machine, e.g. 
you will not see it when\n  you manually run `facter` inside the machine.\n    * Example: If `node_env` is `storm-production-nyc`, then we look for a file\n      `environments/storm-production-nyc.yaml`.\n\nThe Hiera settings in the environment file are applied to each machine that is assigned to that environment.\n\nWirbelsturm ships with only one environment, the `default-environment`:\n\n* [default-environment](puppet/manifests/hieradata/environments/default-environment.yaml)\n\nYou can easily create your own environments by following this example and standard Puppet/Hiera practices.\n\nFor each machine you can also override the default environment settings through per-host Hiera YAML files\nat `environments/\u003cenvironment-name\u003e/hosts/\u003chostname\u003e.yaml`.\n\n* Example: `environments/storm-production-nyc/hosts/storm-slave-21.yaml`\n\n\n#### Roles\n\nYou can (and normally should) assign every machine a _role_.  Roles are used to define settings that are shared among\nmachines of the _same kind_.  For instance, every Kafka broker (= the machine's role) should normally look the same,\nregardless of whether it's deployed in Europe or in the US (= environment/location).  This is thus also the difference\nbetween _environments_ and _roles_ -- they have similar but distinct purposes.\n\n_Only one role_ can ever be assigned to a machine.  If a machine should have multiple roles, then you can work around\nthis restriction by creating a compound role -- e.g. 
by combining the logical roles `webserver` and `monitoring` into a\ncompound role `webserver_with_monitoring`.\n\n* Defining roles: Roles are defined by creating a Hiera YAML file at `roles/\u003crole\u003e.yaml` (cf.\n  [roles/](puppet/manifests/hieradata/roles/)).\n* Assigning roles to machines: A role is assigned to a machine by providing the Puppet fact `node_role` to the machine.\n  In Wirbelsturm this is done by setting the `node_role` parameter in `wirbelsturm.yaml`.\n* Resolving roles:  The names of role Hiera YAML files under `roles/` are matched against the `node_role` Puppet fact.\n  Vagrant injects this variable as a custom Puppet fact into the machine via `FACTER_node_role=...`.  Unfortunately\n  this custom fact is not persisted to the VM, e.g. you will not see it when you manually run `facter` inside the\n  machine.\n    * Example: If `node_role` is `kafka_broker`, then we look for a file `roles/kafka_broker.yaml`.\n\nThe Hiera settings in the role file are applied to each machine that is assigned to that role.\n\nWirbelsturm ships with several roles out of the box (see [hieradata/roles/](puppet/manifests/hieradata/roles/) for\nthe full list):\n\n* [kafka_broker](puppet/manifests/hieradata/roles/kafka_broker.yaml)\n* [redis_server](puppet/manifests/hieradata/roles/redis_server.yaml)\n* [storm_master](puppet/manifests/hieradata/roles/storm_master.yaml)\n* [storm_slave](puppet/manifests/hieradata/roles/storm_slave.yaml)\n* [zookeeper_server](puppet/manifests/hieradata/roles/zookeeper_server.yaml)\n\nYou can easily create your own roles by following those examples and standard Puppet/Hiera practices.\n\nFor each environment you can also override the default role settings through per-environment Hiera YAML files\nat `environments/\u003cenvironment-name\u003e/roles/\u003crole\u003e.yaml`.\n\n* Example: `environments/storm-production-nyc/roles/kafka_broker.yaml`.\n\n\n#### Combining environments and roles\n\nIn a typical Wirbelsturm setup you will usually combine 
environment-level Hiera settings and role-level Hiera settings.\nThis way you can compose exactly how machines should be deployed while minimizing duplication of configuration data.\n\n* Example: \"All Kafka brokers (= **role**) should normally look like _this_, but in our `kafka-production-nyc`\n  **environment** we need a different setting for _that_ particular configuration parameter.\"\n\nThe default Hiera hierarchy definition at [hiera.yaml](puppet/manifests/hiera.yaml) controls how this composition of\ndata exactly happens -- notably which values override which other values in case a configuration parameter is defined\nmore than once.  This is standard Hiera 101, by the way, and not specific to Wirbelsturm in any way.\n\n\n\u003ca name=\"supported-deployment-platforms\"\u003e\u003c/a\u003e\n\n# Supported deployment platforms\n\n\n\u003ca name=\"platform-support-overview\"\u003e\u003c/a\u003e\n\n## Platform support overview\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eTarget platform\u003c/th\u003e\n    \u003cth\u003eCode status\u003c/th\u003e\n    \u003cth\u003eDocumentation status\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eLocal deployment (VMs)\u003c/td\u003e\n    \u003ctd\u003e\u003cstrong\u003eReady\u003c/strong\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003cstrong\u003eReady\u003c/strong\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eAmazon AWS/EC2\u003c/td\u003e\n    \u003ctd\u003eBeta\u003c/td\u003e\n    \u003ctd\u003eBeta\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eOpenStack\u003c/td\u003e\n    \u003ctd\u003eIn progress\u003c/td\u003e\n    \u003ctd\u003eNot started\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nPlease refer to the individual platform sections below for detailed information.\n\n\n\u003ca name=\"deploying-locally\"\u003e\u003c/a\u003e\n\n## Local deployment (VMs)\n\nThis section covers scenarios where you instruct Wirbelsturm to 
run its machines locally as VMs on a host machine.\nFor further information please read the _Usage_ section above.\n\n\n### Host requirements\n\nThe \"host\" is the machine on which Wirbelsturm will start the virtual Storm cluster, i.e. the machine on which you\nrun `vagrant up` or `deploy`.\n\nThe minimum hardware requirements for running the [default configuration](#default-configuration) are:\n\n* 4 CPU cores\n* 8 GB of RAM\n* 20 GB of disk space\n\nMore is better, of course.\n\n\n### Telling Wirbelsturm to deploy locally\n\n    # Option 1: Sequential provisioning (native Vagrant)\n    $ vagrant up --provider=virtualbox\n\n    # Option 2: Parallel provisioning (Wirbelsturm wrapper script for `vagrant`)\n    #           Logs are stored under `provisioning-logs/`.\n    $ ./deploy --provider=virtualbox\n\n\n\u003ca name=\"wirbelsturm-less-deployment\"\u003e\u003c/a\u003e\n\n## Wirbelsturm-less deployment\n\nSeveral users have asked how they can re-use their existing Wirbelsturm setup that they created for local development\nand testing in order to deploy \"real\" environments, e.g. backed by a couple of bare-metal machines.  A different but\nrelated use case is situations where you cannot or are not allowed to use Wirbelsturm and/or Vagrant to deploy to\nnon-local environments (i.e. to anything but your local computer).  Of course, it is up to you then to manage the\nmachines (booting machines, configuring networking, etc.), which is normally taken care of by Wirbelsturm/Vagrant.\n\nSee the [Wirbelsturm-less deployment](docs/Wirbelsturm-less_deployment.md) documentation for details.\n\n\n\u003ca name=\"aws\"\u003e\u003c/a\u003e\n\n## Amazon AWS/EC2 (Beta)\n\nWirbelsturm supports deploying to AWS.  See our current [AWS documentation](docs/AWS.md) for details.\n\nHowever at this point you still need to perform a few one-time AWS preparation steps.  
And because this means our\nusers do not have the best possible AWS experience we decided to flag AWS support as \"beta\".  What does \"beta\" mean\nin this context?  It means that it is still possible at this point that we will perform a code refactoring to change\nour AWS support for the better -- and thus we may change the way Wirbelsturm users need to configure their AWS\ndeployments or how they interact with AWS through Wirbelsturm/Vagrant.  Since we first finished the\nAWS-related code some time has passed, and several upstream projects such as\n[vagrant-aws](https://github.com/mitchellh/vagrant-aws) have improved during that time.  Also, plugins such as\n[vagrant-hostmanager](https://github.com/smdahlen/vagrant-hostmanager) may allow us to stop using (and thus requiring\nyou to configure) Amazon Route 53 (but right now `vagrant-hostmanager` is not yet compatible with Vagrant 1.5).\n\nWe therefore think that we can further simplify the way you can use Wirbelsturm to deploy to AWS, even though this may\nmean we have to redo certain parts of the code, and even break backwards compatibility in some areas.\n\n\n\n\u003ca name=\"openstack\"\u003e\u003c/a\u003e\n\n## OpenStack\n\nThis section will eventually describe how to deploy to private and public OpenStack clouds.  Code contributions are\nwelcome!\n\n\n\u003ca name=\"supported-puppet-modules\"\u003e\u003c/a\u003e\n\n# Supported Puppet modules\n\n\u003ca name=\"puppet-wirbelsturm-compatibility\"\u003e\u003c/a\u003e\n\n## When is a Puppet module compatible with Wirbelsturm?\n\nIn general any Puppet module is compatible with Wirbelsturm.  Yes, _any_ module.\n\nHowever, we strongly recommend writing or using Puppet modules that expose their relevant configuration settings\nthrough _class parameters_.  This decouples the module's logic (code/manifests) from its configuration, and thereby\nallows you to configure the module's behavior through [Hiera](http://docs.puppetlabs.com/hiera/1/).  
This Puppet\nrecommendation is not specific to Wirbelsturm -- in fact, you will often (always?) want to follow this best practice\nevery time you write or use a Puppet module for your deployments.\n\nIf your favorite Puppet module does not follow this style, you can of course still use it in Wirbelsturm.  However\nin this case you will most likely have to fork/modify the module whenever your configuration requirements change.\nOr write \"adapter\" Puppet modules that wrap the original one.  Or...you get the idea.  Whatever workaround you pick, it\nwill usually not make you perfectly happy.  But then again, for tasks such as quick prototyping or when you're under\npressure it is acceptable to \"just get it done\".  Just be aware that most likely you're adding technical debt to your\nsetup.\n\n\n\u003ca name=\"available-puppet-modules\"\u003e\u003c/a\u003e\n\n## Available Puppet modules\n\nThe following table shows a _non-comprehensive list_ of Puppet modules that are known to work well with Wirbelsturm,\nwhere \"well\" means they can be configured through Hiera.  As we said in the previous section _any_ Puppet module can be\nused in Wirbelsturm, it just happens that some will make your life easier than others.  
So treat the table below merely\nas a nice starting point, but not as an exclusive listing.\n\nYou will find more Puppet modules on [PuppetForge](https://forge.puppetlabs.com/) and [GitHub](https://github.com/).\n\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eModule name\u003c/th\u003e\n    \u003cth\u003eDescription\u003c/th\u003e\n    \u003cth\u003eMust build RPM*\u003c/th\u003e\n    \u003cth\u003eIncluded in node role**\u003c/th\u003e\n    \u003cth\u003eBuild status\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-diamond\"\u003epuppet-diamond\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"https://github.com/BrightcoveOS/Diamond\"\u003eDiamond\u003c/a\u003e, a Python-based tool that collects system\n      metrics and publishes those to Graphite.\n    \u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/wirbelsturm-rpm-diamond\"\u003eYes\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"puppet/manifests/hieradata/roles/monitoring.yaml\"\u003emonitoring\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://travis-ci.org/miguno/puppet-diamond\"\u003e\u003cimg src=\"https://travis-ci.org/miguno/puppet-diamond.png?branch=master\" alt=\"Build Status\" /\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-kafka\"\u003epuppet-kafka\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"http://kafka.apache.org/\"\u003eApache Kafka\u003c/a\u003e 0.8.x, a high-throughput distributed messaging\n      system.\n    \u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/wirbelsturm-rpm-kafka\"\u003eYes\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"puppet/manifests/hieradata/roles/kafka_broker.yaml\"\u003ekafka_broker\u003c/a\u003e\u003c/td\u003e\n    
\u003ctd\u003e\n      \u003ca href=\"https://travis-ci.org/miguno/puppet-kafka\"\u003e\u003cimg src=\"https://travis-ci.org/miguno/puppet-kafka.png?branch=master\" alt=\"Build Status\" /\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-graphite\"\u003epuppet-graphite\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"http://graphite.wikidot.com/\"\u003eGraphite\u003c/a\u003e 0.9.x, a monitoring-related tool for storing and\n      rendering time-series data.\n    \u003c/td\u003e\n    \u003ctd\u003eNo\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"puppet/manifests/hieradata/roles/monitoring.yaml\"\u003emonitoring\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003en/a\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-redis\"\u003epuppet-redis\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003eDeploys \u003ca href=\"http://redis.io/\"\u003eRedis\u003c/a\u003e 2.8+, a key-value store.\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/wirbelsturm-rpm-redis\"\u003eYes\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"puppet/manifests/hieradata/roles/redis_server.yaml\"\u003eredis_server\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://travis-ci.org/miguno/puppet-redis\"\u003e\u003cimg src=\"https://travis-ci.org/miguno/puppet-redis.png?branch=master\" alt=\"Build Status\" /\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-storm\"\u003epuppet-storm\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"http://storm.apache.org/\"\u003eApache Storm\u003c/a\u003e 0.9.x, a distributed real-time computation\n      system.\n    \u003c/td\u003e\n    \u003ctd\u003e\u003ca 
href=\"https://github.com/miguno/wirbelsturm-rpm-storm\"\u003eYes\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"puppet/manifests/hieradata/roles/storm_master.yaml\"\u003estorm_master\u003c/a\u003e,\n      \u003ca href=\"puppet/manifests/hieradata/roles/storm_slave.yaml\"\u003estorm_slave\u003c/a\u003e,\n      \u003ca href=\"puppet/manifests/hieradata/roles/storm_single.yaml\"\u003estorm_single\u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://travis-ci.org/miguno/puppet-storm\"\u003e\u003cimg src=\"https://travis-ci.org/miguno/puppet-storm.png?branch=master\" alt=\"Build Status\" /\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-supervisor\"\u003epuppet-supervisor\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"http://www.supervisord.org/\"\u003eSupervisord\u003c/a\u003e 3.x, a process control system (process\n      supervisor).\n    \u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/wirbelsturm-rpm-supervisord\"\u003eYes\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003eIncluded in most node roles.\u003c/td\u003e\n    \u003ctd\u003en/a\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\u003ca href=\"https://github.com/miguno/puppet-zookeeper\"\u003epuppet-zookeeper\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      Deploys \u003ca href=\"http://zookeeper.apache.org/\"\u003eApache ZooKeeper\u003c/a\u003e 3.4.x, a centralized service for maintaining\n      configuration information, naming, providing distributed synchronization, and providing group services.\n    \u003c/td\u003e\n    \u003ctd\u003eNo\u003c/td\u003e\n    \u003ctd\u003e\u003ca href=\"puppet/manifests/hieradata/roles/zookeeper_server.yaml\"\u003ezookeeper_server\u003c/a\u003e\u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca 
href=\"https://travis-ci.org/miguno/puppet-zookeeper\"\u003e\u003cimg src=\"https://travis-ci.org/miguno/puppet-zookeeper.png?branch=master\" alt=\"Build Status\" /\u003e\u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n_(*) You must build an RPM for this software yourself because a suitable official package is not available._\n\n\u003cem\u003e\n(**) You can use these existing roles directly for the `node_role` parameter in your `wirbelsturm.yaml`.\nOf course you can modify existing node roles or define your own.\n\u003c/em\u003e\n\n\n\u003ca name=\"known-issues\"\u003e\u003c/a\u003e\n\n# Known issues and limitations\n\n## ZooKeeper server fails to join quorum because of \"UnknownHostException\"\n\n_This issue only affects ZK quorum deployments.  Standalone ZK deployments are not affected._\n\nThis issue is caused by a known bug in ZooKeeper 3.4+ that, as of October 2014, is not yet fixed:\n\n* [ZOOKEEPER-1846](https://issues.apache.org/jira/browse/ZOOKEEPER-1846):\n  Cached InetSocketAddresses prevent proper dynamic DNS resolution\n\nUnfortunately this issue is very reliably triggered when using Vagrant (and thus Wirbelsturm) to deploy ZK quorums to\nlocal VMs. 
:-(\n\nYou can quickly test whether your deployment is affected via the following Ansible command, which sends the\n[stat](http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#The+Four+Letter+Words) Four Letter Command to all\nZK servers:\n\n    # Here we test whether each `zookeeper*` machine has joined the quorum\n    $ ./ansible zookeeper* -m shell -a 'echo stat | nc 127.0.0.1 2181'\n\nA negative \"failure\" reply includes the string \"This ZooKeeper instance is not currently serving requests\", which means\nthe ZK server has not joined the quorum, which typically indicates that it is affected by the ZK issue described here:\n\n    zookeeper3 | success | rc=0 \u003e\u003e\n    This ZooKeeper instance is not currently serving requests\n\nIn comparison, a positive \"success\" message looks as follows:\n\n    zookeeper3 | success | rc=0 \u003e\u003e\n    Zookeeper version: 3.4.5-cdh4.7.0--1, built on 05/28/2014 16:33 GMT\n    Clients:\n     /127.0.0.1:48714[0](queued=0,recved=1,sent=0)\n\n    Latency min/avg/max: 0/0/0\n    Received: 1\n    Sent: 0\n    Connections: 1\n    Outstanding: 0\n    Zxid: 0x100000000\n    Mode: follower   # \u003c\u003c\u003c this ZK server has joined the quorum as a follower\n    Node count: 4\n\nAnother telling sign is the presence of `java.net.UnknownHostException` errors in the ZK log files:\n\n    $ ./ansible zookeeper* -m shell -a 'grep java.net.UnknownHostException /var/log/zookeeper/zookeeper.log | tail'\n    zookeeper3 | success | rc=0 \u003e\u003e\n    java.net.UnknownHostException: zookeeper2\n\nBut what is going on here?  This ZK issue is triggered when the ZK process is started at a time when the hostnames of\nsome ZK quorum members (here: `zookeeper2`) are not resolvable, and due to the ZK bug above (`InetSocketAddress` being\ncached forever) the ZK process is not able to recover from this condition.\n\nCurrently the only remedy is to restart the problematic ZK process, i.e. the one that is complaining about unknown\nhosts.  
You can use the [ansible](ansible) wrapper script in Wirbelsturm to trigger such restarts:\n\n    # Restart the ZK process on `zookeeper3`\n    $ ./ansible zookeeper3 -m shell -a 'supervisorctl restart zookeeper' --sudo\n\n    # Restart the ZK processes on all ZK machines\n    $ ./ansible zookeeper* -m shell -a 'supervisorctl restart zookeeper' --sudo\n\n\n\u003ca name=\"faq\"\u003e\u003c/a\u003e\n\n# Frequently Asked Questions\n\n## Wirbelsturm?\n\n\"Wirbelsturm\" is German for [Whirlwind](http://en.wikipedia.org/wiki/Whirlwind), which is a kind of storm.  Originally\nwe built Wirbelsturm with the sole intent to conveniently deploy Storm clusters, but the name stuck as we moved along.\n\n\n## Define exact versions of software to be installed?\n\nWhat needs to be done to, say, tell Wirbelsturm (via Puppet) that you want to install Storm version\n`0.9.2-incubating` specifically depends on the Puppet modules you use.\n\nThe Puppet modules included in Wirbelsturm use Hiera for configuration, so here you must update Hiera data to configure\nwhich exact version of Storm should be installed.\n\nThe following Hiera snippet uses [puppet-storm](https://github.com/miguno/puppet-storm) as an example to show how you\ntell Wirbelsturm to install Storm version `0.9.2-incubating` when deploying the default environment in Wirbelsturm:\n\n```yaml\n# In puppet/manifests/hieradata/environments/default-environment.yaml\n\n# puppet-storm exposes the `package_ensure` parameter, which allows you to define which version\n# of the Storm RPM package should be installed.\n# See also https://docs.puppetlabs.com/references/latest/type.html#package-attribute-ensure.\n#\n# NOTE: The name of this parameter may be different across the Puppet modules you use,\n#       and some Puppet modules may not even support such a parameter at all (ours do).\nstorm::package_ensure: '0.9.2_incubating-1.miguno'\n```\n\nYou can find out which exact version identifier (here: `0.9.2_incubating-1.miguno`) you need 
by inspecting the RPM\npackage that is used to install the software:\n\n```\n$ rpm -qpi storm-0.9.2_incubating.el6.x86_64.rpm\nName        : storm                        Relocations: /opt/storm\nVersion     : 0.9.2_incubating                  Vendor: Storm Project\nRelease     : 1.miguno                      Build Date: Mon Jun 30 12:03:16 2014\nInstall Date: (not installed)            Build Host: build1\nGroup       : default                       Source RPM: storm-0.9.2_incubating-1.miguno.src.rpm\nSize        : 22881927                         License: unknown\nSignature   : RSA/SHA1, Mon Jun 30 12:08:14 2014, Key ID b31f46760aa7be3f\nPackager    : \u003cmichael@michael-noll.com\u003e\nURL         : http://storm-project.net\nSummary     : Distributed real-time computation system\nArchitecture: x86_64\nDescription :\nDistributed real-time computation system\n```\n\nIn the example above, you would combine the \"Version\" and the \"Release\" fields, and use the result as the value of\n`storm::package_ensure`.\n\n\n## Where to start reading the code?\n\nWirbelsturm is based on Vagrant.  This means the entry point for the code is [Vagrantfile](Vagrantfile).  The\nPuppet-related provisioning code is in [manifests/](puppet/manifests/) and, with regard to the included Puppet modules,\n[Puppetfile](puppet/Puppetfile).  Here you should start reading at [manifests/site.pp](puppet/manifests/site.pp) and\n[manifests/hiera.yaml](puppet/manifests/hiera.yaml).\n\n\n## Increase logging output of Vagrant?\n\nSet the environment variable `VAGRANT_LOG` accordingly.  Example:\n\n```shell\n$ VAGRANT_LOG=debug vagrant up\n```\n\n## Share files between the host and the guest machines?\n\n### Option 1: synced folder\n\nTo upload or download files you only need to place them in the `shared/` directory (host) or the `/shared` directory\n(guests aka virtual machines).  Vagrant automatically syncs the contents of these folders.  
For instance, if you\ncreate the file `shared/foo` on the host then all the cluster machines can access this file via `/shared/foo`.\nSynced files are readable AND writable from all machines.\n\nNote that:\n\n* When using VirtualBox as the provider, changes to the synced folder are instantaneous.\n* When using AWS as the provider, changes to the synced folder require another provisioning run (which triggers rsync).\n\n\n### Option 2: vagrant-scp.sh\n\nYou can use the `scp` wrapper script [vagrant-scp.sh](sh/vagrant-scp.sh) to transfer files between the host machine\nand a guest machine (copying directly from guest to guest is not supported).\n\n```shell\n# Upload from the host machine to a guest machine\n$ sh/vagrant-scp.sh /path/to/file/on/host/foo.txt nimbus1:/tmp\n\n# Download from a guest machine to the host machine\n$ sh/vagrant-scp.sh nimbus1:/tmp/bar.txt .\n```\n\n## Force shutdown of VirtualBox VM when 'vagrant destroy' fails?\n\nOn rare occasions Vagrant may fail to destroy (shut down) a VirtualBox VM.  The error message will be similar\nto:\n\n    There was an error while executing `VBoxManage`, a CLI used by Vagrant\n    for controlling VirtualBox. The command and stderr is shown below.\n\n    Command: [\"unregistervm\", \"964f02c5-b368-44c8-840e-f47f90979791\", \"--delete\"]\n\n    Stderr: VBoxManage: error: Cannot unregister the machine 'wirbelsturm_supervisor2_1377594297' while it is locked\n    VBoxManage: error: Details: code VBOX_E_INVALID_OBJECT_STATE (0x80bb0007), component Machine, interface IMachine, callee nsISupports\n\nYou can force the shutdown by executing the following steps:\n\n1. Kill the VBoxHeadless process of the problematic VirtualBox VM.\n\n        # Find process ID\n        $ ps axu | grep VBoxHeadless | grep \u003cvm-hostname\u003e\n        $ kill \u003cID\u003e\n\n2. 
Then run `vagrant destroy \u003cvm-hostname\u003e`.\n\n\n## vagrant-hosts does not recognize Vagrant version?\n\nAfter an upgrade of Vagrant you may see the following error when running `deploy`, `vagrant up` or `vagrant provision`:\n\n    [redis1] Running provisioner: hosts...\n    1.3.1 isn't a recognized Vagrant version, vagrant-hosts can't reliably\n    detect the `change_host_name` method.\n\nIn almost all cases this problem can be solved by installing (updating) the latest version of the `vagrant-hosts`\nplugin.\n\n    $ vagrant plugin install vagrant-hosts\n\n\n## Get EC2 information about a guest machine when deploying to AWS?\n\nExample:\n\n```shell\n$ vagrant awsinfo -m nimbus1\n```\n\n\n## Bootstrap fails while compiling Ruby\n\nOn Mac OS X you may run into the following error when running `./bootstrap`:\n\n    ruby-1.9.3-p362 - #compiling....................\n    Error running '__rvm_make -j 1',\n    showing last 15 lines of /Users/brady.doll/.rvm/log/1409157940_ruby-1.9.3-p362/make.log\n    f_rational_new_no_reduce1(VALUE klass, VALUE x)\n    ^\n    6 warnings generated.\n    compiling re.c\n    compiling regcomp.c\n    compiling regenc.c\n    compiling regerror.c\n    compiling regexec.c\n    compiling regparse.c\n    regparse.c:582:15: error: implicit conversion loses integer precision: 'st_index_t' (aka 'unsigned long') to 'int' [-Werror,-Wshorten-64-to-32]\n        return t-\u003enum_entries;\n        ~~~~~~ ~~~^~~~~~~~~~~\n    1 error generated.\n    make: *** [regparse.o] Error 1\n    ++ return 2\n    There has been an error while running make. Halting the installation.\n    Installing bundler...\n    ERROR:  While executing gem ... (Gem::FilePermissionError)\n        You don't have write permissions for the /Library/Ruby/Gems/2.0.0 directory.\n    Installing gems (if any)\n    bash: line 200: bundle: command not found\n    Thanks for using ruby-bootstrap.  
Happy hacking!\n    ruby-1.9.3-p362 is not installed.\n    To install do: 'rvm install ruby-1.9.3-p362'\n    Checking Vagrant environment...\n    Checking for Vagrant: OK\n\n    \u003crest removed\u003e\n\nThe following steps may fix the problem.\n\n1. Install [Homebrew](http://brew.sh/) or [MacPorts](http://www.macports.org/), and then run:\n\n   Homebrew:\n\n        $ brew update\n        $ brew tap homebrew/dupes\n        $ brew install apple-gcc42\n\n   MacPorts:\n\n        $ sudo port selfupdate\n        $ sudo port install apple-gcc42\n\n2. Compile Ruby manually:\n\n        $ CC=/opt/local/bin/gcc-apple-4.2 rvm install ruby-1.9.3-p362 --enable-shared --without-tk --without-tcl\n\n3. Re-run `./bootstrap` -- the install should complete successfully now.\n\nSee [Error running Bootstrap on Mac OSX 10.9](https://github.com/miguno/wirbelsturm/issues/19) for details.\n\n\n## Run on Dell desktop computers?\n\nYou may need to tweak the BIOS settings of Dell desktop computers to allow the execution of 64-bit VMs.\n\n* Under _Performance_, set _Virtualization_ to **On** (factory default is Off)\n* Set _VT for Direct I/O_ to **On** (factory default is Off)\n\n\n## \"Malformed version number string\" after upgrade to Vagrant 1.5.x\n\nYou may run into the following error when upgrading from Vagrant 1.4.x to 1.5.x:\n\n    /Applications/Vagrant/embedded/lib/ruby/2.0.0/rubygems/version.rb:191:in `initialize': Malformed version number string aws (ArgumentError)\n\nMost likely this means your Vagrant upgrade did not succeed for some reason.  One indication is that the file\n`$HOME/.vagrant.d/setup_version` contains the content `1.5` instead of `1.1\\n`.\n\nThe following command fixes this problem:\n\n    echo \"1.1\" \u003e $HOME/.vagrant.d/setup_version\n\nNow you can try re-running Vagrant.  
See the discussion at [Can't start my VM on Vagrant 1.5.1](https://github.com/mitchellh/vagrant/issues/3195) for\ndetails.\n\n\n\u003ca name=\"how-it-works\"\u003e\u003c/a\u003e\n\n# How it works\n\n## Main configuration of Wirbelsturm\n\nThe main configuration file is `wirbelsturm.yaml` (see [wirbelsturm.yaml.template](wirbelsturm.yaml.template)).\nThis configuration file defines the various machines, their roles and additional information such as how many of\neach you want to deploy.\n\nWe introduced the `wirbelsturm.yaml` file to simplify the deployment of many machines of the same type.  For instance,\nhere's how you can change your deployment to run 30 instead of 2 Storm slave machines:\n\n```yaml\n# wirbelsturm.yaml: run 2 Storm slaves\nnodes:\n  storm_slave:\n      count: 2\n  ...\n```\n\n```yaml\n# wirbelsturm.yaml: run 30 Storm slaves\nnodes:\n  storm_slave:\n      count: 30     # \u003c\u003c\u003c changing 2 to 30 is all it takes\n  ...\n```\n\nIn native Vagrant you would have to copy-paste nearly identical configuration sections 30x in `Vagrantfile`, in which\nonly the hostname and IP address would change.\n\n\n## Passing Wirbelsturm configuration to Vagrant\n\nWe use a custom Ruby module [wirbelsturm.rb](lib/wirbelsturm.rb) that parses the `wirbelsturm.yaml` configuration file\nand hands over this data to Vagrant's [Vagrantfile](Vagrantfile).  Vagrant will launch the defined machines and will\nthen use Puppet to provision them once they have booted.\n\n\n## Masterless and nodeless Puppet setup\n\nOur Puppet setup is _master-less_ (no Puppet Master used) and _node-less_.\n\nOne reason to go with a master-less setup was that we have\none less dependency (Puppet Master) to worry about.  
Also, going without a Puppet Master means we do not have to scale\nthe Puppet Master or make it highly available.\n\nThe nodeless approach is described at\n[puppet-examples/nodeless-puppet](https://github.com/jordansissel/puppet-examples/tree/master/nodeless-puppet/).\n\"Nodeless\" means that we are not making use of Puppet's\n[node definitions](http://docs.puppetlabs.com/puppet/2.7/reference/lang_node_definitions.html), which have the form\n`node 'nimbus1' { ... }`.  Instead, Wirbelsturm relies on Puppet's so-called _facts_ to define the _role_ of a\nmachine (through `wirbelsturm.yaml`) and thus which Puppet code is applied to the machine.  These roles determine\nwhich Puppet manifests and which [Hiera configuration data](puppet/manifests/hiera.yaml) will be applied to a machine\n(\"If a machine has the role 'webserver', then do X, Y, and Z.\").  One benefit of not using node definitions is that we are\nnot coupling the hostname of machines to their purpose (read: role).\n\nUnder the hood we are using Vagrant's feature of adding the required\n[custom Puppet facts](http://docs.vagrantup.com/v2/provisioning/puppet_apply.html) such as their role and the name of\nthe deployment environment to the machines.  In the case of deploying to AWS we are also adding the same information\nto the EC2 tags of the instances.  This facilitates identifying and working with the instances on the EC2 console.\n\n_Note that you will not see Vagrant-injected custom Puppet facts when you run `facter` on a guest machine.  The_\n_custom fact is only available as a variable to the Puppet manifests/modules._\n\n\n## DNS configuration\n\nWirbelsturm uses the Vagrant plugin [vagrant-hosts](https://github.com/adrienthebo/vagrant-hosts) to manage DNS settings\nconfigured in `/etc/hosts` on the cluster machines.  This only works for the VirtualBox provider though.  
Wirbelsturm\nuses a different approach for the DNS configuration ([Route 53](http://aws.amazon.com/route53/)) when deploying to\nAmazon AWS.\n\n\n## RPM packages\n\nPuppet works best when software is installed as `.rpm` (RHEL family) or `.deb` (Debian family) packages instead of\n(say) tarballs.\n\nPreferably one would use only official software packages, such as those provided by the official RHEL/CentOS\nrepositories, [EPEL](https://fedoraproject.org/wiki/EPEL) or binary releases of the upstream software projects.\nUnfortunately a number of software projects we want to deploy (e.g. Kafka, Storm) do not provide such RPM packages yet.\nFor this reason we create our own RPMs where needed, and also release the packaging code (see e.g.\n[wirbelsturm-rpm-kafka](https://github.com/miguno/wirbelsturm-rpm-kafka)).\n\n## Yum repositories\n\nWe host our custom RPMs, where needed, in a public yum repository for the convenience of Wirbelsturm users.  However we\nwant to become neither a third-party package maintainer nor a third-party repository, so this practice is likely to\nchange.\n\nWe therefore strongly recommend that you manage your own RPM packages and associated yum repositories, particularly\nwhen you are performing production deployments.\n\n\n\u003ca name=\"wishlist\"\u003e\u003c/a\u003e\n\n# Wishlist\n\nA non-comprehensive list of features we are still considering adding to Wirbelsturm.\n\n* Puppet:\n    * Investigate how we can easily reverse/clean up roles from a machine if a role does not apply anymore\n      (cf. 
[nodeless-puppet](https://github.com/jordansissel/puppet-examples/tree/master/nodeless-puppet/)).\n      Most of the work in that regard would need to happen on the side of the actual Puppet modules though.\n* Amazon AWS\n    * Investigate whether we want to support deployments to Amazon VPC environments, too.\n    * Reduce SOA/TTL for Route53 entries to reduce \"startup\" time for DNS?\n        * See [Creating A Records Dynamically, Can't Ping Them](https://forums.aws.amazon.com/thread.jspa?messageID=298775):\n          \"Be aware that based on your current SOA record negative responses will be cached for 5 minutes.\"\n\n\n\u003ca name=\"appendix\"\u003e\u003c/a\u003e\n\n# Appendix\n\n\n\u003ca name=\"appendix-storm-topology\"\u003e\u003c/a\u003e\n\n## Submitting an example Storm topology\n\n_Note: The instructions below are subject to change._\n\nOnce you have a Storm cluster up and running you can submit your first Storm topology.  We will use an example topology\nfrom [storm-starter](https://github.com/apache/storm/tree/master/examples/storm-starter) to run a first Storm topology\nin the cluster.\n\nFirst you will need to install [Apache Maven](http://maven.apache.org/):\n\n```shell\n# Homebrew\n$ brew install maven\n# MacPorts\n$ sudo port install maven3\n# sudo port select --set maven maven3\n```\n\n```shell\n$ cd /tmp\n# Clone Storm\n$ git clone https://github.com/apache/storm.git\n\n# At this point you may want to perform a checkout of the exact version of Storm\n# that is running in Wirbelsturm (or your \"real\" Storm cluster).\n#\n# The Storm team uses git tags to label release versions.  
The following command,\n# for example, checks out the code for Storm 0.9.3:\n#\n#     $ git checkout v0.9.3\n#\n# You can list all available tags by running `git tag`.\n\n# Build Storm\n$ cd storm\n$ mvn clean install -DskipTests=true\n\n# Build the storm-starter example\n$ cd examples/storm-starter\n$ mvn compile exec:java -Dstorm.topology=storm.starter.WordCountTopology\n$ mvn package\n```\n\nThe last command `mvn package` will create a jar file of the storm-starter code at the following location:\n\n    target/storm-starter-{version}-jar-with-dependencies.jar\n\nWe can now use this jar file to submit and run the `ExclamationTopology` in our Storm cluster.  But first we must make\nthis jar file available to the cluster machines.  To do so you must copy the jar file to the `shared/` folder on the\nhost machine.  This folder is mounted automatically in each virtual machine under `/shared` (note the leading slash).\n\n_Note: The version number might be different for you; if so, update the commands to match. In the following examples we will use\nversion 0.9.3-SNAPSHOT._\n\n```shell\n# Run the following command on the host machine in the Wirbelsturm base directory\n# (i.e. where Vagrantfile is)\n$ cp /tmp/storm/examples/storm-starter/target/storm-starter-0.9.3-SNAPSHOT-jar-with-dependencies.jar shared/\n```\n\nFor this example we will submit the topology from the `nimbus1` machine.  That being said you can use any cluster\nmachine on which Storm is installed.\n\n```shell\n$ vagrant ssh nimbus1\n[vagrant@nimbus1 ~]$ /opt/storm/bin/storm jar \\\n                        /shared/storm-starter-0.9.3-SNAPSHOT-jar-with-dependencies.jar \\\n                        storm.starter.ExclamationTopology exclamation-topology\n```\n\nThe `storm jar` command submits a topology to the cluster.  It instructs Storm to launch the `ExclamationTopology`\nin distributed mode in the cluster.  
Behind the scenes the storm-starter jar file is distributed by the Nimbus daemon\nacross the Storm slave nodes.\n\nYou can now open the Storm UI and check how your topology is doing:\n\n* [http://localhost:28080/](http://localhost:28080/) -- Storm UI\n\nTo kill the topology either use the Storm UI or run the `storm` CLI tool:\n\n```shell\n$ /opt/storm/bin/storm kill exclamation-topology\n```\n\nFor more details please refer to\n[Running a Multi-Node Storm Cluster](http://www.michael-noll.com/tutorials/running-multi-node-storm-cluster/#submitting-the-topology-to-the-cluster).\n\n\n\u003ca name=\"changelog\"\u003e\u003c/a\u003e\n\n# Change log\n\nSee [CHANGELOG](CHANGELOG.md).\n\n\n\u003ca name=\"license\"\u003e\u003c/a\u003e\n\n# License\n\nCopyright © 2013-2014 Michael G. Noll\n\nSee [LICENSE](LICENSE) for licensing information.\n\n\n\u003ca name=\"contributing\"\u003e\u003c/a\u003e\n\n# Contributing to Wirbelsturm\n\nAll contributions are welcome: ideas, documentation, code, patches, bug reports, feature requests etc.  And you don't\nneed to be a programmer to speak up!\n\nIf you are new to GitHub please read [Contributing to a project](https://help.github.com/articles/fork-a-repo) for how\nto send patches and pull requests to Wirbelsturm.\n\n\n\u003ca name=\"credits\"\u003e\u003c/a\u003e\n\n# Credits\n\nWe want to thank the creators of [Vagrant](http://www.vagrantup.com/) and [Puppet](https://puppet.com/) in\nparticular, and also the open source community in general.  Wirbelsturm is only a thin integration layer between those\ntools, and _none_ of the features that Wirbelsturm provides would be possible without those existing tools.  
Many thanks\nto all of you!\n\n* [deploy](deploy) is based on [para-vagrant.sh](https://github.com/joemiller/sensu-tests/blob/master/para-vagrant.sh)\n  by Joe Miller.\n\nSee also our [NOTICE](NOTICE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmiguno%2Fwirbelsturm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmiguno%2Fwirbelsturm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmiguno%2Fwirbelsturm/lists"}