{"id":21280277,"url":"https://github.com/zalando-stups/planb-cassandra","last_synced_at":"2025-07-11T09:31:54.337Z","repository":{"id":66771341,"uuid":"51748164","full_name":"zalando-stups/planb-cassandra","owner":"zalando-stups","description":"Plan B Cassandra for STUPS/AWS with static IPs","archived":false,"fork":false,"pushed_at":"2020-10-05T12:02:38.000Z","size":600,"stargazers_count":27,"open_issues_count":40,"forks_count":18,"subscribers_count":22,"default_branch":"master","last_synced_at":"2025-04-06T02:33:37.054Z","etag":null,"topics":["aws","cassandra","ec2"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zalando-stups.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-15T10:29:09.000Z","updated_at":"2023-02-13T12:18:57.000Z","dependencies_parsed_at":null,"dependency_job_id":"0e0b6ea1-301d-4663-bb4e-fc32aa5ec3d4","html_url":"https://github.com/zalando-stups/planb-cassandra","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/zalando-stups/planb-cassandra","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-stups%2Fplanb-cassandra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-stups%2Fplanb-cassandra/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-stups%2Fplanb-cassandra/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-stups%2Fplanb-cassandra/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zalando-stups","download_url":"https://codeload.github.com/zalando-stups/planb-cassandra/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zalando-stups%2Fplanb-cassandra/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264777752,"owners_count":23662552,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","cassandra","ec2"],"created_at":"2024-11-21T10:29:06.715Z","updated_at":"2025-07-11T09:31:54.325Z","avatar_url":"https://github.com/zalando-stups.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"================\nPlan B Cassandra\n================\n\nBootstrap and update a Cassandra cluster on STUPS_/AWS.\n\nPlanb deploys Cassandra by means of individual EC2 instances running Taupage_ \u0026 Docker with the latest\nCassandra version 3.0.x (default; the new 'tick-tock' releases 3.x and older 2.x versions\nare still available).\n\nFeatures:\n\n* internal to a VPC or span multiple AWS regions\n* fully-automated setup including Elastic IPs (when needed), EC2 security groups, SSL certs\n* multi-region replication available (using Ec2MultiRegionSnitch_)\n* encrypted inter-node communication (SSL/TLS)\n* `EC2 Auto Recovery`_ enabled\n* Jolokia_ agent to expose JMX metrics via HTTP\n\nNon-Features:\n\n* dynamic cluster sizing - please see `STUPS Cassandra`_ if you need a dynamic Cassandra cluster setup\n\n\nPrerequisites\n==============\n\n* Python 3.5+\n* Python 
dependencies (``sudo pip3 install -r requirements.txt``)\n* Java 8 with ``keytool`` in your ``PATH`` (required to generate SSL certificates)\n* Latest Stups tooling installed and configured\n* You have created a dedicated AWS IAM user for auto-recovery.  The policy\n  document for this user should look like the following::\n\n    {\n        \"Version\": \"2012-10-17\",\n        \"Statement\": [\n            {\n                \"Effect\": \"Allow\",\n                \"Action\": [\n                    \"ec2:DescribeInstanceRecoveryAttribute\",\n                    \"ec2:RecoverInstances\",\n                    \"ec2:DescribeInstanceStatus\",\n                    \"ec2:DescribeInstances\",\n                    \"cloudwatch:PutMetricAlarm\"\n                ],\n                \"Resource\": [\n                    \"*\"\n                ]\n            }\n        ]\n    }\n* You have a ``planb_autorecovery`` section in your AWS credentials file\n  (``~/.aws/credentials``) with the access key of the auto-recovery user::\n\n    [planb_autorecovery]\n    aws_access_key_id = THEKEYID\n    aws_secret_access_key = THESECRETKEY\n\n  These credentials are only used to create the auto-recovery alarm.  When\n  triggered by the failing system status check, the recovery action is\n  performed by this dedicated user.\n\n  .. note::\n\n     The access keys for the auto-recovery user can be rotated or made\n     inactive at any time, without impacting its ability to perform the\n     recovery action.  The user still needs to be there, however.\n\n\nUsage\n=====\n\nCreate a new cluster\n--------------------\n\nTo create a cluster named \"mycluster\" in two regions with 3 nodes per region\n(the default size, enough for testing):\n\n.. 
code-block:: bash\n\n    $ zaws login  # get temporary AWS credentials\n    $ ./planb.py create --cluster-name mycluster --use-dmz eu-west-1 eu-central-1\n\nThe above example requires Elastic IPs to be allocated in every region (this\nmight require increasing the AWS limits for Elastic IPs).\n\nTo create a cluster in a single region, using private IPs only, see\nthe following example:\n\n.. code-block:: bash\n\n    $ ./planb.py create --cluster-name mycluster eu-central-1\n\nIt is possible to use Public IPs even with a single region, for\nexample, if your application(s) connect from different VPC(s).  This\nis currently **not recommended**, though, as there is no provision for\nclient-to-server encryption.\n\nAvailable options are:\n\n===========================  ============================================================================\n--cluster-name               Not actually an option, you must specify the name of a cluster to create\n--cluster-size               Number of nodes to create per AWS region.  Default: 3\n--dc-suffix                  Optional \"DC suffix\".\n--num-tokens                 Number of virtual nodes per node.  Default: 256\n--instance-type              AWS EC2 instance type to use for the nodes.  Default: t2.medium\n--volume-type                Type of EBS data volume to create for every node.  Default: gp2 (General Purpose SSD).\n--volume-size                Size of EBS data volume in GB for every node.  Default: 16\n--volume-iops                Number of provisioned IOPS for the volumes, used only for volume type of io1.  Default: 100 (when applicable).\n--no-termination-protection  Don't protect EC2 instances from accidental termination.  Useful for testing and development.\n--use-dmz                    Deploy the cluster into DMZ subnets using Public IPs (required for multi-region setup).\n--hosted-zone                Specify this to create SRV records for every region, listing all nodes' private IP addresses in that region.  
This is optional.\n--scalyr-key                 API Key for writing logs to Scalyr (optional).\n--scalyr-region              Scalyr account region, such as 'eu' (optional).\n--artifact-name              Override Pierone artifact name.  Default: planb-cassandra-3.0\n--docker-image               Override default Docker image.\n--environment, -e            Extend/override environment section of Taupage user data.\n--sns-topic                  Amazon SNS topic name to use for notifications about Auto-Recovery.\n--sns-email                  Email address to subscribe to Amazon SNS notification topic.  See below for details.\n===========================  ============================================================================\n\nIn order to be able to receive notification emails in case instance\nrecovery is triggered, provide either SNS topic name in\n``--sns-topic``, or email to subscribe in ``--sns-email`` (or both).\n\nIf only the email address is specified, then SNS topic name defaults\nto ``planb-cassandra-system-event``.  An SNS topic will be created (if\nit doesn't exist) in each of the specified regions.  If email is\nspecified, then it will be subscribed to the topic.\n\nIf you use the Hosted Zone parameter, a full name specification is\nrequired e.g.: ``--hosted-zone myzone.example.com.`` (note the\ntrailing dot.)\n\nAfter the create command finishes successfully, follow the on-screen\ninstructions to create the admin superuser, set replication factors for\nsystem_auth keyspace and then create your application user and the data\nkeyspace.\n\nThe generated administrator password is available inside the docker\ncontainer in an environment variable ``ADMIN_PASSWORD``.\n\nThe list of private IP contact points for the application can be\nobtained with the following snippet:\n\n.. 
code-block:: bash\n\n    $ aws ec2 describe-instances --region $REGION --filter 'Name=tag:Name,Values=planb-cassandra' | grep PrivateIp | sed 's/[^0-9.]//g' | sort -u\n\nUpdate of a cluster\n-------------------\n\n.. important::\n\n   The Jolokia port 8778 should be accessible from the Odd host. Ensure the\n   ingress rule for your cluster's security group allows connections from the Odd\n   host.\n\nTo update the Docker image or AMI you should ensure that you are logged in to\nyour account and have SSH access to your Odd host. The following commands will\nallow you to update the Docker image on all nodes of the cluster ``mycluster``.\nIf an action is interrupted, the next call will resume with the last action on\nthe last used node.\n\n.. code-block:: bash\n\n    $ zaws re $ACCOUNT  # for longer updates run `zaws login -r` in background\n    $ piu re -O $ODDHOST $ODDHOST  # for longer updates add `-t 180` or bigger\n    $ ./planb.py update \\\n        --region eu-central-1 \\\n        --odd-host $ODDHOST \\\n        --cluster-name mycluster \\\n        --docker-image registry.opensource.zalan.do/stups/planb-cassandra-3.0:cd-69 \\\n        --sns-topic planb-cassandra-system-event \\\n        --sns-email test@example.com\n\nAvailable options for update:\n\n===================  ========================================================\n--region             The region where the update should be applied (required)\n--odd-host           The Odd host in the region of your VPC (required)\n--cluster-name       The name of your cluster (required)\n--filters            Additional AWS resource filters (in JSON format)\n--force-termination  Disable termination protection for the duration of update\n--no-prompt          Don't prompt before updating every node.\n--docker-image       The full specified name of the Docker image\n--taupage-ami-id     The full specified name of the AMI\n--instance-type      The type of instance to deploy each node on (e.g. 
t2.medium)\n--scalyr-key         API Key for writing logs to Scalyr (optional).\n--scalyr-region      Scalyr account region, such as 'eu' (optional).\n--environment, -e    Extend/override environment section of Taupage user data.\n--sns-topic          Amazon SNS topic name to use for notifications about Auto-Recovery.\n--sns-email          Email address to subscribe to Amazon SNS notification topic.  See description of ``create`` subcommand above for details.\n===================  ========================================================\n\nThe cluster name parameter is used to list all EC2 instances in the region\nwith the matching ``Name`` tag.  This parameter may contain wildcards (``*``).\nFor example, if you have multiple virtual data centers in a cluster, this\nallows you to update all nodes of all DCs by running only one command.\n\nAny additional resource filters supported by AWS may be provided (only JSON\nformat is supported, though).  For example, to limit the update operation to a\nspecific Availability Zone, add the following parameter: ``--filters\n'[{\"Name\":\"availability-zone\",\"Values\":[\"eu-central-1c\"]}]'``.\n\nBy default, ``update`` is an interactive command which operates on one node at a time.\nIt will prompt before starting the update of each node.  It starts by draining the\ntarget node and then terminates the EC2 instance that is running it.  Then a new\nEC2 instance is created with the same private and public IP addresses (if any),\nand a potentially different configuration as specified by the options.  The new\ninstance is expected to attach the EBS volume that was previously utilized by the\nnode.  This keeps all the node's data and identification within the cluster intact.\n\nThe command will wait for the replacement node to be back UP.  
You should still\nmonitor the status of the cluster to verify that all other nodes also see the new\nnode as UP before proceeding.\n\nIf you're confident enough in using this command, you may opt in for \"fire and\nforget\" behavior, by specifying the ``--no-prompt`` flag.\n\nWhile performing the update, which destroys the running EC2 instance and creates a\nblank one, the command keeps the current state in the tags of the EBS data volume.\n\nIf interrupted by an unexpected problem, the command resumes the update sequence\nby using the information in the EBS volume tags.  This relies, however, on the assumption\nthat the command is run again with essentially the same parameters on the same machine,\nsince some of the state is stored in a temporary file, named after the EBS volume id.\n\nIf the command enters the ``failed`` state, as a safety precaution it will not try to proceed\nfurther, even if started again.  The operator is then responsible for analysing the\nfailure reason and removing the failed state tag from the related EBS volume before\nstarting the command again.  One common source of the failed state is forgetting to use\nthe ``--force-termination`` flag on a cluster which was deployed with termination protection\nenabled.\n\nNo provisions are made by the command to detect if a concurrent update operation is\nin progress for a given cluster.  It makes sense to ensure that only one operator is\nusing the command as part of routine maintenance at any given time.\n\nExtend an existing cluster\n--------------------------\n\nThere are a number of scenarios that require extending an existing cluster.  
The\npossible use-cases are:\n\n* Add a new \"virtual data center\"\n* Add a new region\n* Add more nodes to existing data center\n\nAvailable options for extend:\n\n==============================  ============================================================================\n--from-region                   Name of AWS region where a cluster is already running.\n--to-region                     Name of AWS region where a new data center should be created.  This can be the same as \"from region\"; in this case a virtual data center is created.\n--cluster-name                  The name of a cluster to extend.\n--ring-size                     Number of nodes to create in the new data center.\n--dc-suffix                     Optional \"DC suffix\".  When creating a virtual data center be sure to specify a new suffix for each virtual data center you create!\n--num-tokens                    Number of virtual nodes per node.  Default: 256\n--allocate-tokens-for-keyspace  Use new token allocation algorithm, available starting with version 3.0.\n--instance-type                 AWS EC2 instance type to use for the nodes.  Default: t2.medium\n--volume-type                   Type of EBS data volume to create for every node.  Default: gp2 (General Purpose SSD).\n--volume-size                   Size of EBS data volume in GB for every node.  Default: 16\n--volume-iops                   Number of provisioned IOPS for the volumes, used only for volume type of io1.  Default: 100 (when applicable).\n--no-termination-protection     Don't protect EC2 instances from accidental termination.  Useful for testing and development.\n--use-dmz                       Deploy the new data center into DMZ subnets using Public IPs (required for multi-region setup).\n--hosted-zone                   Specify this to create the SRV record for the new data center.  This is optional.\n--artifact-name                 Override Pierone artifact name.  
Default: planb-cassandra-3.0\n--docker-image                  Override default Docker image.\n--environment, -e               Extend/override environment section of Taupage user data.\n--sns-topic                     Amazon SNS topic name to use for notifications about Auto-Recovery.\n--sns-email                     Email address to subscribe to Amazon SNS notification topic.  See description of ``create`` subcommand above for details.\n==============================  ============================================================================\n\n-------------------------------\nAdd a new \"virtual data center\"\n-------------------------------\n\nTo add a new virtual data center in the same region where your existing\ncluster is running run the extend command like this:\n\n.. code-block:: bash\n\n    $ planb.py extend \\\n        --from-region eu-central-1 \\\n        --to-region eu-central-1 \\\n        --cluster-name mycluster \\\n        --ring-size 3 \\\n        --dc-suffix _new \\\n        --hosted-zone myzone.example.com.\n\n.. important::\n\n   In this mode the new nodes are created with ``auto_bootstrap: false``.\n   When creating a new virtual data center in the same region, you **must**\n   specify the DC suffix which doesn't exist in the region yet!  Otherwise you\n   risk adding a number of empty nodes to the cluster, which will be serving\n   read requests and your client applications will suffer from apparent data\n   loss.\n\nAfter the command has run successfully, you need to login to each of the nodes\nin the new data center and run ``nodetool rebuild $existing_dc_name``.\n\nOn version 3.0 or later it is possible to request use of the new token\nallocation algorithm.  For that, start by including the to-be-deployed virtual\nDC in the replication settings of the data keyspace, by running a CQL\nstatement like the following one on one of the existing cluster nodes:\n\n.. 
code-block::\n\n   cqlsh\u003e ALTER KEYSPACE mydata WITH replication = {\n       'class': 'NetworkTopologyStrategy',\n       'eu-central': 3,\n       'eu-central_new': 3\n   };\n\nThen run the extend command, specifying the\n``--allocate-tokens-for-keyspace=mydata`` as one of the options.\n\nWith the new token allocation algorithm it makes sense to use a much smaller\nnumber of tokens than the default 256.  E.g. 16 tokens are generally enough to\nachieve balanced ownership distribution.  Use the ``--num-tokens`` option to\nset the desired number of tokens per node.\n\n.. important::\n\n   In order for the token allocation algorithm to be actually used, the\n   ``auto_bootstrap`` parameter has to be set to ``true``.  This is done\n   automatically by the deployment script.  Due to this, before you can run\n   ``nodetool rebuild`` command on the nodes of the newly deployed ring, you\n   have to run manually the following CQL command on every new node:\n   ``TRUNCATE system.available_ranges``.\n\n----------------\nAdd a new region\n----------------\n\nTo extend a cluster to a new AWS region, run the command like this:\n\n.. code-block:: bash\n\n    $ planb.py extend \\\n        --from-region eu-central-1 \\\n        --to-region eu-west-1 \\\n        --cluster-name mycluster \\\n        --ring-size 3 \\\n        --use-dmz \\\n        --hosted-zone myzone.example.com.\n\nThe DC suffix is optional in this case, unless you already have a cluster with\nthis name in the target region.  You must specify the DMZ option, and the\nexisting cluster must already be running in the DMZ: otherwise the new and\nexisting nodes will not be able to communicate with each other.\n\n--------------------------------------\nAdd more nodes to existing data center\n--------------------------------------\n\nThis is currently unsupported, due to the use of `auto_bootstrap: false` when\ncreating new nodes.  
In general, it should be possible to override this option\nand add the nodes one by one to the existing data center, but care should be\ntaken while doing so.\n\nRunning commands remotely on Cassandra nodes\n============================================\n\nThere is a command group called ``remote`` that allows you to run arbitrary\nshell commands on all nodes of a given Cassandra cluster.  This can be useful\nwhen applying a configuration change, e.g. setting compaction throughput:\n\n.. code-block:: bash\n\n    $ planb.py remote \\\n        --region eu-west-1 \\\n        --odd-host $ODDHOST \\\n        --cluster-name mycluster \\\n        --piu \"setting cassandra compaction throughput\" \\\n        nodetool \\\n        -- \\\n        setcompactionthroughput 50\n\nThe following options are available for the ``remote`` command:\n\n==============  ==================================================\n--region        AWS region.\n--odd-host      Odd host name for the first SSH hop.\n--cluster-name  The name of the cluster (Name tag on the EC2 instances).\n--filters       Additional AWS resource filters (in JSON format)\n--piu           Run ``piu`` first with this parameter as reason.\n--echo          Print the command before running it.\n--no-prompt     Don't prompt before running the command.\n--no-wait       Don't wait for the command to exit.\n--ip-label      Label all output from the node with its IP address.\n--help          Show this message and exit.\n==============  ==================================================\n\nThere are 3 subcommands in the ``remote`` command group:\n\n========  ==============================\nshell     Run an arbitrary shell command.\nnodetool  Run a nodetool command.\ncqlsh     Run an administrative CQL shell command.\n========  ==============================\n\nThe most basic is ``shell``, which allows you to run any command on the server.\nTwo shorthand commands for running ``nodetool`` and ``cqlsh -u admin -p\n$ADMIN_PASSWORD`` are also 
provided.\n\nClient configuration for Public IPs setup\n=========================================\n\nWhen configuring your client application to talk to a Cassandra\ncluster deployed in AWS using Public IPs, be sure to enable address\ntranslation using EC2MultiRegionAddressTranslator_.  Not only does it save\ncosts when communicating within a single AWS region, it also prevents\navailability problems when the security group for your Cassandra cluster is not\nconfigured to allow client access on Public IPs (via the region's NAT\ninstances' addresses).\n\nEven if your client connects to the ring using Private IPs, the list\nof peers it gets from the first Cassandra node to be contacted only\nconsists of Public IPs in such a setup.  Should that node go down at a\nlater time, the client has no chance of reconnecting to a different\nnode if the client traffic on Public IPs is not allowed.  For the same\nreason the client won't be able to distribute load efficiently, as it\nwill have to choose the same coordinator node for every request it\nsends (namely, the one it first contacted via the Private IP).\n\n\nTroubleshooting\n===============\n\nTo watch the cluster's node status (e.g. joining during initial bootstrap):\n\n.. 
code-block:: bash\n\n    $ # on Taupage instance\n    $ watch docker exec -it taupageapp nodetool status\n\nThe output should look something like this (freshly bootstrapped cluster):\n\n::\n\n    Datacenter: eu-central\n    ======================\n    Status=Up/Down\n    |/ State=Normal/Leaving/Joining/Moving\n    --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack\n    UN  52.29.137.93   66.59 KB   256     34.8%             62f50c2c-cb0f-4f62-a518-aa7b1fd04377  1a\n    UN  52.28.11.187   66.43 KB   256     31.1%             69d698a9-7357-46b2-93b8-6c038155f0c1  1b\n    UN  52.29.41.128   71.79 KB   256     35.0%             b76e7ed7-78de-4bbc-9742-13adbbcfd438  1a\n    Datacenter: eu-west\n    ===================\n    Status=Up/Down\n    |/ State=Normal/Leaving/Joining/Moving\n    --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack\n    UN  52.49.209.129  91.29 KB   256     34.8%             140bc7de-9973-46fd-af8c-68148bf20524  1b\n    UN  52.49.192.149  81.16 KB   256     32.1%             cb45fc4c-291d-4b2b-b50f-3a11048f0211  1c\n    UN  52.49.128.58   81.22 KB   256     32.1%             8a270de3-b419-4baf-8449-f4bc65c51d0d  1a\n\n\nScaling up instance\n===================\n\nThe following manual process may be applied whenever there is a need\nto scale up EC2 instances or update Taupage AMI.\n\nFor every node in the cluster, one by one:\n\n#. Stop a node (``nodetool drain; nodetool stopdaemon``).\n#. Terminate EC2 instance, **take note of its IP address(es)**.  Simply stopping will not work as the private IP will be still occupied by the stopped instance.\n#. Use the 'Launch More Like This' menu in AWS web console on one of the remaining nodes.\n#. **Use the latest available Taupage AMI version.  Older versions are subject to data loss race conditions when attaching EBS volumes.**\n#. 
Be sure to reuse the private IP of the node you just terminated on the new node.\n#. In the 'Instance Details' section, edit 'User Data' to add ``erase_on_boot: false`` flag under ``mounts: /var/lib/cassandra``.  See documentation of Taupage_ for detailed description and syntax example.  The docker image version being used can also be updated in this section, however, it is recommended to avoid changing multiple things at a time.  Also, docker image can be updated without terminating the instance, by stopping and starting it with updated 'User Data' instead.\n#. While the new instance is spinning up, attach the (now detached) data volume to the new instance.  Use ``/dev/sdf`` as the device name.\n#. Log in to node, check application logs, if it didn't start up correctly: ``docker restart taupageapp``.\n#. Repair the node with ``nodetool repair`` (optional: if the node was down for less than ``max_hint_window_in_ms``, which is by default 3 hours, hinted hand off should take care of streaming the changes from alive nodes).\n#. Check status with ``nodetool status``.\n\nProceed with other nodes as long as the current one is back and\neverything looks OK from nodetool and application points of view.\n\n\nScaling out cluster\n===================\n\nIt is possible to manually scale out already deployed cluster by\nfollowing these steps:\n\n#. Increase replication factor of ``system_auth`` keyspace (if needed)\n   in every region affected.  Don't set RFs to be more than 5 per region\n   or virtual DC.\n\n   For example, if you run in two regions and want to scale to 5 nodes\n   per region, issue the following CQL command on any of the nodes:\n\n   ``ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'eu-central': 5, 'eu-west': 5};``\n\n#. 
*For public IPs setup only:* pre-allocate Elastic IPs for the new\n   nodes in every region, then update security groups in every region\n   to include all newly allocated Elastic IP addresses.\n\n   For example, if scaling from 3 to 5 nodes in two regions you will\n   need 2 new IP addresses in every region and both security groups\n   need to be updated to include a total of 4 new addresses.\n\n#. Choose a private IP for the new instance, that is not already taken by any\n   other EC2 instance in the VPC.  You will need it on further steps.\n\n#. Create a new EBS volume of appropriate type and size (normally you want to\n   have the same settings as for the rest of the cluster).  EBS encryption is\n   not recommended as it might prevent auto-recovery.\n\n#. Create a ``Name`` tag on the volume in the format:\n   ``\u003ccluster-name\u003e-\u003cprivate-ip\u003e``.\n\n#. Create an additional tag on the newly created **empty EBS volume:**\n   the tag name should be ``Taupage:erase-on-boot`` and the value ``True``.\n\n#. Use the 'Launch More Like This' menu in the AWS web console on one\n   of the running nodes.\n\n#. Choose appropriate subnet for the new node: ``internal-...``\n   vs. ``dmz-...`` for public IPs setup.  The subnet need to match your\n   private IP, which should also be assigned manually on the same page.\n\n#. Make sure that under 'Instance Details' the setting 'Auto-assign\n   Public IP' is set to 'Disable'.\n\n#. **Review UserData.** Make sure that ``AUTO_BOOTSTRAP`` environment variable\n   is set to ``true`` or not present.  Update the referenced EBS volume to:\n   ``\u003ccluster-name\u003e-\u003cprivate-ip\u003e``\n\n#. Launch the instance.\n\n#. *For public IPs setup:* while the instance is starting up,\n   associate one of the pre-allocated Elastic IP addresses with it.\n\n   **Caution!** For multi-region setup the nodes are started in DMZ\n   subnet and thus don't have internet traffic before you give them a\n   public IP.  
Be sure to do this before anything else, or the new\n   node won't be able to ship its logs and you won't be able to ssh\n   into it (restarting the node should help if it was too late).\n\n#. Monitor the logs of the new instance and ``nodetool status`` to\n   track its progress in joining the ring.\n\n#. Use the 'CloudWatch Monitoring' \u003e 'Add/Edit Alarms' to add an\n   auto-recovery alarm for the new instance.\n\n   Check '[x] Take the action: [*] Recover this instance' and leave\n   the rest of the parameters at their default values.  It is also\n   recommended to set up a notification SNS topic for actual recovery\n   events.\n\nOnly when the new node has fully joined, proceed to add more nodes.\nAfter all new nodes have joined, issue the ``nodetool cleanup`` command on\nevery node in order to free up the space that is still occupied by the\ndata that the node is no longer responsible for.\n\n.. _STUPS: https://stups.io/\n.. _Odd: http://docs.stups.io/en/latest/components/odd.html\n.. _Taupage: http://docs.stups.io/en/latest/components/taupage.html\n.. _Ec2MultiRegionSnitch: http://docs.datastax.com/en/cassandra/2.1/cassandra/architecture/architectureSnitchEC2MultiRegion_c.html\n.. _EC2MultiRegionAddressTranslator: https://datastax.github.io/java-driver/manual/address_resolution/#ec2-multi-region\n.. _EC2 Auto Recovery: https://aws.amazon.com/blogs/aws/new-auto-recovery-for-amazon-ec2/\n.. _Jolokia: https://jolokia.org/\n.. _STUPS Cassandra: https://github.com/zalando/stups-cassandra\n.. _Più: http://docs.stups.io/en/latest/components/piu.html\n\nUpgrade your cluster from Cassandra 2.1 -\u003e 3.0.x\n================================================\n\nTo upgrade your cluster, follow the steps below. Keep in mind that this process is a rolling update: the changes are applied to each node in your cluster, one by one.\nAfter upgrading the last node in your cluster you are done.\n\n**Disclaimer: Before you actually start, you should:**\n  1. 
Read the [Datastax guide](https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdCassandraDetails.html) and consider the upgrade restrictions.\n  2. Check whether your client application's driver actually supports V4 of the CQL protocol.\n\n\n1. Check for the latest Plan-B Cassandra image version: \n  `curl https://registry.opensource.zalan.do/teams/stups/artifacts/planb-cassandra-3.0/tags | jq '.[-1].name'`\n2. Connect to the instance where you want to run the upgrade and enter your docker container. \n3. Run `nodetool upgradesstables` and `nodetool drain`. The latter command will flush the memtables and speed up the upgrade process later on. *This command is mandatory and cannot be skipped.*\n   Excerpt from the manual: `Cassandra stops listening for connections from the client and other nodes. You need to restart Cassandra after running nodetool drain.`\n4. Remove the docker container by running `docker rm -f taupageapp` on the host.\n5. If you are running Cassandra with the old folder structure, where the data is located directly in __/mounts/var/lib/cassandra/__, do the following. **If not, go on with step 6.** \n  1. Move all keyspaces to __/mounts/var/lib/cassandra/data/data__\n  2. Move the folder commit_logs to __/mounts/var/lib/cassandra/data/commitlog__ \n  3. Move the folder saved_caches to __/mounts/var/lib/cassandra/data/__\n  4. 
Set owner of data folders to application\n    Example:\n    ```\n    **Before Move**\n\n    /mounts/var/lib/cassandra$ ls\n    commit_logs  keyspace_1 saved_caches  system_auth  system_traces \n\n\n    **After Move**\n\n    /mounts/var/lib/cassandra$ ls -la\n    total 28\n    drwxrwxrwx 4 application application  4096 Oct 10 12:21 .\n    drwxr-xr-x 3 root        root         4096 Aug 25 13:27 ..\n    drwxrwxr-x 5 application mpickhan     4096 Oct 10 12:21 data\n\n    /mounts/var/lib/cassandra$ ls -la data/\n    total 36\n    drwxrwxr-x 5 application mpickhan     4096 Oct 10 12:21 .\n    drwxrwxrwx 4 application application  4096 Oct 10 12:21 ..\n    drwxr-xr-x 2 application root        20480 Oct 10 12:15 commitlog\n    drwxrwxr-x 9 application mpickhan     4096 Oct 10 12:19 data\n    drwxr-xr-x 2 application root         4096 Oct 10 10:52 saved_caches\n\n    /mounts/var/lib/cassandra$ ls -la data/data/\n    total 36\n    drwxrwxr-x  9 application mpickhan 4096 Oct 10 12:19 .\n    drwxrwxr-x  5 application mpickhan 4096 Oct 10 12:21 ..\n    drwxr-xr-x 10 application root     4096 Aug 25 14:29 keyspace_1\n    drwxr-xr-x 19 application root     4096 Aug 25 13:27 system\n    drwxr-xr-x  5 application root     4096 Aug 25 13:27 system_auth\n    drwxr-xr-x  4 application root     4096 Aug 25 13:27 system_traces\n    ```\n6. **Stop** the ec2-Instance and change the user details `Go to Actions -\u003e Instance Settings -\u003e View/Change User Details` Change the \"source\" entry to the version you want to upgrade to:\n    **Important:** Use the stop command and __not__ terminate.\n    ```\n    Example:\n\n    From: \"source: registry.opensource.zalan.do/stups/planb-cassandra:cd89\"\n    To: \"source: registry.opensource.zalan.do/stups/planb-cassandra-3.0:cd105\"\n    ```\n7. Start the instance and connect to it. At this point your node should be working and serving reads and writes. 
Login to the docker container and finish the upgrade by running `nodetool upgradesstables`.\n   Check the logs for errors and warnings. (__Note:__ For the size of ~12GB SSTables it takes approximately one hour to convert them to the new format.)\n8. Proceed with each node in your cluster.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzalando-stups%2Fplanb-cassandra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzalando-stups%2Fplanb-cassandra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzalando-stups%2Fplanb-cassandra/lists"}