{"id":15056722,"url":"https://github.com/spotify/cstar","last_synced_at":"2025-04-04T18:08:25.439Z","repository":{"id":32944351,"uuid":"142997020","full_name":"spotify/cstar","owner":"spotify","description":"Apache Cassandra cluster orchestration tool for the command line","archived":false,"fork":false,"pushed_at":"2024-01-04T17:38:31.000Z","size":332,"stargazers_count":256,"open_issues_count":9,"forks_count":37,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-03-28T05:11:08.575Z","etag":null,"topics":["cassandra","orchestration","python"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/spotify.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-31T10:07:26.000Z","updated_at":"2025-02-28T18:11:14.000Z","dependencies_parsed_at":"2022-08-07T18:30:23.738Z","dependency_job_id":"8cfda85d-da58-4a7b-810b-7a38695091f7","html_url":"https://github.com/spotify/cstar","commit_stats":{"total_commits":74,"total_committers":24,"mean_commits":"3.0833333333333335","dds":0.6486486486486487,"last_synced_commit":"af21a6043b9efaa12e5691975db27db9646da725"},"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fcstar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fcstar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fcstar/releases","manifests_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/spotify%2Fcstar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/spotify","download_url":"https://codeload.github.com/spotify/cstar/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247226215,"owners_count":20904465,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra","orchestration","python"],"created_at":"2024-09-24T21:55:49.035Z","updated_at":"2025-04-04T18:08:25.402Z","avatar_url":"https://github.com/spotify.png","language":"Python","funding_links":[],"categories":["Python","Packages"],"sub_categories":["Open Source Applications"],"readme":"# cstar\n\n[![CircleCI](https://circleci.com/gh/spotify/cstar/tree/master.svg?style=shield)](https://circleci.com/gh/spotify/cstar)\n[![License](https://img.shields.io/github/license/spotify/cstar.svg)](LICENSE)\n\n`cstar` is an Apache Cassandra cluster orchestration tool for the command line.\n\n[![asciicast](https://asciinema.org/a/BJkHpAGCdkSXTAhYf7bPVmerz.png)](https://asciinema.org/a/BJkHpAGCdkSXTAhYf7bPVmerz?autoplay=1)\n\n## Why not simply use Ansible or Fabric?\n\nAnsible does not have the primitives required to run things in a topology aware fashion. 
One could\nsplit the C* cluster into groups that can be safely executed in parallel and run one group at a time.\nBut unless the job takes almost exactly the same amount of time to run on every host, such a solution\nwould run with a significantly lower rate of parallelism, not to mention it would be kludgy enough to\nbe unpleasant to work with.\n\nUnfortunately, Fabric is not thread safe, so the same type of limitations apply. Fabric allows one to\nrun a job in parallel on many machines, but with similar restrictions as those of Ansible groups.\nIt’s possible to use Fabric and Celery together to do what is needed, but it’s a very complicated\nsolution.\n\n## Requirements\n\nAll involved machines are assumed to be some sort of UNIX-like system like OS X or Linux. The machine\nrunning cstar must have python3, and the Cassandra hosts must have a Bourne style shell.\n\n## Installing\n\nYou need to have Python3 and run an updated version of pip (9.0.1).\n\n    # pip3 install cstar\n\nIt's also possible to install straight from the repo. This installs the latest version, which may not have been pushed to pypi:\n\n    # pip install git+https://github.com/spotify/cstar.git\n\n\n\n## Code of conduct\n\nThis project adheres to the\n[Open Code of Conduct](https://github.com/spotify/code-of-conduct/blob/master/code-of-conduct.md).\nBy participating, you are expected to honor this code.\n\n## CLI\n\nCStar is run through the cstar command, like so\n\n    # cstar COMMAND [HOST-SPEC] [PARAMETERS]\n\nThe HOST-SPEC specifies what nodes to run the script on. There are three ways to specify the spec:\n\n1. The `--seed-host` switch tells cstar to connect to a specific host and fetch the full ring topology\n   from there, and then run the script on all nodes in the cluster. `--seed-host` can be specified\n   multiple times, and multiple hosts can be specified as a comma-separated list in order to run a\n   script across multiple clusters.\n2. 
The `--host` switch specifies an exact list of hosts to use. `--host` can be specified multiple\n   times, and multiple hosts can be specified as a comma-separated list.\n3. The `--host-file` switch points to a file containing a newline separated list of hosts. This\n   can be used together with process substitution, e.g. `--host-file \u003c(dig -t srv ...)`\n\nThe command is the name of a script located in either `/usr/lib/cstar/commands` or in\n`~/.cstar/commands`. This script will be uploaded to all nodes in the cluster and executed. File suffixes\nare stripped. The requirements of the script are described below. Cstar comes pre-packaged with one script file\ncalled ``run`` which takes a single parameter ``--command`` - see examples below.\n\nSome additional switches to control cstar:\n\n* One can override the parallelism specified in a script by setting the switches\n  `--cluster-parallel`, `--dc-parallel` and `--strategy`.\n\nThere are two special case invocations:\n\n* One can skip the script name and instead use the `continue` command to specify a previously halted job\n  to resume.\n\n* One can skip the script name and instead use the `cleanup-jobs` command. See [Cleaning up old jobs](#Cleaning-up-old-jobs).\n\n* If you need to access the remote cluster with a specific username, add `--ssh-username=remote_username` to your cstar command line. A private key file can also be specified using `--ssh-identity-file=my_key_file.pem`.\n\n* To use plain text authentication, please add `--ssh-password=my_password` to the command line.\n\n* In order to run the command first on a single node and then stop execution to verify everything worked as expected, add the following flag to your command line: `--stop-after=1`. 
cstar will stop after the first node has executed the command and print out the appropriate resume command to continue the execution when ready: `cstar continue \u003cJOB_ID\u003e`\n\nA script file can specify additional parameters.\n\n## Command syntax\n\nIn order to run a command, it is first uploaded to the relevant host, and then executed from there.\n\nCommands can be written in any scripting language in which the hash symbol starts a line comment, e.g.\nshell-script, python, perl or ruby.\n\nThe first line must be a valid shebang. After that, commented lines containing key value pairs may\nbe used to override how the script is parallelised as well as providing additional parameters for\nthe script, e.g. `# C* dc-parallel: true`\n\nThe possible keys are:\n\n`cluster-parallel`, can the script be run on multiple clusters in parallel. Default value is `true`.  \n\n`dc-parallel`, can the script be run on multiple data centers in the same cluster in parallel. Default value is `false`.\n\n`strategy`, how many nodes within one data center can the script be run on. Default is `topology`.\nCan be one of:\n\n* `one`, only one node per data center\n* `topology`, inspect topology and run on as many nodes as the topology allows\n* `all`, can be run on all nodes at once\n\n`description`, specifies a description for the script used in the help message.\n\n`argument`, specifies an additional input parameter for the script, as well as a help text and an\noptional default value.\n\n## Job output\n\nCstar automatically saves the job status to a file during operation.\n\nStandard output, standard error and exit status of each command run against a Cassandra host are\nsaved locally on the machine where cstar is running. They are available under the user's home directory in\n`.cstar/jobs/JOB_ID/HOSTNAME`\n\n## How jobs are run\n\nWhen a new cstar job is created, it is assigned an id. 
(It's a UUID)\n\nCstar stores intermediate job output in the directory\n`~/.cstar/remote_jobs/\u003cJOB_ID\u003e`. This directory contains files with the stdout, stderr and PID of the\nscript, and once it finishes, it will also contain a file with the exit status of the script.\n\nOnce the job finishes, these files will be moved over to the original host and put in the directory `~/.cstar/jobs/\u003cJOB_ID\u003e/\u003cREMOTE_HOST_NAME\u003e`.\n\nCstar jobs are run nohuped; this means that even if the ssh connection is severed, the job will proceed.\nIn order to kill a cstar script invocation on a specific host, you will need to ssh to the host and kill\nthe process.\n\nIf a job is halted half-way, either by pressing `^C` or by using the `--stop-after` parameter, it can be\nrestarted using `cstar continue \u003cJOB_ID\u003e`. If the script was finished or already running when cstar\nshut down, it will not be rerun.\n\n## Cleaning up old jobs\n\nEven on successful completion, the output of a cstar job is not deleted. This means it's easy to check\nwhat the output of a script was after it completed. The downside of this is that you can get a lot of\ndata lying around in `~/.cstar/jobs`. In order to clean things up, you can use\n`cstar cleanup-jobs`. By default it will remove all jobs older than one week. You can override the\nmaximum age of a job before it's deleted by using the `--max-job-age` parameter.\n\n## Examples\n\n    # cstar run --command='service cassandra restart' --seed-host some-host\n\nExplanation: Run the local cli command ``service cassandra restart`` on a cluster. If necessary, add ``sudo`` to the\ncommand.\n\n    # cstar puppet-upgrade-cassandra --seed-host some-host --puppet-branch=cass-2.2-upgrade\n\nExplanation: Run the command puppet-upgrade-cassandra on a cluster. The puppet-upgrade-cassandra\ncommand expects a parameter, the puppet branch to run in order to perform the Cassandra upgrade. 
See the\npuppet-upgrade-cassandra example [below](#Example-script-file).\n\n    # cstar puppet-upgrade-cassandra --help\n\nExplanation: Show help for the puppet-upgrade-cassandra command. This includes documentation for any\nadditional command-specific switches for the puppet-upgrade-cassandra command.\n\n    # cstar continue 90642c11-4714-44c4-a13a-94b86f09e3bb\n\nExplanation: Resume the previously created job with job id 90642c11-4714-44c4-a13a-94b86f09e3bb.\nThe job id is the first line written on any executed job.\n\n## Example script file\n\nThis is an example script file that would be saved to `~/.cstar/commands/puppet-upgrade-cassandra.sh`. It upgrades a\nCassandra cluster by running puppet on a different branch, then restarting the node, then upgrading the sstables.\n\n    #!/usr/bin/env bash\n    # C* cluster-parallel: true\n    # C* dc-parallel: true\n    # C* strategy: topology\n    # C* description: Upgrade one or more clusters by switching to a different puppet branch\n    # C* argument: {\"option\":\"--snapshot-name\", \"name\":\"SNAPSHOT_NAME\", \"description\":\"Name of pre-upgrade snapshot\", \"default\":\"preupgrade\"}\n    # C* argument: {\"option\":\"--puppet-branch\", \"name\":\"PUPPET_BRANCH\", \"description\":\"Name of 
puppet branch to switch to\", \"required\":true}                                                                       \n\n    nodetool snapshot -t $SNAPSHOT_NAME\n    sudo puppet --branch $PUPPET_BRANCH\n    sudo service cassandra restart\n    nodetool upgradesstables\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fcstar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fspotify%2Fcstar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fspotify%2Fcstar/lists"}