{"id":15056795,"url":"https://github.com/kayvansol/cassandra","last_synced_at":"2025-10-11T14:33:11.050Z","repository":{"id":238906509,"uuid":"797930605","full_name":"kayvansol/Cassandra","owner":"kayvansol","description":"Apache Cassandra Cluster with docker compose","archived":false,"fork":false,"pushed_at":"2024-05-17T21:02:42.000Z","size":1273,"stargazers_count":10,"open_issues_count":0,"forks_count":4,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-10-11T14:32:39.192Z","etag":null,"topics":["apache-cassandra","cassandra-cluster","cql","cqlsh","docker","docker-compose","gossip","nodetool"],"latest_commit_sha":null,"homepage":"https://medium.com/@kayvan.sol2/deploying-apache-cassandra-cluster-3-nodes-with-docker-compose-3634ef8345e8","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kayvansol.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-08T18:56:36.000Z","updated_at":"2025-09-22T10:55:12.000Z","dependencies_parsed_at":"2024-05-15T12:41:37.057Z","dependency_job_id":null,"html_url":"https://github.com/kayvansol/Cassandra","commit_stats":null,"previous_names":["kayvansol/cassandra"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kayvansol/Cassandra","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kayvansol%2FCassandra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kayvansol%2FCassandra/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kayvansol%2FCassandra/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kayvansol%2FCassandra/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kayvansol","download_url":"https://codeload.github.com/kayvansol/Cassandra/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kayvansol%2FCassandra/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279007450,"owners_count":26084313,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-cassandra","cassandra-cluster","cql","cqlsh","docker","docker-compose","gossip","nodetool"],"created_at":"2024-09-24T21:56:32.535Z","updated_at":"2025-10-11T14:33:11.035Z","avatar_url":"https://github.com/kayvansol.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# Deploying Apache Cassandra Cluster (3 Nodes) With Docker Compose\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/cassandra.jpg?raw=true)\n\nPull the cassandra docker image :\n```bash\ndocker pull cassandra\n```\n\nSome Notices and roles about deployment :\n\n- Controlling the startup order of the nodes in the Compose file, such that Compose first makes sure that the seed node cassandra-1 is up and healthy, then starts cassandra-2 and also 
makes sure that the cassandra-2 node is up and healthy, then starts cassandra-3, and so on. In short, this prevents the nodes from all starting simultaneously, especially the nodes after the seed node. When nodes are started simultaneously with Compose, errors such as token range conflicts can occur, causing some of the nodes to fail to join the cluster.\n\n- Using a Snitch configuration that more closely resembles your production environment, which is usually where a multi-node, multi-cluster, or multi-datacenter setup becomes necessary. For example, you can use GossipingPropertyFileSnitch, which is the same Snitch type used in the Cassandra tutorial for initializing a multiple node cluster (multiple datacenters).\n\n- Explicitly setting the CASSANDRA_CLUSTER_NAME and CASSANDRA_DC environment variables, which respectively set cluster_name in the cassandra.yaml config and the dc option in the cassandra-rackdc.properties file. This allows you to explicitly tell the nodes to join the same datacenter and cluster.\n
These options are only relevant for GossipingPropertyFileSnitch.\n\n\ndocker compose file with network named cassandra-net :\n```yaml\nversion: \"3.3\"\n\nnetworks:\n  cassandra-net:\n    driver: bridge\n\nservices:\n\n  cassandra-1:\n    image: \"cassandra:latest\"  # cassandra:4.1.3\n    container_name: \"cassandra-1\"\n    ports:\n      - 7000:7000\n      - 9042:9042\n    networks:\n      - cassandra-net\n    environment:\n      - CASSANDRA_START_RPC=true       # default\n      - CASSANDRA_RPC_ADDRESS=0.0.0.0  # default\n      - CASSANDRA_LISTEN_ADDRESS=auto  # default, use IP addr of container # = CASSANDRA_BROADCAST_ADDRESS\n      - CASSANDRA_CLUSTER_NAME=my-cluster\n      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch\n      - CASSANDRA_DC=my-datacenter-1\n    volumes:\n      - cassandra-node-1:/var/lib/cassandra:rw\n    restart:\n      on-failure\n    healthcheck:\n      test: [\"CMD-SHELL\", \"nodetool status\"]\n      interval: 2m\n      start_period: 2m\n      timeout: 10s\n      retries: 3\n\n  cassandra-2:\n    image: \"cassandra:latest\"  # cassandra:4.1.3\n    container_name: \"cassandra-2\"\n    ports:\n      - 9043:9042\n    networks:\n      - cassandra-net\n    environment:\n      - CASSANDRA_START_RPC=true       # default\n      - CASSANDRA_RPC_ADDRESS=0.0.0.0  # default\n      - CASSANDRA_LISTEN_ADDRESS=auto  # default, use IP addr of container # = CASSANDRA_BROADCAST_ADDRESS\n      - CASSANDRA_CLUSTER_NAME=my-cluster\n      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch\n      - CASSANDRA_DC=my-datacenter-1\n      - CASSANDRA_SEEDS=cassandra-1\n    depends_on:\n      cassandra-1:\n        condition: service_healthy\n    volumes:\n      - cassandra-node-2:/var/lib/cassandra:rw\n    restart:\n      on-failure\n    healthcheck:\n      test: [\"CMD-SHELL\", \"nodetool status\"]\n      interval: 2m\n      start_period: 2m\n      timeout: 10s\n      retries: 3\n\n  cassandra-3:\n    image: \"cassandra:latest\"  # cassandra:4.1.3\n    
container_name: \"cassandra-3\"\n    ports:\n      - 9044:9042\n    networks:\n      - cassandra-net\n    environment:\n      - CASSANDRA_START_RPC=true       # default\n      - CASSANDRA_RPC_ADDRESS=0.0.0.0  # default\n      - CASSANDRA_LISTEN_ADDRESS=auto  # default, use IP addr of container # = CASSANDRA_BROADCAST_ADDRESS\n      - CASSANDRA_CLUSTER_NAME=my-cluster\n      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch\n      - CASSANDRA_DC=my-datacenter-1\n      - CASSANDRA_SEEDS=cassandra-1\n    depends_on:\n      cassandra-2:\n        condition: service_healthy\n    volumes:\n      - cassandra-node-3:/var/lib/cassandra:rw\n    restart:\n      on-failure\n    healthcheck:\n      test: [\"CMD-SHELL\", \"nodetool status\"]\n      interval: 2m\n      start_period: 2m\n      timeout: 10s\n      retries: 3\n\nvolumes:\n  cassandra-node-1:\n  cassandra-node-2:\n  cassandra-node-3:\n\n```\n\nNote : The main things here are the healthcheck blocks :\n```yaml\nhealthcheck:\n      test: [\"CMD-SHELL\", \"nodetool status\"]\n      interval: 2m\n      start_period: 2m\n      timeout: 10s\n      retries: 3\n```\n\nand the depends_on entry on each node after the seed :\n```yaml\ndepends_on:\n      cassandra-2:\n        condition: service_healthy\n```\n\nThe Compose file starts cassandra-3 only when cassandra-2 is healthy, and starts cassandra-2 only when cassandra-1 is healthy.\n\n\nIn that Compose file:\n\n- Run nodetool status 2 minutes after the container starts (to give the node time to boot up and bootstrap)\n- If it responds in \u003c10s and the exit code is 0, the node is considered healthy\n- Repeat the check every 2m; after 3 consecutive failures the node is marked unhealthy.\n\nThere are also some extra env vars in there :\n```yaml\nenvironment:\n      - CASSANDRA_START_RPC=true       # default\n      - CASSANDRA_RPC_ADDRESS=0.0.0.0  # default\n      - CASSANDRA_LISTEN_ADDRESS=auto  # default, use IP addr of container # = CASSANDRA_BROADCAST_ADDRESS\n```\n\nwhich may not be needed, since those are already the defaults\n
on the cassandra Docker image (see the section on [Configuring Cassandra](https://hub.docker.com/_/cassandra) on the Docker Hub page). Basically, those explicitly set the IP address of the containers to be both the listen and broadcast address. I'm just noting it here in case the defaults change.\n\nIf you are running all the nodes on the same machine, you need to map a different host port for each of them :\n```yaml\n\n  cassandra-1:\n    ...\n    ports:\n      - 7000:7000\n      - 9042:9042\n\n  cassandra-2:\n    ...\n    ports:\n      - 9043:9042\n\n  cassandra-3:\n    ...\n    ports:\n      - 9044:9042\n\n```\nOtherwise, the containers may not start correctly.\n\nStart your deployment :\n```bash\ndocker compose up -d\n```\n\nDocker Desktop :\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/containers.png?raw=true)\n\nCheck the logs of all containers to confirm that each node is healthy and has joined the cluster :\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/joiningLog.png?raw=true)\n\nThe [gossip protocol](https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/architecture/archGossipAbout.html) should detect if a particular node goes down and mark it as such, but still keep it in the list.\n\nUse nodetool to check the status of all 3 nodes :\n```bash\ndocker exec cassandra-3 nodetool status\n```\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/nodetool.png?raw=true)\n\nThe main problem with this config is that starting the nodes takes a long time.\n
In that sample Compose file, where healthcheck.interval is 2m, it takes about 5 minutes for all 3 nodes to start up properly.\n\nThe Cassandra Query Language (CQL) is very similar to SQL but suited to the JOINless structure of Cassandra.\n\nStart working with the cluster using cqlsh ...\n\nWe want to create a keyspace (the layer at which Cassandra replicates its data) and a table to hold the data, then insert some data into that table :\n```bash\ndocker exec -it cassandra-1 bash\n```\n\n```sql\n# cqlsh\n...\ncqlsh\u003e CREATE KEYSPACE IF NOT EXISTS store WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : '1' };\n\ncqlsh\u003e CREATE TABLE IF NOT EXISTS store.shopping_cart (\n        userid text PRIMARY KEY,\n        item_count int,\n        last_update_timestamp timestamp\n);\n\ncqlsh\u003e INSERT INTO store.shopping_cart\n      (userid, item_count, last_update_timestamp)\n      VALUES ('9876', 2, toTimeStamp(now()));\n\ncqlsh\u003e INSERT INTO store.shopping_cart\n      (userid, item_count, last_update_timestamp)\n      VALUES ('1234', 5, toTimeStamp(now()));\n\ncqlsh\u003e SELECT * FROM store.shopping_cart;\n\n```\nThe result on all 3 nodes :\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/cqlsh1.png?raw=true)\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/cqlsh2.png?raw=true)\n\n![alt text](https://raw.githubusercontent.com/kayvansol/Cassandra/main/img/cqlsh3.png?raw=true)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkayvansol%2Fcassandra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkayvansol%2Fcassandra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkayvansol%2Fcassandra/lists"}