{"id":27370221,"url":"https://github.com/datastaxdevs/workshop-cassandra-data-modeling","last_synced_at":"2025-04-13T08:48:23.244Z","repository":{"id":43603219,"uuid":"306084327","full_name":"datastaxdevs/workshop-cassandra-data-modeling","owner":"datastaxdevs","description":"This session looks at how to effectively design a data model for your application. You’ll leave knowing how to create data models that scale effectively as your system grows ","archived":false,"fork":false,"pushed_at":"2023-02-11T01:10:45.000Z","size":15668,"stargazers_count":21,"open_issues_count":1,"forks_count":12,"subscribers_count":5,"default_branch":"main","last_synced_at":"2023-03-04T04:05:51.286Z","etag":null,"topics":["cassandra","cql","nosql","workshop"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datastaxdevs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-10-21T16:29:22.000Z","updated_at":"2023-03-03T10:10:46.000Z","dependencies_parsed_at":"2023-02-12T06:45:49.498Z","dependency_job_id":null,"html_url":"https://github.com/datastaxdevs/workshop-cassandra-data-modeling","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datastaxdevs%2Fworkshop-cassandra-data-modeling","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datastaxdevs%2Fworkshop-cassandra-data-modeling/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datastaxdevs%2Fworkshop-cassandra-data-modeling/releases","manifests_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/datastaxdevs%2Fworkshop-cassandra-data-modeling/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datastaxdevs","download_url":"https://codeload.github.com/datastaxdevs/workshop-cassandra-data-modeling/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248688195,"owners_count":21145762,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra","cql","nosql","workshop"],"created_at":"2025-04-13T08:48:22.637Z","updated_at":"2025-04-13T08:48:23.236Z","avatar_url":"https://github.com/datastaxdevs.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🎓🎓 Apache Cassandra® Data Modelling\n\nWelcome to the **Apache Cassandra® Data Modelling** workshop! In this two-hour workshop, we show the methodology to build an effective data model with the distributed `NoSQL database Apache Cassandra®`.\n\nUsing **Astra DB**, the cloud-based _Cassandra-as-a-Service_ platform delivered by DataStax, we will cover the process for every developer who wants to build an application: list the use cases and build an effective data model.\n\n![](images/splash.png)\n\nIt doesn't matter if you join our workshop live or you prefer to work at your own pace, we have you covered. In this repository, you'll find everything you need for this workshop:\n\n\u003e [🔖 Accessing HANDS-ON](#-start-hands-on)\n\n## 📋 Table of content\n\n\u003cimg src=\"images/illustrations.png?raw=true\" align=\"right\" width=\"300px\"/\u003e\n\n1. 
[Objectives](#1-objectives)\n2. [Frequently Asked Questions](#2-frequently-asked-questions)\n3. [Materials for the Session](#3-materials-for-the-session)\n4. [Create Your Astra DB Instance](#4-create-your-astra-db-instance)\n5. [Tables with Single-Row and Multi-Row Partitions](#5-tables-with-single-row-and-multi-row-partitions)\n6. [Dynamic Bucketing](#6-dynamic-bucketing)\n7. [Working with Data Types](#7-working-with-data-types)\n8. [KDM Data Modeling Tool](#8-kdm-data-modeling-tool)\n9. [Sensor Data Modeling](#9-sensor-data-modeling)\n10. [Homework](#10-homework)\n11. [What's NEXT ](#11-whats-next-)\n\u003cp\u003e\u003cbr/\u003e\n\n## 1. Objectives\n\n1️⃣ **Understand how data is distributed and organized in Apache Cassandra®**\n\n2️⃣ **Learn how primary, partition, and clustering keys are defined in Apache Cassandra®**\n\n3️⃣ **Become familiar with CQL data types in Apache Cassandra®**\n\n4️⃣ **Learn about the data modeling methodology for Apache Cassandra®**\n\n🚀 **Have fun with an interactive session**\n\n## 2. Frequently Asked Questions\n\n\u003cp/\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e 1️⃣ Can I run this workshop on my computer?\u003c/b\u003e\u003c/summary\u003e\n\u003chr\u003e\n\u003cp\u003eThere is nothing preventing you from running the workshop on your own machine. If you do so, you will need the following:\n\u003col\u003e\n\u003cli\u003e\u003cb\u003egit\u003c/b\u003e installed on your local system\n\u003c/ol\u003e\n\u003c/p\u003e\nIn this README, we try to provide instructions for local development as well, but keep in mind that the main focus is development on Gitpod, so \u003cstrong\u003ewe can't guarantee live support\u003c/strong\u003e for local development, in order to stay on schedule. 
However, we will do our best to give you the info you need to succeed.\n\u003c/details\u003e\n\u003cp/\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e 2️⃣ What other prerequisites are required?\u003c/b\u003e\u003c/summary\u003e\n\u003chr\u003e\n\u003cul\u003e\n\u003cli\u003eYou will need enough \"real estate\" on screen: we will ask you to open a few windows, which will not fit on a mobile phone (tablets should be OK)\n\u003cli\u003eYou will need an Astra account: don't worry, we'll walk through that in the steps below\n\u003cli\u003eAs an \"Intermediate level\" workshop, we expect you to know what Java and Spring are.\n\u003c/ul\u003e\n\u003c/p\u003e\n\u003c/details\u003e\n\u003cp/\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e 3️⃣ Do I need to pay for anything for this workshop?\u003c/b\u003e\u003c/summary\u003e\n\u003chr\u003e\n\u003cb\u003eNo.\u003c/b\u003e All tools and services we provide here are FREE. FREE not only during the session but also after.\n\u003c/details\u003e\n\u003cp/\u003e\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003e 4️⃣ Will I get a certificate if I attend this workshop?\u003c/b\u003e\u003c/summary\u003e\n\u003chr\u003e\nAttending the session is not enough. You need to complete the homework detailed below and you will get a nice badge that you can share on LinkedIn or anywhere else.\n\u003c/details\u003e\n\u003cp/\u003e\n\n## 3. Materials for the Session\n\nIt doesn't matter if you join our workshop live or you prefer to work at your own pace,\nwe have you covered. In this repository, you'll find everything you need for this workshop:\n\n- [Slide deck](/slides/slides.pdf)\n- [Discord chat](https://dtsx.io/discord)\n- [Questions and Answers](https://community.datastax.com/)\n\n----\n\n# 🏁 Start Hands-on\n\n## 4. Create Your Astra DB Instance\n\n_**`ASTRA DB`** is the simplest way to run Cassandra with zero operations at all - just push the button and get your cluster. 
No credit card required, 40M read/write operations and about 80GB storage monthly for free - sufficient to run small production workloads. If you use up your credits, the databases will pause at no charge._\n\nFollowing the [Database creation guide](https://awesome-astra.github.io/docs/pages/astra/create-instance/#c-procedure), create a database. *Right-click the button* and choose *Open in a new tab*.\n\n\u003ca href=\"https://astra.dev/yt-7-27\"\u003e\u003cimg src=\"images/create_astra_db_button.png?raw=true\" /\u003e\u003c/a\u003e\n\n|Field|Value|\n|---|---|\n|**Database Name**| `workshops`|\n|**Keyspace Name**| `sensor_data`|\n|**Regions**| Select `GOOGLE CLOUD`, then an area close to you, then a region with no lock 🔒 icon: those are the regions you can use for free.   |\n\n\u003e **ℹ️ Note:** If you already have a database `workshops`, simply add a keyspace `sensor_data` using the `Add Keyspace` button in the bottom right-hand corner of the database dashboard page.\n\nWhile the database is being created, you will also get a **Security token**:\nsave it somewhere safe, as it will be needed later in other workshops (in particular, the string starting with `AstraCS:...`).\n\n\u003e **⚠️ Important**\n\u003e ```\n\u003e The instructor will show you on screen how to create a token\n\u003e but will have to destroy the token immediately for security reasons.\n\u003e ```\n\nThe status will change from `Pending` to `Active` when the database is ready; this only takes 2-3 minutes. You will also receive an email when it is ready.\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 5. 
Tables with Single-Row and Multi-Row Partitions\n\nA [GitHub](https://github.com) account may be required to run this hands-on lab in Gitpod.\n\n### ✅ Part 1: [Tables with Single-Row Partitions](https://gitpod.io/#https://github.com/DataStax-Academy/workshop-cassandra-data-modeling-tables-single-row-partitions/)\n\n### ✅ Part 2: [Tables with Multi-Row Partitions](https://gitpod.io/#https://github.com/DataStax-Academy/workshop-cassandra-data-modeling-tables-multi-row-partitions/)\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 6. Dynamic Bucketing\n\n\u003cdetails\u003e\n  \u003csummary\u003e \u003ch3\u003e📌 Homework 1\u003c/h3\u003e\u003c/summary\u003e\n\nConsider the table that supports the query `Find all sensors in a specified network`:\n```sql\nCREATE TABLE sensors_by_network_2 (\n  network TEXT,\n  sensor TEXT,\n  PRIMARY KEY ((network), sensor)\n);\n```\n\nAssume that a network may have anywhere from zero to millions of sensors. With dynamic bucketing, we can introduce artificial buckets to store sensors. A network with a few sensors may only need one bucket. A network with many sensors may need many buckets. 
Once buckets belonging to a particular network get filled with sensors, we can dynamically assign new buckets to store new sensors of this network.\n\n📘 **Implement dynamic bucketing in Astra DB**\n```sql\n-- Table to manage buckets\nCREATE TABLE buckets_by_network (\n  network TEXT,\n  bucket TIMEUUID,\n  PRIMARY KEY ((network), bucket)\n) WITH CLUSTERING ORDER BY (bucket DESC);\n\n-- Table to store sensors\nCREATE TABLE sensors_by_bucket (\n  bucket TIMEUUID,\n  sensor TEXT,\n  PRIMARY KEY ((bucket), sensor)\n);\n\n\n-- Sample data\nINSERT INTO buckets_by_network (network, bucket) VALUES ('forest-net', 49171ffe-0d12-11ed-861d-0242ac120002);\nINSERT INTO buckets_by_network (network, bucket) VALUES ('forest-net', 74a13ede-0d12-11ed-861d-0242ac120002);\n\nINSERT INTO sensors_by_bucket (bucket, sensor) VALUES (49171ffe-0d12-11ed-861d-0242ac120002, 's1001');\nINSERT INTO sensors_by_bucket (bucket, sensor) VALUES (49171ffe-0d12-11ed-861d-0242ac120002, 's1002');\n\nINSERT INTO sensors_by_bucket (bucket, sensor) VALUES (74a13ede-0d12-11ed-861d-0242ac120002, 's1003');\n```\n\n📘 **Add a new sensor to a network**\n\n1. Get the latest bucket.\n```sql\nSELECT bucket FROM buckets_by_network WHERE network = 'forest-net' LIMIT 1;\n```\n\n2. Check the number of sensors in the bucket.\n```sql\nSELECT COUNT(*) AS sensors\nFROM sensors_by_bucket WHERE bucket = 74a13ede-0d12-11ed-861d-0242ac120002;\n```\n\n3. Depending on the sensors-per-bucket threshold, insert a new sensor into the existing bucket, or create a new bucket and insert into the new bucket.\n```sql\nINSERT INTO sensors_by_bucket (bucket, sensor) VALUES (74a13ede-0d12-11ed-861d-0242ac120002, 's1004');\n```\n\n📘 **Retrieve sensors in a specified network**\n\n1. Retrieve the buckets\n```sql\nSELECT bucket FROM buckets_by_network WHERE network = 'forest-net';\n```\n\n2. 
Retrieve the sensors\n```sql\nSELECT sensor\nFROM sensors_by_bucket\nWHERE bucket IN (74a13ede-0d12-11ed-861d-0242ac120002, 49171ffe-0d12-11ed-861d-0242ac120002);\n```\n\n\u003c/details\u003e\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 7. Working with Data Types\n\n\u003cdetails\u003e\n  \u003csummary\u003e \u003ch3\u003e📌 Homework 2\u003c/h3\u003e\u003c/summary\u003e\n\n### ✅ Step 7a. `List` Collections\n\n📘 **Command to execute**\n\n```sql\n// Definition\nCREATE TABLE IF NOT EXISTS table_with_list (\n  uid      uuid,\n  items    list\u003ctext\u003e,\n  PRIMARY KEY (uid)\n);\n\n// Insert\nINSERT INTO table_with_list(uid,items)\nVALUES (c7133017-6409-4d7a-9479-07a5c1e79306, ['a', 'b', 'c']);\n\n// Replace\nUPDATE table_with_list SET items = ['d', 'e']\nWHERE uid = c7133017-6409-4d7a-9479-07a5c1e79306;\n\n// Show result\nSELECT * FROM table_with_list;\n\n// Append to list\nUPDATE table_with_list SET items = items + ['f']\nWHERE uid = c7133017-6409-4d7a-9479-07a5c1e79306;\n\n// Replace an element (not available in Astra because it requires a read before write)\n// The new value is a single text element, not a list\nUPDATE table_with_list SET items[0] = 'g'\nWHERE uid = c7133017-6409-4d7a-9479-07a5c1e79306;\n```\n\n### ✅ Step 7b. `Set` Collections\n\n📘 **Command to execute**\n\n```sql\n// Definition\nCREATE TABLE IF NOT EXISTS table_with_set (\n  uid      uuid,\n  animals  set\u003ctext\u003e,\n  PRIMARY KEY (uid)\n);\n\n// Insert\nINSERT INTO table_with_set(uid,animals)\nVALUES (87fad746-4adf-4107-9858-df8643564186, {'spider', 'cat', 'dog'});\n\n// Replace\nUPDATE table_with_set SET animals = {'pangolin', 'bat'}\nWHERE uid = 87fad746-4adf-4107-9858-df8643564186;\n\n// Show result\nSELECT * FROM table_with_set;\n\n// Add to set\nUPDATE table_with_set SET animals = animals + {'sheep'}\nWHERE uid = 87fad746-4adf-4107-9858-df8643564186;\n```\n\n### ✅ Step 7c. 
`Map` Collections\n\n📘 **Command to execute**\n\n\n```sql\n// Definition\nCREATE TABLE IF NOT EXISTS table_with_map (\n  uid         text,\n  dictionary  map\u003ctext, text\u003e,\n  PRIMARY KEY (uid)\n);\n\n// Insert\nINSERT INTO table_with_map(uid, dictionary)\nVALUES ('fr_en', {'fromage':'cheese', 'vin':'wine', 'pain':'bread'});\n\n// Replace\nUPDATE table_with_map SET dictionary = {'saucisse': 'sausage'}\nWHERE uid = 'fr_en';\n\n// Show result\nSELECT * FROM table_with_map;\n\n// Add to map\nUPDATE table_with_map SET dictionary = dictionary + {'frites':'fries'}\nWHERE uid = 'fr_en';\n```\n\n### ✅ Step 7d. User-Defined Types\n\n📘 **Command to execute**\n\n```sql\n// Definition\nCREATE TYPE IF NOT EXISTS udt_address (\n  street text,\n  city text,\n  state text\n);\n\n// Use the UDT in a table\nCREATE TABLE IF NOT EXISTS table_with_udt (\n  uid      text,\n  address   udt_address,\n  PRIMARY KEY (uid)\n);\n\n// Insert (no quotes on field names like street)\nINSERT INTO table_with_udt(uid, address)\nVALUES ('superman', {street:'daily planet',city:'metropolis',state:'CA'});\n\n// Replace\nUPDATE table_with_udt\nSET address = {street:'pingouin alley',city:'antarctica',state:'melting'}\nWHERE uid = 'superman';\n\n// Replace a single field\nUPDATE table_with_udt\nSET address.state = 'melt'\nWHERE uid = 'superman';\n```\n\n### ✅ Step 7e. Counters\n\n📘 **Command to execute**\n\n```sql\n// Definition\nCREATE TABLE IF NOT EXISTS table_with_counters (\n  handle        text,\n  following     counter,\n  followers     counter,\n  notifications counter,\n  PRIMARY KEY (handle)\n);\n\n// You have a new follower\nUPDATE table_with_counters SET followers = followers + 1\nWHERE handle = 'clunven';\n\n// Some counters are... null\nSELECT * FROM table_with_counters;\n\n// Set to 0... 
but set is not valid\nUPDATE table_with_counters\nSET following = following + 0, notifications = notifications + 0\nWHERE handle = 'clunven';\n\n// Following someone\nUPDATE table_with_counters SET following = following + 1\nWHERE handle = 'clunven';\n\n// You have a new message\nUPDATE table_with_counters SET notifications = notifications + 1\nWHERE handle = 'clunven';\n\n```\n\n\u003c/details\u003e\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 8. KDM Data Modeling Tool\n\n\u003cdetails\u003e\n  \u003csummary\u003e \u003ch3\u003e🍿 Demo\u003c/h3\u003e\u003c/summary\u003e\n\n### ✅ Download [the project XML file](https://raw.githubusercontent.com/datastaxdevs/workshop-cassandra-data-modeling/main/materials/kdm_sensor_data.xml).\n\n### ✅ Open [the KDM tool](http://kdm.kashliev.com/).\n\n### ✅ Import the project by selecting `Import Project` from the menu and specifying file `kdm_sensor_data.xml`.\n\n![](images/kdm_01.png)\n\n![](images/kdm_02.png)\n\n### ✅ Explore the five data modeling steps supported by KDM. Note that the conceptual data model in Step 1 and queries in Step 2 are already defined.\n\n![](images/kdm_03.png)\n\n\u003c/details\u003e\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 9. Sensor Data Modeling\n\nA [GitHub](https://github.com) account may be required to run this hands-on lab in Gitpod.\n\n### ✅ [Sensor Data Modeling](https://gitpod.io/#https://github.com/DataStax-Academy/workshop-cassandra-data-modeling-sensor-data/)\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n\n## 10. Homework\n\n1. Complete [Working with Data Types](#7-working-with-data-types). Take a screenshot of the CQL Console showing the rows in tables\n`table_with_udt` and `table_with_counters` before _and_ after executing the DELETE statements.\n\n2. Complete the mini-course [Time Series Data Modeling](https://www.datastax.com/learn/data-modeling-by-example/time-series-model). 
Take a screenshot of the final screen of the practice lab, with the console output on the right.\n\n3. [Submit your homework](https://forms.gle/Z69y4MM3SpEDg7nt5) and be awarded a nice verifiable badge!\n\n[🏠 Back to Table of Contents](#-table-of-content)\n\n## 11. What's NEXT ?\n\nWe've just scratched the surface of what you can do using Astra DB, built on Apache Cassandra.\n\nGo take a look at [DataStax for Developers](https://www.datastax.com/dev) to see what else is possible.\nThere's plenty to dig into!\n\nCongratulations: you made it to the end of today's workshop.\n\n![Badge](images/badge_data_modeling.png)\n\n**... and see you at our next workshop!**\n\n\u003e Sincerely yours, The DataStax Developers\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatastaxdevs%2Fworkshop-cassandra-data-modeling","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatastaxdevs%2Fworkshop-cassandra-data-modeling","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatastaxdevs%2Fworkshop-cassandra-data-modeling/lists"}