{"id":15056664,"url":"https://github.com/northbrains/dashboard-1","last_synced_at":"2026-01-24T07:06:25.080Z","repository":{"id":239154202,"uuid":"781546128","full_name":"NorthBrains/dashboard-1","owner":"NorthBrains","description":"An interactive dashboard simulating live sales and warehouse data for the company.","archived":false,"fork":false,"pushed_at":"2024-10-25T04:37:32.000Z","size":3336,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-04-04T03:22:47.824Z","etag":null,"topics":["cassandra","containers","dashboard","data-engineering","data-science","docker","docker-compose","kafka","kafka-streaming","plotly","plotly-dash","spark","spark-streaming","streaming"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NorthBrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-03T15:31:17.000Z","updated_at":"2024-09-04T07:15:30.000Z","dependencies_parsed_at":"2024-05-12T06:23:19.462Z","dependency_job_id":"fcca233e-5863-411a-bc49-312777c17489","html_url":"https://github.com/NorthBrains/dashboard-1","commit_stats":null,"previous_names":["northbrains/dashboard-1"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NorthBrains%2Fdashboard-1","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NorthBrains%2Fdashboard-1/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nort
hBrains%2Fdashboard-1/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NorthBrains%2Fdashboard-1/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NorthBrains","download_url":"https://codeload.github.com/NorthBrains/dashboard-1/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254567252,"owners_count":22092738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra","containers","dashboard","data-engineering","data-science","docker","docker-compose","kafka","kafka-streaming","plotly","plotly-dash","spark","spark-streaming","streaming"],"created_at":"2024-09-24T21:54:47.879Z","updated_at":"2026-01-24T07:06:25.048Z","avatar_url":"https://github.com/NorthBrains.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1\u003eSales and Warehouse Live Data Dashboard 📊\u003c/h1\u003e\n\n\u003cimg src=\"./diagrams/dashboard.png\" alt=\"Dashboard Diagram\"\u003e\n\n\u003cp\u003e\nThis project is designed to create a live data dashboard for Sales and Warehouse using a modern data streaming and processing architecture. The architecture leverages Apache Kafka for data streaming, Apache Spark for processing, Cassandra for data storage, and Plotly Dash for creating an analytical dashboard. 
All components are containerized using Docker to ensure easy deployment and scalability.\n\u003c/p\u003e\n\n\u003ch2\u003eArchitecture Overview 🔧\u003c/h2\u003e\n\n\u003cul\u003e\n    \u003cli\u003e\u003cstrong\u003e🐳 Docker:\u003c/strong\u003e All components are containerized to ensure consistent environments across different platforms and ease of deployment.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e🐍 Python Script (Producer):\u003c/strong\u003e Acts as the data source, sending data streams to specific Kafka topics (\"Sales\" and \"Warehouse\").\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e📦 Kafka Cluster:\u003c/strong\u003e Comprises three controllers and three brokers to manage and distribute the data streams. This setup uses KRaft (Kafka Raft) mode instead of the traditional ZooKeeper-based architecture. KRaft eliminates the need for ZooKeeper by integrating consensus and metadata management directly into Kafka. This results in a simpler architecture with reduced operational complexity, improved scalability, and faster recovery from failures. The required topics are created automatically (init topics container), so no manual topic setup is needed.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e⚡ Apache Spark (Consumer):\u003c/strong\u003e Receives and processes the streaming data from Kafka. The Spark cluster consists of one master node and two workers. Spark jobs are submitted one after the other and run on the workers, where they process the data from the Kafka topics (\"Sales\" and \"Warehouse\") and insert the results into Cassandra, acting as a consumer.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e📊 Cassandra:\u003c/strong\u003e Stores the processed data from Spark. 
It offers high availability and scalability, making it ideal for real-time data storage.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e📈 Plotly Dash:\u003c/strong\u003e Provides an analytical dashboard for visualizing the data stored in Cassandra, allowing users to interact with and analyze the live data streams. The dashboard lets you switch between Sales and Warehouse live data.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e🚀 init-cassandra Container:\u003c/strong\u003e This additional container automatically creates the keyspace and the necessary tables in Cassandra when the environment is started, ensuring full automation of the setup process.\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003ch2\u003eGetting Started\u003c/h2\u003e\n\n\u003ch3\u003ePrerequisites\u003c/h3\u003e\n\u003cp\u003e🐳 Docker and Docker Compose installed on your machine\u003c/p\u003e\n\n\u003ch3\u003eSetting Up the Environment\u003c/h3\u003e\n\n\u003col\u003e\n\u003cli\u003e\u003cstrong\u003eClone the repository:\u003c/strong\u003e\n        \u003cpre\u003e\u003ccode\u003egit clone git@github.com:NorthBrains/dashboard-1.git\u003c/code\u003e\u003c/pre\u003e\n\u003c/li\u003e\n\n\u003cli\u003e\u003cstrong\u003eStart the Docker containers:\u003c/strong\u003e\n        \u003cpre\u003e\u003ccode\u003edocker-compose up -d\u003c/code\u003e\u003c/pre\u003e\n        \u003cp\u003eOnce Docker Compose is up, all services, including streaming, processing, and the dashboard, start automatically without additional configuration.\u003c/p\u003e\n\u003c/li\u003e\n\n\u003cli\u003e\u003cstrong\u003eSet Up Cassandra Keyspace and Tables:\u003c/strong\u003e\n        \u003cp\u003eAfter the Cassandra container is up and running, the \u003cstrong\u003einit-cassandra\u003c/strong\u003e container will automatically create the necessary keyspace and tables. 
This container ensures that the database schema is initialized properly without manual intervention.\u003c/p\u003e\n\u003c/li\u003e\n\u003c/ol\u003e\n\n\u003ch3\u003eChecking if the Streaming is Working\u003c/h3\u003e\n\u003cp\u003eTo verify that streaming is working, open a cqlsh session in the Cassandra container and query the data:\u003c/p\u003e\n\u003cpre\u003e\u003ccode\u003edocker exec -it cassandra_one cqlsh -u cassandra -p cassandra\u003c/code\u003e\u003c/pre\u003e\n\u003cpre\u003e\u003ccode\u003eSELECT * FROM company_one.sales_data;\u003c/code\u003e\u003c/pre\u003e\n\u003cpre\u003e\u003ccode\u003eSELECT * FROM company_one.warehouse_data;\u003c/code\u003e\u003c/pre\u003e\n\n\u003ch3\u003eAccessing the Spark Master GUI\u003c/h3\u003e\n\u003cp\u003eYou can access the Spark Master GUI by navigating to \u003ca href=\"http://localhost:8190\"\u003ehttp://localhost:8190\u003c/a\u003e in your web browser.\u003c/p\u003e\n\u003cp\u003eThis interface allows you to monitor the status of running Spark workers and applications. 
You can check the health and performance of the Spark cluster, including the details of each worker node, active jobs, stages, and tasks.\u003c/p\u003e\n\n\u003ch3\u003eAccessing the Dashboard\u003c/h3\u003e\n\u003cp\u003eOnce all services are up and running, you can access the Plotly Dash dashboard by navigating to \u003ca href=\"http://localhost:8900\"\u003ehttp://localhost:8900\u003c/a\u003e in your web browser.\u003c/p\u003e\n\n\u003ch3\u003eStopping the Environment\u003c/h3\u003e\n\u003cp\u003eTo stop and remove all running containers, execute:\u003c/p\u003e\n\u003cpre\u003e\u003ccode\u003edocker-compose down\u003c/code\u003e\u003c/pre\u003e\n\n\u003ch2\u003eAdditional Information\u003c/h2\u003e\n\n\u003cul\u003e\n    \u003cli\u003e\u003cstrong\u003e📦 Kafka:\u003c/strong\u003e Ensure that the topics are correctly initialized and the data is being streamed to the appropriate topics (\u003ccode\u003eSales\u003c/code\u003e and \u003ccode\u003eWarehouse\u003c/code\u003e).\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e⚡ Spark:\u003c/strong\u003e The Spark jobs should be configured to read from Kafka, process the data, and write the results to Cassandra.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e📊 Cassandra:\u003c/strong\u003e Regularly monitor the storage and performance to ensure that it scales according to the incoming data volume.\u003c/li\u003e\n    \u003cli\u003e\u003cstrong\u003e📈 Dash:\u003c/strong\u003e Customize the dashboards as needed to include more visualizations or interactive elements that suit your data analysis needs.\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003ch2\u003eContributions\u003c/h2\u003e\n\u003cp\u003eFeel free to contribute to this project by submitting issues or pull requests. 
All contributions are welcome and appreciated!\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnorthbrains%2Fdashboard-1","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnorthbrains%2Fdashboard-1","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnorthbrains%2Fdashboard-1/lists"}