{"id":29007189,"url":"https://github.com/epishova/structured-streaming-cassandra-sink","last_synced_at":"2025-07-09T09:15:28.395Z","repository":{"id":138823578,"uuid":"141181244","full_name":"epishova/Structured-Streaming-Cassandra-Sink","owner":"epishova","description":"An example of how to create and use Cassandra sink in Spark Structured Streaming application","archived":false,"fork":false,"pushed_at":"2018-10-06T18:37:06.000Z","size":54,"stargazers_count":8,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-06-25T13:07:33.386Z","etag":null,"topics":["cassandra","scala","sink","spark","structured-streaming"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/epishova.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-07-16T19:00:20.000Z","updated_at":"2022-05-31T14:55:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"c2066bc0-5893-40b9-9ebf-a4278a56b69d","html_url":"https://github.com/epishova/Structured-Streaming-Cassandra-Sink","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/epishova/Structured-Streaming-Cassandra-Sink","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epishova%2FStructured-Streaming-Cassandra-Sink","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epishova%2FStructured-Streaming-Cassandra-Sink/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/epishova%2FStructured-Streaming-Cassandra-Sink/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epishova%2FStructured-Streaming-Cassandra-Sink/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/epishova","download_url":"https://codeload.github.com/epishova/Structured-Streaming-Cassandra-Sink/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/epishova%2FStructured-Streaming-Cassandra-Sink/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264428789,"owners_count":23606692,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra","scala","sink","spark","structured-streaming"],"created_at":"2025-06-25T13:07:31.940Z","updated_at":"2025-07-09T09:15:28.389Z","avatar_url":"https://github.com/epishova.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Structured-Streaming-Cassandra-Sink\n### An example of how to create and use a Cassandra sink in a Spark Structured Streaming application\n\nThis code was developed as part of the Insight Data Engineering [project](https://github.com/epishova/FXTrue-Structured-Streaming-Insight-Project). It is a simple example of how to create and use a Cassandra sink in Spark Structured Streaming. I hope it will be useful for those who have just begun to work with the Structured Streaming API. I am new to it too, so comments and suggestions on how to improve the application are very welcome.\n\nThe idea of this application is very simple. 
It reads messages from Kafka, parses them, and saves them into Cassandra. This example was run on an AWS cluster, so if you'd like to test it, just replace the addresses of my AWS instances with yours (everything that looks like `ec2-xx-xxx-xx-xx.compute-1.amazonaws.com`).\n\nThis repo contains a `pom.xml` and can be built with Maven by running `mvn package`. After that, you can execute the application using\n`./bin/spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1,datastax:spark-cassandra-connector:2.3.0-s_2.11 --class com.insight.app.CassandraSink.KafkaToCassandra --master spark://ec2-18-232-26-53.compute-1.amazonaws.com:7077 target/cassandra-sink-0.0.1-SNAPSHOT.jar`.\n\nYou can read the detailed description in `blog_draft.md` or [here](https://dzone.com/articles/cassandra-sink-for-spark-structured-streaming).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepishova%2Fstructured-streaming-cassandra-sink","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fepishova%2Fstructured-streaming-cassandra-sink","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fepishova%2Fstructured-streaming-cassandra-sink/lists"}