Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/knands42/data-modeling-with-cassandra
Build a simple ETL pipeline and model a few Cassandra databases to receive these values and query them
- Host: GitHub
- URL: https://github.com/knands42/data-modeling-with-cassandra
- Owner: knands42
- Created: 2024-09-18T02:31:46.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-10-09T02:32:24.000Z (4 months ago)
- Last Synced: 2025-01-21T01:31:40.151Z (14 days ago)
- Topics: cassandra, etl, python
- Language: Jupyter Notebook
- Homepage:
- Size: 590 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Data-Modeling-With-Cassandra
Solution to the first project, `Data Modeling (with Apache Cassandra)`, from [Data Engineering with AWS](https://www.udacity.com/enrollment/nd027) by [Udacity](https://www.udacity.com/).
### Project Overview
Create a NoSQL database in Apache Cassandra for a music streaming start-up called Sparkify. Model song and user activity data to optimize queries for understanding user behavior, such as which songs users are listening to.
* Build an ETL pipeline to transform a set of CSV files into a denormalized dataset
* Design and create Apache Cassandra data tables to answer specified business questions
* Insert data from the new dataset into the Apache Cassandra tables
* Test by running `SELECT` statements to verify the data inserted into each table
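
The steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names, the `songs_by_session` table, its primary key, and the helper functions `merge_event_files` and `load_and_verify` are all hypothetical, since this README does not list the CSV schema or the business questions. The `session` argument is assumed to be a `cassandra-driver` session that is already connected to a keyspace.

```python
import csv
import glob
import os

# Hypothetical column names; the real CSV schema is not given in this README.
COLUMNS = ["session_id", "item_in_session", "artist", "song", "user_id"]


def merge_event_files(input_dir, output_path, key_column="artist"):
    """Combine every CSV in input_dir into one file, skipping rows with an empty key."""
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=COLUMNS, extrasaction="ignore")
        writer.writeheader()
        for path in sorted(glob.glob(os.path.join(input_dir, "*.csv"))):
            # Never read the output file back in if it lives in input_dir.
            if os.path.abspath(path) == os.path.abspath(output_path):
                continue
            with open(path, newline="", encoding="utf-8") as src:
                for row in csv.DictReader(src):
                    if not row.get(key_column):
                        continue
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


# Hypothetical table built for one query: "which song played at this point
# in this session?". In Cassandra the table is designed around the query.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS songs_by_session (
    session_id int,
    item_in_session int,
    artist text,
    song text,
    user_id int,
    PRIMARY KEY (session_id, item_in_session)
)
"""

INSERT_ROW = (
    "INSERT INTO songs_by_session "
    "(session_id, item_in_session, artist, song, user_id) "
    "VALUES (%s, %s, %s, %s, %s)"
)

SELECT_ROW = (
    "SELECT artist, song FROM songs_by_session "
    "WHERE session_id = %s AND item_in_session = %s"
)


def load_and_verify(session, csv_path, session_id, item_in_session):
    """Create the table, insert each CSV row, then read one row back."""
    session.execute(CREATE_TABLE)
    with open(csv_path, newline="", encoding="utf-8") as src:
        for row in csv.DictReader(src):
            session.execute(
                INSERT_ROW,
                (
                    int(row["session_id"]),
                    int(row["item_in_session"]),
                    row["artist"],
                    row["song"],
                    int(row["user_id"]),
                ),
            )
    return list(session.execute(SELECT_ROW, (session_id, item_in_session)))
```

With the `cassandra-driver` package installed and a local node running, a session could be obtained with `Cluster(["127.0.0.1"]).connect("sparkify")`, where `Cluster` comes from `cassandra.cluster` and `sparkify` is an assumed keyspace name that must already exist. The `SELECT` filters on the full primary key, so it does not need `ALLOW FILTERING`.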