Projects in Awesome Lists tagged with aws-redshift
A curated list of projects in awesome lists tagged with aws-redshift.
https://github.com/alanchn31/Data-Engineering-Projects
Personal Data Engineering Projects
airflow aws-redshift cassandra data-engineering data-engineering-nanodegree data-lake data-modeling data-warehouse ingest-data mongodb postgres scrapy spark star-schema
Last synced: 16 Apr 2025
https://github.com/tokern/piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub
aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake
Last synced: 12 Apr 2025
https://github.com/aws/amazon-redshift-python-driver
Redshift Python Connector. It supports Python Database API Specification v2.0.
amazon-redshift aws-redshift data-analysis data-science
Last synced: 14 May 2025
https://github.com/alanchn31/Movalytics-Data-Warehouse
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
airflow analytics aws-redshift aws-s3 data-engineer-nanodegree data-engineering data-engineering-pipeline data-modelling data-warehouse-cloud docker movie-database movie-recommendation movie-reviews pyspark python3 redshift spark sql udacity
Last synced: 29 Jul 2025
https://github.com/wittline/uber-expenses-tracking
The goal of this project is to track the expenses of Uber Rides and Uber Eats through data engineering processes using technologies such as Apache Airflow, AWS Redshift and Power BI.
airflow-docker apache-airflow aws aws-redshift data-engineering data-modeling etl-pipeline expenses-dashboard expenses-tracker power-bi python uber uber-data uber-eats
Last synced: 13 Apr 2025
https://github.com/aws-solutions/clickstream-analytics-on-aws
Clickstream Analytics on AWS source code
aws aws-amplify aws-cdk aws-clickstream-solution aws-emr-serverless aws-kinesis-stream aws-msk aws-quicksight aws-redshift aws-solutions clickstream data-analysis web-analysis web-analytics
Last synced: 23 Aug 2025
https://github.com/kenthsu/udacity-data-engineering-nanodgree
Udacity Data Engineering Nanodegree Program
apache-airflow apache-cassandra apache-spark aws-redshift aws-s3 data-engineering data-lake data-pipelines data-quality data-warehouses postgresql
Last synced: 10 Apr 2025
https://github.com/heroku-examples/analytics-with-kafka-redshift-metabase
An example system that captures a large stream of product usage data, or events, and provides both real-time data visualization and SQL-based data analytics.
aws-redshift data-analytics data-visualization heroku kafka metabase
Last synced: 30 Apr 2025
https://github.com/moritzkoerber/covid-19-data-engineering-pipeline
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
apache-airflow apache-spark api aws aws-cdk aws-cloudformation aws-ecr aws-glue aws-lambda aws-redshift aws-s3 docker great-expectations pyspark spark
Last synced: 28 Apr 2025
https://github.com/lovenui/dataengineering-capstone-project
airflow aws-redshift aws-s3 data-engineering python spark sql
Last synced: 11 Apr 2025
https://github.com/kishlayjeet/zomato-twitter-sentiment-analysis-data-pipeline
This project provides valuable customer sentiment insights for Zomato by tracking and analyzing tweets related to their brand and services.
airflow aws-lambda aws-redshift aws-s3 boto3 data-engineering data-pipeline etl nltk pandas psycopg2 python selenium sentiment-data-pipeline twitter-data-pipeline twitter-sentiment-analysis vedar-lexicon zomato-data-analysis zomato-data-pipeline
Last synced: 19 Aug 2025
https://github.com/vsouza/spark-kinesis-redshift
Example project for consuming an AWS Kinesis stream and saving the data to Amazon Redshift using Apache Spark
aws aws-kinesis aws-kinesis-stream aws-redshift etl etl-pipeline python shell spark spark-streaming
Last synced: 24 Apr 2025
https://github.com/kishaningithub/rdapp
rdapp - Redshift Data API Postgres Proxy
aws aws-redshift go golang hacktoberfest redshift
Last synced: 14 Apr 2025
https://github.com/lenguyenthedat/aws-redshift-to-rds
A simple command-line tool to copy tables from Amazon Redshift to Amazon RDS (PostgreSQL).
amazon-rds amazon-redshift aws aws-redshift haskell rds
Last synced: 13 Apr 2025
https://github.com/federicoserini/dend-project-3-data-warehouse-aws
Project 3 - Data Engineering Nanodegree
aws aws-redshift aws-s3 data-engineering udacity-nanodegree
Last synced: 12 Jun 2025
https://github.com/federicoserini/dend-project-5-data-pipelines
Project 5 - Data Engineering Nanodegree
apache-airflow aws aws-redshift aws-s3 data-engineering data-pipelines udacity-nanodegree
Last synced: 22 Apr 2025
https://github.com/mikecerton/the-retail-elt-pipeline-end-to-end
This project designs and implements an ETL pipeline using Apache Airflow (Docker Compose) to ingest, process, and store retail data. AWS S3 acts as the data lake, AWS Redshift as the data warehouse, and Looker Studio as the visualization layer. [Data Engineer]
apache-airflow aws-redshift aws-s3 data-engineer etl-pipeline looker-studio
Last synced: 02 Apr 2025
https://github.com/giulic3/data-engineering-nanodegree
Projects completed for the Data Engineering Nanodegree offered by Udacity https://www.udacity.com/course/data-engineer-nanodegree--nd027
apache-airflow apache-cassandra apache-spark aws aws-emr aws-redshift aws-s3 data-engineering postgresql
Last synced: 15 Aug 2025
https://github.com/hanan-nawaz/flighttragedyanalysis
Flight Tragedy Analysis is a comprehensive data analysis project focused on examining aviation accidents and incidents from 1905 to 2009. This project provides users with valuable insights into historical plane crashes and their associated data.
airplane-crashes aws aws-redshift aws-s3 data-engineering etl kaggle postgresql power-bi psycopg2 python sql
Last synced: 08 Sep 2025
https://github.com/kingyiusuen/udacity-data-engineering-nanodegree
Projects for Udacity's Data Engineering Nanodegree
airflow aws aws-athena aws-glue aws-redshift aws-s3 cassandra data-engineering spark
Last synced: 18 Oct 2025
https://github.com/marcy-terui/catlass
Cloud Automation as Code with Cloud Automator
automation aws aws-ec2 aws-rds aws-redshift aws-s3 aws-sqs infrastructure-as-code
Last synced: 14 Mar 2025
https://github.com/andre-marcos-perez/ifood-arch-readme
The application is the documentation of my solution for the iFood data architect test.
aws aws-athena aws-cloudwatch aws-emr aws-lambda aws-rds aws-redshift aws-s3 aws-step-functions
Last synced: 08 Oct 2025
https://github.com/eljandoubi/airflow-data-pipeline
Airflow data pipeline
airflow aws-redshift postgresql
Last synced: 08 Aug 2025
https://github.com/rathod-shubham/amazon-cloud
List of amazing AWS Services that can be utilized.🚀
amazon amazon-cloudfront amazon-cloudwatch amazon-s3 amazon-sagemaker amazon-sns amazon-web-services amazoncloud automation aws aws-amplify aws-apigateway aws-appsync aws-cognito aws-comprehend aws-deeplens aws-greengrass aws-lambda aws-redshift aws-s3
Last synced: 12 Aug 2025
https://github.com/jibbs1703/awsresourcemanager
This repository contains the Python modules and packages that make up the AWS Resource Manager, a custom Python package/wheel designed to simplify the management of AWS services through custom-written use cases and utilities. This repository serves to reinforce my knowledge of building Python packages and wheels.
aws-credentials aws-dynamodb aws-ec2 aws-management aws-redshift aws-resources aws-s3
Last synced: 14 Mar 2025
https://github.com/desininja/airline-data-ingestion-pipeline
ETL pipeline using AWS services.
aws aws-glue aws-redshift aws-s3 aws-step-function data-engineering etl
Last synced: 12 Nov 2025
https://github.com/shrikantnaidu/data-warehousing-with-aws
Data Warehousing with AWS
aws aws-redshift data-warehousing etl-pipeline iaac
Last synced: 12 Oct 2025
https://github.com/vermicida/data-warehouse
Data Warehouse: the code for project #3 of Udacity's Data Engineer Nanodegree Program
aws-redshift data-engineering data-warehouse etl-pipeline python
Last synced: 15 May 2025
https://github.com/bhawnamehbubani/kafka-spark-redshift-streaming-data-ingestion-project
This project is a real-time data pipeline designed for ingesting, processing, and storing telecom call records. It integrates Apache Kafka, Apache Spark Streaming, and AWS Redshift to handle large volumes of streaming data in near real-time. The pipeline is containerized with Docker Compose, enabling easy deployment, scalability, and modularity.
apache-kafka apache-spark aws-redshift docker spark-streaming
Last synced: 21 Mar 2025
https://github.com/maxinexiong/cloud-data-warehousing-with-aws-redshift
This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.
aws-boto3 aws-redshift aws-s3 cloud-data-warehouse data-engineering data-warehouse data-warehousing dimensional-model dimensional-modeling etl etl-pipeline extract-transform-load infrastructure-as-code postgresql postgresql-database redshift-cluster
Last synced: 16 Jul 2025
https://github.com/epomatti/aws-redshift
AWS Redshift
aws aws-redshift redshift s3 terraform
Last synced: 24 Dec 2025
https://github.com/exasol/redshift-virtual-schema
Virtual Schema for connecting Redshift as a data source to Exasol
aws-redshift exasol exasol-integration redshift virtual-schema
Last synced: 03 Mar 2025
https://github.com/eljandoubi/aws-data-warehouse
Build an ETL pipeline for a database hosted on AWS Redshift.
aws-redshift etl-pipeline postgresql
Last synced: 10 Jul 2025
https://github.com/frankfarrell/redshift-js
TypeScript library for some Redshift-specific tasks
aws-redshift redshift redshift-table redshift-unload
Last synced: 06 Apr 2025
https://github.com/santiagortiiz/platzi-aws-redshift
Platzi. School of Amazon Web Services. Redshift for Big Data management.
aws aws-redshift big-data platzi redshift
Last synced: 09 Sep 2025
https://github.com/gares95/data-warehouse_aws-redshift
Building an ETL pipeline for a database hosted on Redshift. Project based on Udacity's template.
aws-redshift data-warehouse redshift udacity-nanodegree
Last synced: 19 Feb 2025
https://github.com/jibbs1703/tickit-data-lake
This repository demonstrates the creation of a robust, 3-tier data lake using an Orchestrator and AWS resources. It collects data from an on-premises NoSQL database and loads it into a SQL database in AWS.
aws-glue aws-glue-crawler aws-glue-data-catalog aws-redshift aws-s3 boto3 data-lake database etl-pipeline medallion-architecture mongodb precommit-hooks
Last synced: 20 Mar 2025
https://github.com/martinkalema/mysql-kafka-s3-redshift-data-pipeline
ETL pipeline
aws-redshift aws-s3 kafka mysql-database streaming
Last synced: 28 Feb 2025
https://github.com/jibbs1703/weather-gas-etl-pipeline
This repository contains an ETL pipeline for collecting, transforming and storing hourly weather and atmospheric gas data. The pipeline uses Docker containerization and AWS cloud infrastructure resources, and is orchestrated using Apache Airflow.
airflow aws-ec2 aws-redshift aws-s3 data-engineering docker docker-compose docker-image etl-automation etl-pipeline restful-api
Last synced: 20 Mar 2025
https://github.com/guledim/super-cafe-etl-aws
In this group project simulating a real-world setting, we built a scalable ETL pipeline to process daily CSV transactions into a centralized PostgreSQL database. We used Docker, Grafana for visualization, and later implemented AWS cloud services to deploy a scalable, cloud-based ETL system.
aws aws-ec2 aws-lambda aws-redshift aws-s3 docker etl-pipeline etl-pipeline-automation grafana group-project python sql
Last synced: 15 Jun 2025
https://github.com/githarsh53/smart-city-realtime-project
Smart City Realtime Data Engineering Project
aws aws-athena aws-ec2 aws-glue aws-glue-crawler aws-glue-data-catalog aws-quicksight aws-redshift aws-s3 kafka pyspark python spark spark-streaming yaml
Last synced: 09 Apr 2025
https://github.com/mihirkudale/youtube-analysis-data-engineering-project
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.
aws aws-athena aws-glue aws-lambda aws-quicksight aws-redshift aws-s3 data-engineering python youtube-analysis
Last synced: 20 Mar 2025
https://github.com/najuzilu/cdw-awsredshift
Building a cloud data warehouse with AWS Redshift.
aws-ec2 aws-redshift cloud-data-warehouse python
Last synced: 22 Mar 2025
https://github.com/jibbs1703/tickit-data-pipeline
This repository demonstrates the creation of a robust data pipeline using an Orchestrator, on-prem and cloud resources. It collects data from on-premises SQL and NoSQL databases and loads it into a SQL database in the cloud.
aws-glue aws-glue-crawler aws-glue-data-catalog aws-redshift aws-s3 boto3 data-lake database etl-pipeline medallion-architecture mongodb precommit-hooks
Last synced: 17 Sep 2025