Projects in Awesome Lists tagged with aws-redshift

A curated list of projects in awesome lists tagged with aws-redshift.

https://github.com/tokern/piicatcher

Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.

aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake

Last synced: 12 Apr 2025

https://github.com/aws/amazon-redshift-python-driver

Redshift Python Connector. It supports Python Database API Specification v2.0.

amazon-redshift aws-redshift data-analysis data-science

Last synced: 14 May 2025

https://github.com/wittline/uber-expenses-tracking

The goal of this project is to track Uber Rides and Uber Eats expenses through data engineering processes using technologies such as Apache Airflow, AWS Redshift, and Power BI.

airflow-docker apache-airflow aws aws-redshift data-engineering data-modeling etl-pipeline expenses-dashboard expenses-tracker power-bi python uber uber-data uber-eats

Last synced: 13 Apr 2025

https://github.com/heroku-examples/analytics-with-kafka-redshift-metabase

An example system that captures a large stream of product usage data, or events, and provides both real-time data visualization and SQL-based data analytics.

aws-redshift data-analytics data-visualization heroku kafka metabase

Last synced: 30 Apr 2025

https://github.com/moritzkoerber/covid-19-data-engineering-pipeline

A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.

apache-airflow apache-spark api aws aws-cdk aws-cloudformation aws-ecr aws-glue aws-lambda aws-redshift aws-s3 docker great-expectations pyspark spark

Last synced: 28 Apr 2025

https://github.com/vsouza/spark-kinesis-redshift

Example project for consuming AWS Kinesis streams and saving the data to Amazon Redshift using Apache Spark.

aws aws-kinesis aws-kinesis-stream aws-redshift etl etl-pipeline python shell spark spark-streaming

Last synced: 24 Apr 2025

https://github.com/kishaningithub/rdapp

rdapp - Redshift Data API Postgres Proxy

aws aws-redshift go golang hacktoberfest redshift

Last synced: 14 Apr 2025

https://github.com/lenguyenthedat/aws-redshift-to-rds

A simple command-line tool to copy tables from Amazon Redshift to Amazon RDS (PostgreSQL).

amazon-rds amazon-redshift aws aws-redshift haskell rds

Last synced: 13 Apr 2025

https://github.com/mikecerton/the-retail-elt-pipeline-end-to-end

This project designs and implements an ETL pipeline using Apache Airflow (Docker Compose) to ingest, process, and store retail data. AWS S3 acts as the data lake, AWS Redshift as the data warehouse, and Looker Studio as the visualization layer. [Data Engineer]

apache-airflow aws-redshift aws-s3 data-engineer etl-pipeline looker-studio

Last synced: 02 Apr 2025

https://github.com/giulic3/data-engineering-nanodegree

Projects completed for the Data Engineering Nanodegree offered by Udacity: https://www.udacity.com/course/data-engineer-nanodegree--nd027

apache-airflow apache-cassandra apache-spark aws aws-emr aws-redshift aws-s3 data-engineering postgresql

Last synced: 15 Aug 2025

https://github.com/hanan-nawaz/flighttragedyanalysis

Flight Tragedy Analysis is a comprehensive data analysis project focused on examining aviation accidents and incidents from 1905 to 2009. This project provides users with valuable insights into historical plane crashes and their associated data.

airplane-crashes aws aws-redshift aws-s3 data-engineering etl kaggle postgresql power-bi psycopg2 python sql

Last synced: 08 Sep 2025

https://github.com/marcy-terui/catlass

Cloud Automation as Code with Cloud Automator

automation aws aws-ec2 aws-rds aws-redshift aws-s3 aws-sqs infrastructure-as-code

Last synced: 14 Mar 2025

https://github.com/andre-marcos-perez/ifood-arch-readme

The application is the documentation of my solution for the iFood data architect test.

aws aws-athena aws-cloudwatch aws-emr aws-lambda aws-rds aws-redshift aws-s3 aws-step-functions

Last synced: 08 Oct 2025

https://github.com/jibbs1703/awsresourcemanager

This repository contains the Python modules and packages that make up the AWS Resource Manager, a custom Python package/wheel designed to simplify the management of AWS services through custom-written use cases and utilities. This repository serves to reinforce my knowledge of building Python packages and wheels.

aws-credentials aws-dynamodb aws-ec2 aws-management aws-redshift aws-resources aws-s3

Last synced: 14 Mar 2025

https://github.com/vermicida/data-warehouse

Data Warehouse, the code for project #3 of Udacity's Data Engineer Nanodegree Program.

aws-redshift data-engineering data-warehouse etl-pipeline python

Last synced: 15 May 2025

https://github.com/bhawnamehbubani/kafka-spark-redshift-streaming-data-ingestion-project

This project is a real-time data pipeline designed for ingesting, processing, and storing telecom call records. It integrates Apache Kafka, Apache Spark Streaming, and AWS Redshift to handle large volumes of streaming data in near real-time. The pipeline is containerized with Docker Compose, enabling easy deployment, scalability, and modularity.

apache-kafka apache-spark aws-redshift docker spark-streaming

Last synced: 21 Mar 2025

https://github.com/maxinexiong/cloud-data-warehousing-with-aws-redshift

This project builds a cloud-based ETL pipeline for Sparkify to move data to a cloud data warehouse. It extracts song and user activity data from AWS S3, stages it in Redshift, and transforms it into a star-schema data model with fact and dimension tables, enabling efficient querying to answer business questions.

aws-boto3 aws-redshift aws-s3 cloud-data-warehouse data-engineering data-warehouse data-warehousing dimensional-model dimensional-modeling etl etl-pipeline extract-transform-load infrastructure-as-code postgresql postgresql-database redshift-cluster

Last synced: 16 Jul 2025

https://github.com/exasol/redshift-virtual-schema

Virtual Schema for connecting Redshift as a data source to Exasol

aws-redshift exasol exasol-integration redshift virtual-schema

Last synced: 03 Mar 2025

https://github.com/eljandoubi/aws-data-warehouse

Build an ETL pipeline for a database hosted on AWS Redshift.

aws-redshift etl-pipeline postgresql

Last synced: 10 Jul 2025

https://github.com/frankfarrell/redshift-js

TypeScript library for some Redshift-specific tasks.

aws-redshift redshift redshift-table redshift-unload

Last synced: 06 Apr 2025

https://github.com/santiagortiiz/platzi-aws-redshift

Platzi. School of Amazon Web Services. Redshift for Big Data management.

aws aws-redshift big-data platzi redshift

Last synced: 09 Sep 2025

https://github.com/gares95/data-warehouse_aws-redshift

Building an ETL pipeline for a database hosted on Redshift. Project based on Udacity's template.

aws-redshift data-warehouse redshift udacity-nanodegree

Last synced: 19 Feb 2025

https://github.com/jibbs1703/tickit-data-lake

This repository demonstrates the creation of a robust, 3-tier data lake using an Orchestrator and AWS resources. It collects data from an on-premises NoSQL database and loads it into a SQL database in AWS.

aws-glue aws-glue-crawler aws-glue-data-catalog aws-redshift aws-s3 boto3 data-lake database etl-pipeline medallion-architecture mongodb precommit-hooks

Last synced: 20 Mar 2025

https://github.com/jibbs1703/weather-gas-etl-pipeline

This repository contains an ETL pipeline for collecting, transforming, and storing hourly weather and atmospheric gas data. The pipeline leverages Docker containerization and AWS cloud infrastructure resources, and is orchestrated using Apache Airflow.

airflow aws-ec2 aws-redshift aws-s3 data-engineering docker docker-compose docker-image etl-automation etl-pipeline restful-api

Last synced: 20 Mar 2025

https://github.com/guledim/super-cafe-etl-aws

In this group project simulating a real-world setting, we built a scalable ETL pipeline to process daily CSV transactions into a centralized PostgreSQL database. We used Docker, Grafana for visualization, and later implemented AWS cloud services to deploy a scalable, cloud-based ETL system.

aws aws-ec2 aws-lambda aws-redshift aws-s3 docker etl-pipeline etl-pipeline-automation grafana group-project python sql

Last synced: 15 Jun 2025

https://github.com/mihirkudale/youtube-analysis-data-engineering-project

This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.

aws aws-athena aws-glue aws-lambda aws-quicksight aws-redshift aws-s3 data-engineering python youtube-analysis

Last synced: 20 Mar 2025

https://github.com/najuzilu/cdw-awsredshift

Building a cloud data warehouse with AWS Redshift.

aws-ec2 aws-redshift cloud-data-warehouse python

Last synced: 22 Mar 2025

https://github.com/jibbs1703/tickit-data-pipeline

This repository demonstrates the creation of a robust data pipeline using an Orchestrator, on-prem and cloud resources. It collects data from on-premises SQL and NoSQL databases and loads it into a SQL database in the cloud.

aws-glue aws-glue-crawler aws-glue-data-catalog aws-redshift aws-s3 boto3 data-lake database etl-pipeline medallion-architecture mongodb precommit-hooks

Last synced: 17 Sep 2025