An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with glue-catalog

A curated list of projects in awesome lists tagged with glue-catalog.

https://github.com/aws/aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

amazon-athena amazon-sagemaker-notebook apache-arrow apache-parquet athena aws aws-glue aws-lambda data-engineering data-science emr etl glue-catalog lambda modin mysql pandas python ray redshift

Last synced: 22 Apr 2025

https://github.com/dbt-labs/dbt-athena

The athena adapter plugin for dbt (https://getdbt.com)

athena dbt dbt-athena dbt-athena-community glue-catalog iceberg s3

Last synced: 08 Apr 2025

https://github.com/bbenzikry/spark-eks

Examples and custom spark images for working with the spark-on-k8s operator on AWS

aws docker dockerfile eks eks-cluster glue-catalog kubernetes kubernetes-operator metastore spark

Last synced: 19 Mar 2025

https://github.com/miztiik/s3-to-rds-with-glue

Extract, transform, and load data for analytic processing using AWS Glue

cdk cloud-development-kit etl glue glue-catalog glue-job miztiik-automation s3-to-rds spark

Last synced: 04 Dec 2024

https://github.com/kyopark2014/case-study-wait-for-callback

This is a case study showing how to deploy "Wait-for-Callback" using Step Functions

event-bridge glue-catalog lambda-functions step-functions

Last synced: 12 Apr 2025

https://github.com/gakas14/aws-serverless-data-lake

This workshop builds a serverless data lake architecture using Amazon Kinesis Firehose for streaming data ingestion, AWS Glue for data integration (ETL, catalog management), Amazon S3 for data lake storage, and Amazon Athena for SQL big data analytics.

athena aws data-lake etl glue-catalog glue-etl kinesis-firehose kinesis-stream s3 sql

Last synced: 20 Feb 2025

https://github.com/infraspecdev/terraform-aws-athena

This Terraform module automates the setup of AWS Athena to query ALB access and connection logs stored in an S3 bucket.

athena glue-catalog terrform-module

Last synced: 20 Feb 2025

https://github.com/bhawnamehbubani/process-and-ingest-only-quality-movies-in-redshift-dara-warehouse

This repository contains a production-grade ETL (Extract, Transform, Load) pipeline built with AWS Glue and Amazon Redshift. The pipeline processes a raw IMDb movie dataset stored in Amazon S3, applies data quality validation, dynamically routes data based on validation results, and loads it into Amazon Redshift for advanced analytic

crawlers eventbridge glue-catalog glue-low-code-etl redshift s3-bucket sns

Last synced: 30 Mar 2025