Projects in Awesome Lists tagged with modern-data-stack
A curated list of projects in awesome lists tagged with modern-data-stack.
https://github.com/zinggai/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 10 Apr 2025
https://github.com/valmi-io/valmi-activation
⚡ valmi.io reverse ETL (data activation) is the open source (OSS) data activation platform for loading data from warehouses into webhooks and SaaS tools such as Klaviyo, Facebook Ads, Salesforce, and Braze. The Valmi.io Customer Data Platform (CDP) helps track and ingest user activity events from websites, Shopify, and server-side sources. https://cloud.valmi.io
airbyte cdp composable-cdp dagster dbt duckdb ecommerce email-marketing etl event-ingestion event-tracking marketing-automation modern-data-stack open-source push-notifications reverse-etl shopify shopify-app shopify-events user-behavior
Last synced: 24 Jan 2025
https://github.com/anna-geller/prefect-streaming
Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0
automation aws data data-engineering data-pipeline data-science data-warehouse dataflow dataflow-ops ecs-fargate engineering event-driven mlops modern-data-stack orchestration prefect python real-time serverless streaming
Last synced: 24 Mar 2025
https://github.com/porte-bleue/mongo-to-postgres-etl
Set up a Cost-Effective Modern Data Stack for a Charity
analytics-platform analytics-stack charity data-for-good dbt modern-data-stack mongodb mongodb-database postgresql-database prefect preset python
Last synced: 09 Feb 2025
https://github.com/esadek/mini-mds
Lightweight, open source, locally-hosted Modern Data Stack
dash dbt dlt duckdb modern-data-stack pandera prefect
Last synced: 15 Apr 2025
https://github.com/flyanakin/CountMoney
A simple, low-cost orchestration setup for a finance data pipeline. All you need is Python and SQL.
airtable-api dagster dbt etl finance modern-data-stack orchestration postgresql python sql stock tushare workflow
Last synced: 17 Nov 2024
https://github.com/guaradata/etl-mds-marketing
Connect Facebook Ads data to your Postgres database with Airbyte and Mage.
airbyte docker docker-compose mage-ia mds modern-data-stack nginx nginx-proxy postgresql python
Last synced: 10 Apr 2025
https://github.com/shey/airbyte-oauth2-proxy-nginx
Ansible playbook to set up a host to serve Airbyte (via docker-compose) behind an oauth2-proxy
airbyte ansible certbot docker-compose modern-data-stack nginx oauth2-proxy ubuntu
Last synced: 05 Apr 2025
https://github.com/guaradata/mds-lab
This project is a hands-on lab that implements a Modern Data Stack (MDS) using Docker containers, designed for learning and experimenting with open-source tools such as MinIO (S3 storage), PostgreSQL, Apache Hive, Spark, Kyuubi, JupyterLab, and Dremio.
dbeaver delta-lake docker docker-compose dremio hive-metastore jupyter-notebook kyuubi minio modern-data-stack spark
Last synced: 10 Apr 2025