Projects in Awesome Lists tagged with modern-data-stack
A curated list of projects in awesome lists tagged with modern-data-stack.
https://github.com/zinggai/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 14 May 2025
https://github.com/meteroid-oss/meteroid
Open-source Pricing and Billing Infrastructure 🚀 Subscription management, Invoicing, Pricing, Usage-based billing, Cost limiting, Grandfathering, Experiments, Revenue analytics & Actionable insights
analytics api billing clickhouse invoicing metering modern-data-stack payments plg pricing revenue rust saas self-hosted stripe subscriptions typescript usage-based-billing
Last synced: 06 Feb 2026
https://github.com/valmi-io/valmi-activation
⚡ valmi.io reverse ETL (data activation) is the open source (OSS) data activation platform to load data from warehouses into webhooks and SaaS tools like Klaviyo, Facebook Ads, Salesforce, Braze, etc. Valmi.io Customer Data Platform (CDP) helps track and ingest user activity events from websites, Shopify, and server-side events. https://cloud.valmi.io
airbyte cdp composable-cdp dagster dbt duckdb ecommerce email-marketing etl event-ingestion event-tracking marketing-automation modern-data-stack open-source push-notifications reverse-etl shopify shopify-app shopify-events user-behavior
Last synced: 01 May 2025
https://github.com/anna-geller/prefect-streaming
Example project demonstrating deployment patterns for real-time streaming workflows with Prefect 2.0
automation aws data data-engineering data-pipeline data-science data-warehouse dataflow dataflow-ops ecs-fargate engineering event-driven mlops modern-data-stack orchestration prefect python real-time serverless streaming
Last synced: 24 Mar 2025
https://github.com/porte-bleue/mongo-to-postgres-etl
Set up a Cost-Effective Modern Data Stack for a Charity
analytics-platform analytics-stack charity data-for-good dbt modern-data-stack mongodb mongodb-database postgresql-database prefect preset python
Last synced: 24 Oct 2025
https://github.com/esadek/mini-mds
Lightweight, open source, locally-hosted Modern Data Stack
dash dbt dlt duckdb modern-data-stack pandera prefect
Last synced: 15 Apr 2025
https://github.com/flyanakin/CountMoney
A simple, low-cost finance data pipeline orchestration. All you need is Python and SQL.
airtable-api dagster dbt etl finance modern-data-stack orchestration postgresql python sql stock tushare workflow
Last synced: 11 May 2025
https://github.com/guaradata/etl-mds-marketing
Connect Facebook Ads data to your Postgres database with Airbyte and Mage.
airbyte docker docker-compose mage-ia mds modern-data-stack nginx nginx-proxy postgresql python
Last synced: 10 Apr 2025
https://github.com/shey/airbyte-oauth2-proxy-nginx
Ansible playbook to set up a host to serve Airbyte (via docker-compose) behind an oauth2-proxy
airbyte ansible certbot docker-compose modern-data-stack nginx oauth2-proxy ubuntu
Last synced: 05 Apr 2025
https://github.com/guaradata/.github
Special repository for the Guaradata page on GitHub.
data-engineering docker modern-data-stack spark
Last synced: 29 Jun 2025
https://github.com/antoniosilv-l/local-data-stack
Repository created to study and build a modern data architecture, following best practices.
devops devsecops modern-data-stack
Last synced: 04 Feb 2026
https://github.com/guaradata/spark-minio-delta-jupyter-dremio-lab
This project is a hands-on lab that implements a Modern Data Stack (MDS) using Docker containers, designed for learning and experimenting with open-source tools such as MinIO (S3 storage), PostgreSQL, Apache Hive, Spark, Kyuubi, JupyterLab, and Dremio.
dbeaver delta-lake docker docker-compose dremio hive-metastore jupyter-notebook kyuubi minio modern-data-stack spark
Last synced: 07 May 2025
https://github.com/guaradata/mds-lab
This project is a hands-on lab that implements a Modern Data Stack (MDS) using Docker containers, designed for learning and experimenting with open-source tools such as MinIO (S3 storage), PostgreSQL, Apache Hive, Spark, Kyuubi, JupyterLab, and Dremio.
dbeaver delta-lake docker docker-compose dremio hive-metastore jupyter-notebook kyuubi minio modern-data-stack spark
Last synced: 10 Apr 2025