Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/turbot/steampipe-plugin-github
Use SQL to instantly query repositories, users, gists and more from GitHub. Open source CLI. No DB required.
backup etl github github-cli github-client hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-azure
Use SQL to instantly query Azure resources across regions and subscriptions. Open source CLI. No DB required.
azure azure-cli azure-client azure-devops backup etl hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-shopify
Use SQL to instantly query Shopify products, orders and more. Open source CLI. No DB required.
backup etl hacktoberfest postgresql postgresql-fdw shopify shopify- shopify-orders shopify-partners shopify-products sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-sdk
Steampipe Plugin SDK is a simple abstraction layer to write a Steampipe plugin. Plugins automatically work across all engine types including the Steampipe CLI, Postgres FDW, SQLite extension and the export CLI.
etl hacktoberfest postgresql postgresql-fdw sql sqlite sqlite-extension steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-openai
Use SQL to instantly query OpenAI for completions, models & more. Open source CLI. No DB required.
backup etl golang gpt-3 hacktoberfest openai postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-gcp
Use SQL to instantly query GCP resources across regions, projects and organizations. Open source CLI. No DB required.
backup etl gcloud gcloud-cli gcp hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-csv
Use SQL to instantly query data from CSV files. Open source CLI. No DB required.
backup csv etl hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 03 Jul 2024
https://github.com/turbot/steampipe-plugin-finance
Use SQL to instantly query financial data including quotes (equities, cryptocurrency, etc) and US public company information. Open source CLI. No DB required.
backup cryptocurrency edgar edgar-scraper etl finance hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin stock-market yahoo-finance yahoo-finance-api zero-etl
Last synced: 03 Jul 2024
https://github.com/flyanakin/CountMoney
A simple, low-cost orchestration tool for finance data pipelines. All you need is Python and SQL.
airtable-api dagster dbt etl finance modern-data-stack orchestration postgresql python sql stock tushare workflow
Last synced: 02 Jul 2024
https://github.com/quintoandar/butterfree
A tool for building feature stores.
data-engineering data-science etl etl-framework feature-store package pyspark python
Last synced: 29 Jun 2024
https://github.com/apache/flink-cdc
Flink CDC is a streaming data integration tool
batch cdc change-data-capture data-integration data-pipeline distributed elt etl flink kafka mysql paimon postgresql real-time schema-evolution
Last synced: 26 Jun 2024
https://github.com/rwynn/monstache
A Go daemon that syncs MongoDB to Elasticsearch in real time. You know, for search.
change-streams connector daemon elasticsearch etl go golang mongodb opensearch oplog realtime river sync synchronization tail
Last synced: 25 Jun 2024
https://github.com/zinggAI/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
analytics analytics-engineering data-science data-transformation data-transformations dataengineering datalake dataquality dedupe deduplication entity-resolution etl fuzzy-matching fuzzymatch identity identity-resolution masterdata ml modern-data-stack spark
Last synced: 22 Jun 2024
https://github.com/webysther/aws-glue-docker
🐋 Docker image for AWS Glue Spark/Python
apache-arrow aws aws-cli aws-glue aws-glue-docker cdk data-engineering development docker docker-image dockerfile etl glue-catalog glue-pyspark pandas pytest python python-poetry sam spark
Last synced: 21 Jun 2024
https://github.com/EvilLord666/ReportGenerator
A small cross-database tool for building Excel documents (reports) from database data extracted via views or stored procedures, with parameters, ordering, etc.
cross-database database database-reporting di-service etl etl-automation excel excel-export excel-to-sql generator reportgenerator reporting-engine reporting-tool reports smart-reporting sql-to-excel statement stored-procedures
Last synced: 21 Jun 2024
https://github.com/DataCater/datacater
The developer-friendly ETL platform for transforming data in real-time. Based on Apache Kafka® and Kubernetes®.
apache-kafka cloud-native data-pipelines etl kafka kubernetes python
Last synced: 21 Jun 2024
https://github.com/dswarm/dswarm
an open-source data management platform for knowledge workers (https://github.com/dswarm/dswarm-documentation/wiki)
csv datamanagement datamapper dswarm etl json mapping metadata schema-mapping xml
Last synced: 20 Jun 2024
https://github.com/Marklogic-retired/marklogic-data-hub
The MarkLogic Data Hub: documentation ==>
database database-management datahub dhf etl framework gradle java javascript json marklogic marklogic-data-hub spring-batch triplestore xml xquery
Last synced: 17 Jun 2024
https://github.com/singer-io/getting-started
This repository is a getting started guide to Singer.
data-analysis etl etl-framework python singer
Last synced: 17 Jun 2024
https://github.com/datacleaner/DataCleaner
The premier open source Data Quality solution
data data-analysis data-science database datacleaner dataquality desktop etl mdm profiling
Last synced: 17 Jun 2024
https://github.com/quadratichq/quadratic
Quadratic | Data Science Spreadsheet with Python & SQL
data data-analysis data-engineering data-science etl python quadratic spreadsheet sql wasm webgl
Last synced: 17 Jun 2024
https://github.com/datacoon/awesome-dataops
Awesome list of dataops products, open source and resources
cloud data data-engineering dataops etl workflow-engine
Last synced: 17 Jun 2024
https://github.com/dataplane-app/dataplane
Dataplane is an Airflow inspired unified data platform with additional data mesh and RPA capability to automate, schedule and design data pipelines and workflows. Dataplane is written in Golang with a React front end.
airflow data data-analysis data-engineering data-integration data-pipelines data-science dataplane datawarehouse etl finance golang kubernetes pipelines robotics-process-automation rpa scheduler workflow workflow-automation workflows
Last synced: 17 Jun 2024
https://github.com/appbaseio/abc
Power of appbase.io via CLI, with nifty imports from your favorite data sources
Last synced: 16 Jun 2024
https://github.com/camposvinicius/aws-etl
An ETL application on AWS that uses open sales and customer data, available here: https://github.com/camposvinicius/data/blob/main/AdventureWorks.zip. The data is a zipped file containing several CSV files, to which transformations are applied.
airflow argocd athena aws catalog data data-engineer database emr emr-cluster etl glue kubernetes pipeline postgres pyspark rds spark
Last synced: 16 Jun 2024
https://github.com/deepeth/mars
The powerful analysis platform to explore and visualize data from blockchain.
bitcoin blockchain ethereum etl rust schema web3
Last synced: 16 Jun 2024
https://github.com/wgzhao/Addax
Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration.
clickhouse data-integrity database datax etl excel hadoop hdfs hive impala influxdb kudu mysql oracle postgresql sqlserver trino
Last synced: 16 Jun 2024
https://github.com/thenaturalist/awesome-business-intelligence
Actively curated list of awesome BI tools. PRs welcome!
awesome-list business-intelligence data-analysis data-science data-visualization database etl sql
Last synced: 16 Jun 2024
https://github.com/elastic/eland
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting
Last synced: 16 Jun 2024
https://github.com/instill-ai/instill-core
🔮 Instill Core is a full-stack AI infrastructure tool for data, model and pipeline orchestration, designed to streamline every aspect of building versatile AI-first applications
ai api cli developer-tools etl generative-ai golang gpt hacktoberfest llm low-code no-code open-source pipeline python stable-diffusion typescript unstructured-data
Last synced: 14 Jun 2024
https://github.com/fedspendingtransparency/usaspending-api
Server application to serve U.S. federal spending data via a RESTful API
api api-blueprint black database database-setup django django-rest-framework docker dredd elasticsearch etl federal-spending-data local-database markdown postgres-database postgresql pytest python3 restful-api
Last synced: 13 Jun 2024
https://github.com/apache/hop
Hop Orchestration Platform
apache data-integration etl hop java orchestration pipeline streaming workflow
Last synced: 12 Jun 2024
https://github.com/microsoft/etl2pcapng
Utility that converts an .etl file containing a Windows network packet capture into .pcapng format.
Last synced: 11 Jun 2024
https://github.com/toluaina/pgsync
Postgres to Elasticsearch/OpenSearch sync
change-data-capture elasticsearch elasticsearch-sync etl kibana opensearch postgresql python sql
Last synced: 11 Jun 2024
https://github.com/blockchain-etl/polygon-etl
ETL (extract, transform and load) tools for ingesting Polygon blockchain data to Google BigQuery and Pub/Sub
airflow bigquery cryptocurrency data-engineering etl gcp matic-network maticnetwork polygon
Last synced: 11 Jun 2024
https://github.com/blockchain-etl/bitcoin-etl
ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ
apache-beam bitcoin bitcoincash blockchain-analytics crypto cryptocurrency dash data-analytics data-engineering dogecoin etl gcp google-dataflow google-pubsub litecoin on-chain-analysis web3 zcash
Last synced: 11 Jun 2024
https://github.com/flow-php/etl
PHP - ETL (Extract Transform Load) data processing library
data-engineering data-processing etl flow-php
Last synced: 11 Jun 2024
https://github.com/flow-php/flow
Flow PHP - strongly typed data processing framework
etl etl-framework etl-pipeline
Last synced: 11 Jun 2024
https://github.com/turbot/steampipe-sqlite
Steampipe SQLite is a zero-ETL engine for SQLite. Virtual tables translate queries into live API calls for cloud services and APIs. Hundreds of plugins with thousands of documented examples.
aws azure data devsecops etl gcp golang kubernetes security sql sqlite steampipe steampipe-engine zero-etl
Last synced: 10 Jun 2024
https://github.com/m-lab/etl-schema
All schema and views related to the etl pipeline and public bigquery tables.
Last synced: 10 Jun 2024
https://github.com/tmusabbir/glue-utils
A few AWS Glue utility scripts
amazon-web-services aws emr etl glue lakeformation
Last synced: 10 Jun 2024
https://github.com/vh-d/Rflow
Rflow is a general-purpose workflow management framework for R
data-processing database dataflow etl etl-framework r reproducibility rlang rstats rstats-package workflow-management
Last synced: 10 Jun 2024
https://github.com/vh-d/RETL
R package for ETL
etl etl-framework transformations
Last synced: 10 Jun 2024
https://github.com/LukasLoeffler/data-graph
Flow and event based data processing
data-processing etl etl-pipeline flow-based-programming graph graphical-user-interface low-code no-code
Last synced: 09 Jun 2024
https://github.com/hofstadter-io/cuetils
CLI and library for diff, patch, and ETL operations on CUE, JSON, and Yaml
configuration cue cuelang diff etl golang jq json structural-diff yaml
Last synced: 09 Jun 2024
https://github.com/jupyter-naas/naas
Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
ai binder data data-science data-transformation engine etl integration jupyter jupyterlab notebooks open-source pipeline
Last synced: 08 Jun 2024
https://github.com/NeumTry/NeumAI
Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale.
ai chatgpt data data-engineering database embeddings etl llm llmops mlops ops pipeline python rag retrieval vector-database vectors
Last synced: 08 Jun 2024
https://github.com/halestudio/hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
data-harmonisation database eclipse-rcp etl etl-framework geospatial-data gml groovy hale hale-studio humboldt-alignment-editor inspire java scala transformation xml
Last synced: 08 Jun 2024
https://github.com/ananas-analytics/ananas-desktop
A hackable data integration & analysis tool to enable non technical users to edit data processing jobs and visualise data on demand.
analytics business-intelligence data-modeling etl hackable-data visualization
Last synced: 07 Jun 2024
https://github.com/twineworks/ruby-for-pentaho-kettle
Ruby scripting for pentaho-kettle
etl java jurby kettle pdi pentaho-kettle ruby
Last synced: 07 Jun 2024
https://github.com/zhaoyachao/zdh_web
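Big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, with modules for data collection, scheduling, permissions, approval workflows, private-domain marketing, and more.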
bigdata collection data data-collection datapipeline datax-web etl pipline scheduler spark sparketl
Last synced: 07 Jun 2024
https://github.com/ICIJ/extract
A cross-platform command line tool for parallelised content extraction and analysis.
ediscovery etl index solr tika
Last synced: 07 Jun 2024
https://github.com/beneath-hq/beneath
Beneath is a serverless real-time data platform ⚡️
analytics beneath data-engineering data-pipelines data-science data-warehouse dataops developer-tools etl go kubernetes mlops python sql streaming
Last synced: 07 Jun 2024
https://github.com/PeerDB-io/peerdb
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
bigquery cdc clickhouse cloud-native distributed-systems etl eventhubs kafka postgres postgresql realtime rust s3 snowflake sql stream-processing
Last synced: 07 Jun 2024
https://github.com/turbot/steampipe-plugin-code
Use SQL to instantly query secrets and more from source code. Open source CLI. No DB required.
backup code-scanner etl hacktoberfest postgresql postgresql-fdw secrets-detection sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 06 Jun 2024
https://github.com/grailbio/bigslice
A serverless cluster computing system for the Go programming language
bigdata cluster computing etl go golang machinelearning mapreduce
Last synced: 05 Jun 2024
https://github.com/Multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack. Leading Reverse ETL and Customer Data Platform (CDP) for Data Teams.
bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript
Last synced: 05 Jun 2024
https://github.com/sdcastillo/ExamPAData
A container for data sets to help actuaries who are practicing predictive analytics
content-marketing cran education etl etl-pipeline
Last synced: 04 Jun 2024
https://github.com/BetweenTwoTests/between_dbs
DDL & test data for different databases for ETL data quality checks / data loading tests
Last synced: 04 Jun 2024
https://github.com/hackersandslackers/bigquery-sqlalchemy-tutorial
📊 ➡️ 💾 ETL script to migrate data from BigQuery to SQL.
bigquery bigquery-sqlalchemy-tutorial databases etl mysql postgres python sql sqlalchemy tutorial
Last synced: 03 Jun 2024
https://github.com/seanharr11/etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
database etl etl-framework migrations python sqlalchemy
Last synced: 02 Jun 2024
https://github.com/compose/transporter
Sync data between persistence engines, like ETL only not stodgy
elasticsearch etl go mongodb mysql postgresql rabbitmq rethinkdb
Last synced: 02 Jun 2024
https://github.com/opencultureconsulting/openrefine-batch
Shell script to run OpenRefine in batch mode (import, transform, export). It orchestrates OpenRefine (server) and a python client that communicates with the OpenRefine API.
bash-script batch-processing code4lib docker etl openrefine
Last synced: 01 Jun 2024
https://github.com/raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows
Last synced: 01 Jun 2024
https://github.com/2ndQuadrant/pglogical
Logical Replication extension for PostgreSQL 15, 14, 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
cdc data-transformation data-transport database-replication etl logical-decoding postgresql publish-subscribe replication subscription zero-downtime
Last synced: 01 Jun 2024
https://github.com/dataform-co/dataform
Dataform is a framework for managing SQL based data operations in BigQuery
analytics business-intelligence data-engineering data-pipelines elt etl hacktoberfest
Last synced: 01 Jun 2024
https://github.com/apache/incubator-devlake
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl golang hacktoberfest integration jira open-source user-friendly
Last synced: 31 May 2024
https://github.com/jbogard/bulk-writer
Provides guidance for fast ETL jobs, an IDataReader implementation for SqlBulkCopy (or the MySql or Oracle equivalents) that wraps an IEnumerable, and libraries for mapping entities to table columns.
bulk-writer etl etl-job pipeline pipeline-stage sql sqlbulkcopy stream-data
Last synced: 31 May 2024
https://github.com/ariacom/Seal-Report
Database Reporting Tool and Tasks (.Net)
business-intelligence chart dashboards etl excel html-report linq mongodb mysql pdf report-generator reporting-engine sqlserver task-scheduler tasks
Last synced: 31 May 2024
https://github.com/iftech-engineering/mongo-es
A MongoDB to Elasticsearch connector
connector elasticsearch etl mongodb
Last synced: 30 May 2024
https://github.com/xyflow/awesome-node-based-uis
A curated list with resources about node-based UIs
awesome-list etl node-based-ui visual-programming workflow-editor
Last synced: 30 May 2024
https://github.com/redpanda-data/connect
Fancy stream processing made operationally mundane
amqp cqrs data-engineering data-ops etl event-sourcing go golang kafka logs message-bus message-queue nats rabbitmq stream-processing stream-processor streaming-data
Last synced: 30 May 2024
https://github.com/data-engineering-community/data-engineering-wiki
The best place to learn data engineering. Built and maintained by the data engineering community.
data data-engineer data-engineering data-modeling data-pipelines database etl sql
Last synced: 29 May 2024
https://github.com/nucleuscloud/neosync
Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
benthos docker etl faker fine-tuning golang kubernetes nextjs open-source orchestration protobuf react reactjs self-hosted synthetic-data synthetic-data-generation test-data-generator testing typescript
Last synced: 28 May 2024
https://github.com/zsvoboda/dbd
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
bigquery csv database database-schemas elt etl excel json mysql parquet postgresql python python3 redshift snowflake sql sqlite xls xlsx
Last synced: 27 May 2024
https://github.com/miztiik/s3-to-rds-with-glue
Extract, transform, and load data for analytic processing using AWS Glue
cdk cloud-development-kit etl glue glue-catalog glue-job miztiik-automation s3-to-rds spark
Last synced: 27 May 2024
https://github.com/moj-analytical-services/etl_manager
A python package to create a database on the platform using our moj data warehousing framework
Last synced: 27 May 2024
https://github.com/datamill-co/target-postgres
A Singer.io Target for Postgres
etl json-schema postgres singer stream
Last synced: 27 May 2024
https://github.com/shipyardapp/postgresql-blueprints
Simplified blueprints for building data pipelines with PostgreSQL.
cli data-analysis data-engineering data-pipeline data-science database elt etl postgres postgresql
Last synced: 27 May 2024
https://github.com/shipyardapp/amazonathena-blueprints
Simplified blueprints for building data pipelines with Amazon Athena.
amazon-athena athena cli data-analysis data-engineering data-science elt etl
Last synced: 27 May 2024
https://github.com/turbot/steampipe-plugin-aws
Use SQL to instantly query AWS resources across regions and accounts. Open source CLI. No DB required.
aws aws-cli backup etl hacktoberfest postgresql postgresql-fdw sql sqlite steampipe steampipe-plugin zero-etl
Last synced: 27 May 2024
https://github.com/thbar/kiba
Data processing & ETL framework for Ruby
data etl etl-ruby ruby rubydatascience
Last synced: 26 May 2024
https://github.com/wx-chevalier/sentinel-crawler
Xenomorph Crawler, a concise, declarative, and observable distributed crawler (Node / Go / Java / Rust) for Web, RDB, and OS; it can also act as a monitor (with Prometheus) or as ETL for infrastructure 💫 Multi-language executor, distributed crawler.
crawler etl koa2 monitor nodejs react wx-code
Last synced: 26 May 2024
https://github.com/opencultureconsulting/openrefine-client
The OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the command line interface (CLI) and is distributed as a convenient one-file-executable (Windows, Linux, Mac). It is also available via Docker Hub, PyPI and Binder.
binder code4lib docker etl openrefine pypi python
Last synced: 26 May 2024
https://github.com/albertovpd/automated_etl_google_cloud-social_dashboard
A dashboard is worth a thousand words => https://datastudio.google.com/reporting/755f3183-dd44-4073-804e-9f7d3d993315
bigquery-table cloud-functions cloud-scheduler cloud-storage dashboard data-studio dataprep etl etl-jobs etl-pipeline gdelt google-cloud google-cloud-platform google-trends python sql twitter-api
Last synced: 26 May 2024
https://github.com/onepanelio/onepanel
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.
ai aiops annotation computer-vision deeplearning etl hyperparameter-tuning inference jupyterlab labeling machinelearning mlops pipelines pytorch tensorboard tensorflow training workflows
Last synced: 19 May 2024
https://github.com/mara/mara-pipelines
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
data data-integration etl pipeline postgresql python
Last synced: 18 May 2024
https://github.com/dalenewman/Transformalize
Configurable Extract, Transform, and Load
data-warehouse denormalize elasticsearch etl etl-framework excel files mysql postgresql solr sql-server sqlce sqlite ssas
Last synced: 17 May 2024
https://github.com/TobikoData/sqlmesh
Efficient data transformation and modeling framework that is backwards compatible with dbt.
dataengineering dataops dbt elt etl python sql transformation
Last synced: 16 May 2024
https://github.com/orchest/orchest
Build data pipelines, the easy way 🛠️
airflow cloud dag data-pipelines data-science deployment docker etl etl-pipeline ide jupyter jupyterlab kubernetes machine-learning notebooks orchest pipelines python self-hosted
Last synced: 16 May 2024
https://github.com/brexhq/substation
Substation is a cloud-native, event-driven data pipeline toolkit built for security teams.
aws data-engineering data-processing etl go security serverless
Last synced: 16 May 2024
https://github.com/dagster-io/dagster
An orchestration platform for the development, production, and observation of data assets.
analytics dagster data-engineering data-integration data-orchestrator data-pipelines data-science etl metadata mlops orchestration python scheduler workflow workflow-automation
Last synced: 16 May 2024
https://github.com/rudderlabs/rudder-server
Privacy and Security focused Segment-alternative, in Golang and React
bigquery customer-data customer-data-lake customer-data-pipeline customer-data-platform data-integration data-pipeline data-synchronization data-warehouse etl golang hybrid-cloud privacy redshift rudderstack security segment-alternative snowflake warehouse-first warehouse-management
Last synced: 16 May 2024
https://github.com/kestra-io/kestra
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
data data-engineering data-integration data-orchestration data-orchestrator data-pipeline data-quality elt etl low-code orchestration pipeline reverse-etl scheduler workflow workflow-engine
Last synced: 14 May 2024
https://github.com/adilkhash/luigi-telegram
Luigi Tasks status notifications to Telegram
data-pipeline data-processing etl luigi notification-plugin
Last synced: 13 May 2024
https://github.com/patterns-app/patterns-devkit
Data pipelines from re-usable components
data-analysis data-engineering data-pipeline data-pipelines data-science etl etl-framework etl-pipeline etl-pipelines functional-reactive-programming immutability pipelines sql
Last synced: 13 May 2024
https://github.com/swirrl/table2qb
A generic pipeline for converting tabular data into rdf data cubes
clojure csv csvw datacube etl linked-data qb rdf
Last synced: 13 May 2024
https://github.com/linkedpipes/etl
LinkedPipes ETL is an RDF based, lightweight ETL tool
etl linked-data linkedpipes rdf
Last synced: 13 May 2024