Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/quintoandar/butterfree
A tool for building feature stores.
data-engineering data-science etl etl-framework feature-store package pyspark python
Last synced: 29 Jun 2024
![](https://github.com/quintoandar.png)
https://github.com/noflo/noflo
Flow-based programming for JavaScript
etl-framework fbp flow-based-programming nocode noflo
Last synced: 25 Jun 2024
![](https://github.com/noflo.png)
https://github.com/singer-io/getting-started
This repository is a getting started guide to Singer.
data-analysis etl etl-framework python singer
Last synced: 17 Jun 2024
![](https://github.com/singer-io.png)
https://github.com/san089/goodreads_etl_pipeline
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
airflow airflow-dag apache-airflow apache-spark data-engineering data-engineering-pipeline data-lake data-migration emr-cluster etl-framework etl-job etl-pipeline goodreads-data-pipeline livy python redshift s3 scheduler spark warehouse
Last synced: 13 Jun 2024
![](https://github.com/san089.png)
https://github.com/flow-php/flow
Flow PHP - strongly typed data processing framework
etl etl-framework etl-pipeline
Last synced: 11 Jun 2024
![](https://github.com/flow-php.png)
https://github.com/vh-d/Rflow
Rflow is a general-purpose workflow management framework for R
data-processing database dataflow etl etl-framework r reproducibility rlang rstats rstats-package workflow-management
Last synced: 10 Jun 2024
![](https://github.com/vh-d.png)
https://github.com/vh-d/RETL
R package for ETL
etl etl-framework transformations
Last synced: 10 Jun 2024
![](https://github.com/vh-d.png)
https://github.com/halestudio/hale
(Spatial) data harmonisation with hale studio (formerly HUMBOLDT Alignment Editor)
data-harmonisation database eclipse-rcp etl etl-framework geospatial-data gml groovy hale hale-studio humboldt-alignment-editor inspire java scala transformation xml
Last synced: 08 Jun 2024
![](https://github.com/halestudio.png)
https://github.com/restarone/violet_rails
An app engine for your business. Seamlessly implement business logic with a powerful API. Out of the box CMS, blog, forum and email functionality. Developer friendly & easily extendable for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq & PostgreSQL
blog cms ember emberjs etl-automation etl-framework etl-pipeline forum multi-tenancy multitenancy rails ruby ruby-on-rails rubyonrails saas saas-boilerplate template violet-rails wordpress-replacement xaas
Last synced: 02 Jun 2024
![](https://github.com/restarone.png)
https://github.com/seanharr11/etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
database etl etl-framework migrations python sqlalchemy
Last synced: 02 Jun 2024
![](https://github.com/seanharr11.png)
https://github.com/data-dot-all/dataall
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
aws aws-glue aws-lake-formation aws-s3 data data-science etl-framework lakeformation lakehouse redshift
Last synced: 27 May 2024
![](https://github.com/data-dot-all.png)
https://github.com/DrSnowbird/openrefine
OpenRefine Docker for Data ETL/ELT
big-data docker etl-framework openrefine
Last synced: 26 May 2024
![](https://github.com/DrSnowbird.png)
https://github.com/dalenewman/Transformalize
Configurable Extract, Transform, and Load
data-warehouse denormalize elasticsearch etl etl-framework excel files mysql postgresql solr sql-server sqlce sqlite ssas
Last synced: 17 May 2024
![](https://github.com/dalenewman.png)
https://github.com/patterns-app/patterns-devkit
Data pipelines from re-usable components
data-analysis data-engineering data-pipeline data-pipelines data-science etl etl-framework etl-pipeline etl-pipelines functional-reactive-programming immutability pipelines sql
Last synced: 13 May 2024
![](https://github.com/patterns-app.png)
https://github.com/cefriel/chimera
Composable Semantic Transformation Pipelines
apache-camel etl-framework mediator pipelines rdf semantic-web
Last synced: 13 May 2024
![](https://github.com/cefriel.png)
https://github.com/globalbioticinteractions/globalbioticinteractions
Global Biotic Interactions provides access to existing species interaction datasets
biodiversity bioinformatics biology diet diseases ecoinformatics ecology eol etl-framework food-webs globi parasites pollinators species-interactions
Last synced: 08 May 2024
![](https://github.com/globalbioticinteractions.png)
https://github.com/cloudquery/cloudquery
The open source high performance ELT framework powered by Apache Arrow
airbyte attack-surface-management aws azure bigquery cspm data data-analysis data-collection data-engineering data-integration elt etl etl-framework gcp github-api go google kubernetes sql
Last synced: 08 May 2024
![](https://github.com/cloudquery.png)
https://github.com/Cinchoo/ChoETL
ETL framework for .NET (Parser / Writer for CSV, Flat, XML, JSON, Key-Value, Parquet, YAML, Avro formatted files)
avro cinchoo-etl csharp csv dotnet etl etl-framework flat json keyvalue parquet parquet-files parser reader writer xml yaml
Last synced: 05 May 2024
![](https://github.com/Cinchoo.png)
https://github.com/usc-isi-i2/kgtk
Knowledge Graph Toolkit
embeddings etl-framework graphs kg knowledge-graphs rdf toolkit wikidata
Last synced: 04 May 2024
![](https://github.com/usc-isi-i2.png)
https://github.com/elastic/logstash
Logstash - transport and process your logs, events, or other data
etl-framework java jruby logging real-time-processing streaming
Last synced: 01 May 2024
![](https://github.com/elastic.png)
https://github.com/DAGWorks-Inc/hamilton
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering
Last synced: 28 Apr 2024
![](https://github.com/DAGWorks-Inc.png)
https://github.com/frankframework/frankframework
The Frank!Framework is an easy-to-use, stateless integration framework which allows (transactional) messages to be modified and exchanged between different systems.
community-driven data-flow elt-framework erp erp-framework etl-framework framework frank integration integration-framework integration-platform ipaas java low-code low-code-development-platform low-code-platform open-source self-hosted system-integration xml-configuration
Last synced: 26 Apr 2024
![](https://github.com/frankframework.png)
https://github.com/ceumicrodata/mETL
mito ETL tool
data-integration etl etl-framework pipeline python
Last synced: 22 Apr 2024
![](https://github.com/ceumicrodata.png)
https://github.com/stitchfix/hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
dag data-engineering data-platform data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hamilton hamiltonian machine-learning numpy pandas python software-engineering stitch-fix
Last synced: 20 Apr 2024
![](https://github.com/stitchfix.png)
https://github.com/marsupialtail/quokka
Making data lakes work for time series
data-lake-analytics distributed etl-framework mlops sql
Last synced: 16 Apr 2024
![](https://github.com/marsupialtail.png)
https://github.com/apache/seatunnel-web
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
apache data-integration data-pipeline etl-framework high-performance offline real-time seatunnel sql-engine
Last synced: 16 Apr 2024
![](https://github.com/apache.png)
https://github.com/MassStreetAnalytics/etl-framework
A framework for moving data into a data warehouse.
data-warehouse etl etl-components etl-framework etl-pipeline python sql sqlserver
Last synced: 01 Apr 2024
![](https://github.com/MassStreetAnalytics.png)
https://github.com/BitwiseInc/Hydrograph
A visual ETL development and debugging tool for big data
apache-spark big-data cascading etl etl-framework
Last synced: 01 Apr 2024
![](https://github.com/BitwiseInc.png)
https://github.com/aws-samples/amazon-redshift-serverless-rsql-etl-framework
Amazon Redshift Serverless RSQL ETL Framework
amazon-redshift aws aws-batch aws-step-functions etl-automation etl-framework
Last synced: 30 Mar 2024
![](https://github.com/aws-samples.png)
https://github.com/TriplyDB/Documentation
Documentation for the TriplyDB and TriplyETL products
etl-framework etl-pipeline graph-database linked-data production-systems semantic-web
Last synced: 21 Mar 2024
![](https://github.com/TriplyDB.png)