Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/flow-php/flow

Flow PHP - strongly typed data processing framework

etl etl-framework etl-pipeline

Last synced: 11 Jun 2024

https://github.com/sdcastillo/ExamPAData

A container for data sets to help actuaries who are practicing predictive analytics

content-marketing cran education etl etl-pipeline

Last synced: 04 Jun 2024

https://github.com/restarone/violet_rails

An app engine for your business. Seamlessly implement business logic with a powerful API. Out-of-the-box CMS, blog, forum and email functionality. Developer-friendly and easily extensible for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq and PostgreSQL.

blog cms ember emberjs etl-automation etl-framework etl-pipeline forum multi-tenancy multitenancy rails ruby ruby-on-rails rubyonrails saas saas-boilerplate template violet-rails wordpress-replacement xaas

Last synced: 02 Jun 2024

https://github.com/apache/incubator-streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.

apache development-framework easy-to-use etl-pipeline operation-platform streaming streampark

Last synced: 31 May 2024

https://github.com/NitinSPatil15/Project-3-Data-Warehouse-with-AWS

An ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables

etl-pipeline python redshift-database s3-bucket sql

Last synced: 27 May 2024

https://github.com/AuFeld/Data_Engineering_Projects

A collection of data engineering projects: data modeling, ETL pipelines, data lakes, infrastructure configuration on AWS, data warehousing, containerization, and a dashboard to monitor data pipeline KPIs

airflow aws cassandra data-engineering data-lake data-warehouse docker emr etl-pipeline infrastructure-as-code infrastructure-setup postgresql python redshift s3 spark

Last synced: 27 May 2024

https://github.com/techascent/tech.ml.dataset

A high-performance data processing system for Clojure

clojure csv dataframe datascience dataset etl-pipeline java machine-learning xlsx

Last synced: 11 May 2024

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hacktoberfest lineage llmops machine-learning mlops numpy orchestration pandas python software-engineering

Last synced: 28 Apr 2024

https://github.com/stitchfix/hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

dag data-engineering data-platform data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hamilton hamiltonian machine-learning numpy pandas python software-engineering stitch-fix

Last synced: 20 Apr 2024

https://github.com/cyber-drop/ethereum_analytical_db

Ethereum Analytical Database - an Ethereum data access solution that can be used for analytics and application development. The solution runs on ClickHouse, a fast database.

api blockchain clickhouse dex erc20 erc223 erc721 eth ethereum ethereum-etl etl etl-pipeline

Last synced: 13 Apr 2024

https://github.com/Zipstack/unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

etl-pipeline llm-platform unstructured-data

Last synced: 11 Apr 2024

https://github.com/MassStreetAnalytics/etl-framework

A framework for moving data into a data warehouse.

data-warehouse etl etl-components etl-framework etl-pipeline python sql sqlserver

Last synced: 01 Apr 2024

https://github.com/michalmiki/postgresql-etl

Building a Python ETL pipeline for a PostgreSQL database

etl-pipeline postgresql python

Last synced: 01 Apr 2024

https://github.com/TriplyDB/Documentation

Documentation for the TriplyDB and TriplyETL products

etl-framework etl-pipeline graph-database linked-data production-systems semantic-web

Last synced: 21 Mar 2024