Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/bruin-data/ingestr

ingestr is a CLI tool that seamlessly copies data between any databases with a single command.

bigquery copy-database data-ingestion data-integration data-pipeline duckdb ingestion-pipeline mssql postgresql snowflake

Last synced: 15 Jun 2024

https://github.com/apache/seatunnel

SeaTunnel is a next-generation, high-performance, distributed tool for massive data integration.

apache batch cdc change-data-capture data-ingestion data-integration elt high-performance offline real-time streaming

Last synced: 27 May 2024

https://github.com/dashbitco/broadway

Concurrent and multi-stage data ingestion and data processing with Elixir

broadway concurrent data-ingestion data-processing elixir genstage

Last synced: 01 May 2024

https://github.com/merantix-momentum/squirrel-core

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way 🌰

ai cloud-computing collaboration computer-vision cv data-ingestion data-mesh data-science dataops datasets deep-learning distributed jax machine-learning ml natural-language-processing nlp python pytorch tensorflow

Last synced: 16 Apr 2024

https://github.com/pravega/pravega

Pravega - Streaming as a new software-defined storage primitive

data-ingestion distributed-storage real-time-data streaming streaming-data

Last synced: 31 Mar 2024