An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-infrastructure

A curated list of projects in awesome lists tagged with data-infrastructure.

https://github.com/zalando/postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes

cluster data-infrastructure database-as-a-service golang kubernetes managed-services operator postgres postgres-operator postgresql

Last synced: 13 May 2025

https://github.com/StructuredLabs/preswald

Preswald is a WASM packager for Python-based interactive data apps: bundle complete, complex data workflows, particularly visualizations, into single files that run entirely in the browser using Pyodide, DuckDB, Pandas, Plotly, Matplotlib, etc. Build dashboards, reports, and notebooks that run offline, load fast, and share like a document.

ai analytics analytics-engineering copilot data data-applications data-infrastructure data-pipelines data-sdk data-visualization gpt llm open-source python schema-management vscode

Last synced: 11 May 2025

https://github.com/structuredlabs/preswald

Preswald is a framework for building and deploying interactive data apps, internal tools, and dashboards with Python. With one command, you can launch, share, and deploy locally or in the cloud, turning Python scripts into powerful shareable apps.

ai analytics analytics-engineering copilot data data-applications data-infrastructure data-pipelines data-sdk data-visualization gpt llm open-source python schema-management vscode

Last synced: 13 May 2025

https://github.com/zalando/spilo

Highly available elephant herd: HA PostgreSQL cluster using Docker

data-infrastructure docker docker-image high-availability patroni postgresql python

Last synced: 14 May 2025

https://github.com/zalando/nakadi

A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues

apis data-infrastructure event-bus java java-8 kafka microservices postgresql restful

Last synced: 04 Oct 2025

https://github.com/zalando/PGObserver

A battle-tested, flexible & comprehensive monitoring solution for your PostgreSQL databases

data-infrastructure monitoring

Last synced: 29 Jul 2025

https://github.com/uktrade/sqlite-s3vfs

Python writable virtual filesystem for SQLite on S3

data-infrastructure diapp

Last synced: 16 Apr 2025

https://github.com/uktrade/stream-zip

Python function to construct a ZIP archive on the fly

data-infrastructure diapp zip

Last synced: 04 Feb 2026

https://github.com/zalando-incubator/spark-json-schema

JSON schema parser for Apache Spark

data-infrastructure

Last synced: 14 Apr 2025

https://github.com/abhishek-ch/data-machinelearning-the-boring-way

Build & Learn Data Engineering, Machine Learning over Kubernetes. No Shortcut approach.

data-infrastructure dataengineering datascience kubernetes machine-learning mlops

Last synced: 21 Mar 2025

https://github.com/uktrade/fargatespawner

Spawns JupyterHub single user servers in Docker containers running in AWS Fargate

data-infrastructure

Last synced: 12 Apr 2025

https://github.com/uktrade/stream-sqlite

Python function to extract rows from a SQLite file while iterating over its bytes

data-infrastructure sqlite

Last synced: 13 Jul 2025

https://github.com/bizzabo/elasticsearch_to_bigquery_data_pipeline

A generic data pipeline that maps Elasticsearch documents to BigQuery table rows

data-infrastructure rnd

Last synced: 14 Apr 2025

https://github.com/alphagov/consent-api

Service for sharing user consent to cookies across multiple domains

cookie-consent data-infrastructure data-infrastructure-team data-services sde

Last synced: 08 May 2025

https://github.com/uktrade/stream-unzip

Python function to stream unzip all the files in a ZIP archive on the fly

data-infrastructure diapp zip

Last synced: 13 Jan 2026

https://github.com/alphagov/sde-prototype-govuk

A fake GOV.UK homepage and start pages for SDE prototype services

cpto data-infrastructure data-infrastructure-team data-services sde

Last synced: 25 Jan 2026

https://github.com/mjdevaccount/market-data-store

Production market data infrastructure: TimescaleDB + FastAPI control plane, async sinks, Python client. Handles OHLCV bars, fundamentals, news, options. Features: RLS isolation, backpressure management, Prometheus metrics, cross-repo testing. Built for scale.

async data-infrastructure fastapi financial-data market-data postgresql prometheus python time-series timescaledb

Last synced: 31 Oct 2025

https://github.com/uktrade/stream-read-xbrl

Python package to parse Companies House accounts data in a streaming way

data-infrastructure diapp

Last synced: 24 Feb 2026

https://github.com/apelullo/yelp_health_data_curation_ops

An AWS-based data pipeline to extract, process, store, and monitor Yelp "health-related" facility data in support of ongoing health system initiatives.

academic-research automation aws data-access data-curation data-infrastructure data-pipelines health-data operations operations-research python yelp-dataset

Last synced: 08 Apr 2026