Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/raystack/meteor

Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.

bigdata collector data-catalog data-management dataops extractors metadata scraper sinks

Last synced: 01 Jul 2024

https://github.com/intake/intake

Intake is a lightweight package for finding, investigating, loading and disseminating data.

data-access data-catalog python

Last synced: 29 Jun 2024

https://github.com/datahub-project/datahub

The Metadata Platform for your Data Stack

data-catalog data-discovery datahub hacktoberfest linkedin metadata

Last synced: 28 Jun 2024

https://github.com/datastrato/gravitino

World's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.

ai-catalog data-catalog datalake federated-query lakehouse metadata metalake model-catalog skycomputing stratosphere

Last synced: 07 Jun 2024

https://github.com/sahays/serverless-analytics

AWS Serverless Analytics using Amazon S3, Athena, Glue, and QuickSight

athena aws-cli data-catalog dataset glue quicksight transform-data visualization

Last synced: 27 May 2024

https://github.com/carte-data/carte

A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable front end that's just HTML.

carte data-catalog data-discovery data-documentation lightweight-data-catalogs python-library

Last synced: 27 May 2024

https://github.com/intake/intake-esm

An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.

cesm-lens climate-datasets cmip6 data-access data-catalog earth-system-model hacktoberfest intake pangeo

Last synced: 09 May 2024

https://github.com/getstrm/pace

Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery, with definitions imported from Collibra, Datahub, ODD and the like.

bigquery data-catalog data-contracts data-governance data-processing databricks policy-enforcement snowflake

Last synced: 11 Apr 2024

https://github.com/recap-build/recap

Work with your web service, database, and streaming schemas in a single format.

data-catalog data-discovery data-engineering data-integration data-pipelines etl metadata recap

Last synced: 01 Apr 2024

https://github.com/rsyi/whale

🐳 The stupidly simple CLI workspace for your data warehouse.

data-catalog data-discovery data-documentation

Last synced: 31 Mar 2024

https://github.com/amundsen-io/amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

amundsen data-catalog data-discovery linuxfoundation metadata

Last synced: 23 Mar 2024

https://github.com/tokern/piicatcher

Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub.

aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake

Last synced: 20 Mar 2024