An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with great-expectations

A curated list of projects in awesome lists tagged with great-expectations.

https://github.com/iusztinpaul/energy-forecasting

🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials

3-pipeline-design airflow batch-processing cicd data-versioning docker fastapi feature-store gcp github-actions great-expectations hopsworks ml-monitoring mlops model-registry poetry python sktime streamlit weights-and-biases

Last synced: 15 May 2025

https://github.com/adidas/lakehouse-engine

The Lakehouse Engine is a configuration driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.

big-data configuration-driven data-engineering data-quality databricks delta-lake framework great-expectations lakehouse spark

Last synced: 12 Apr 2025

https://github.com/josephmachado/data_engineering_best_practices

Sample project to demonstrate data engineering best practices

data-engineering delta-lake etl great-expectations minio pyspark spark

Last synced: 15 Apr 2025

https://github.com/gokumohandas/testing-ml

Learn how to create reliable ML systems by testing code, data and models.

great-expectations machine-learning mlops pytest testing

Last synced: 30 Apr 2025

https://github.com/prefecthq/prefect-great-expectations

Prefect integrations for interacting with Great Expectations

expectations great great-expectations prefect

Last synced: 18 Feb 2025

https://github.com/hoangsonww/end-to-end-data-pipeline

📈 A scalable, production-ready data pipeline for real-time streaming & batch processing, integrating Kafka, Spark, Airflow, AWS, Kubernetes, and MLflow. Supports end-to-end data ingestion, transformation, storage, monitoring, and AI/ML serving with CI/CD automation using Terraform & GitHub Actions.

airflow apache docker elasticsearch flink grafana great-expectations hadoop influxdb kafka kubernetes looker minio mlflow postgresql prometheus python spark sql terraform

Last synced: 09 Apr 2025

https://github.com/moritzkoerber/covid-19-data-engineering-pipeline

A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.

apache-airflow apache-spark api aws aws-cdk aws-cloudformation aws-ecr aws-glue aws-lambda aws-redshift aws-s3 docker great-expectations pyspark spark

Last synced: 28 Apr 2025

https://github.com/josephmachado/data_engineering_best_practices_log

Code to demonstrate data engineering metadata & logging best practices

grafana great-expectations logging metadata minio postgresql prometheus spark

Last synced: 15 Apr 2025

https://github.com/grillazz/fastapi-greatexpectations

Run Great Expectations (greatexpectations.io) on any SQL engine through a REST API. Built on FastAPI, Pydantic, and SQLAlchemy as a data quality tool.

dataquality dataqualitycheck fastapi great-expectations pydantic python python3 sql sqlalchemy

Last synced: 14 May 2025

https://github.com/serialbandicoot/great-assertions

This library is inspired by the Great Expectations library. It makes the various expectations found in Great Expectations available through Python's built-in unittest assertions.

data-science data-testing databricks great-expectations jupyter-notebook python python3 quality-assurance testing

Last synced: 13 Feb 2025

https://github.com/datarootsio/notion-dbs-data-quality

Using Great Expectations and Notion's API, this repo aims to provide data quality for our databases in Notion.

data-engineering-pipeline data-quality great-expectations notion notion-api notion-database

Last synced: 11 Apr 2025

https://github.com/adidas/lakehouse-engine-docs

The goal of this project is to provide documentation for the Lakehouse Engine framework.

big-data data-engineering data-quality databricks delta-lake framework great-expectations lakehouse lakehouse-engine spark

Last synced: 14 Feb 2025

https://github.com/great-expectations/cloud

Source code for the GX Cloud agent

great-expectations

Last synced: 07 Apr 2025

https://gitlab.com/anacision/kedro-expectations

Our fork of https://github.com/joao-pampanin/kedro-expectations ("Tool to better integrate Kedro and Great Expectations"), which supports newer versions of Kedro and Great Expectations and adds features such as email alerts, delayed failure raising, and performance gains.

datascience great-expectations kedro python

Last synced: 19 Feb 2025

https://github.com/paulf-999/data_profiling_w_great_expectations

Bulk Data Profiling Solution using Great Expectations

great-expectations makefile python3

Last synced: 22 Mar 2025

https://github.com/firoz-ahmad-likhon/great-expectations-example

Sample project to demonstrate the use of Great Expectations

data-engineering data-quality data-validation great-expectations python

Last synced: 21 Feb 2025

https://github.com/artemantonov/gearbox-speed-estimation-via-vibration-analysis

Gearbox speed estimation using vibration data transformed via FFT and a lightweight PyTorch CNN

cnn early-stopping fft-analysis great-expectations mlflow pandas python pytorch seaborn time-series vibrational-analysis

Last synced: 06 May 2025