Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so data can be read directly from external sources.

📖 A highly rated canonical book on the subject is “Google BigQuery: The Definitive Guide”, a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article marking its 10th anniversary.

https://github.com/hirosassa/bqvalid

SQL linter tool for BigQuery GoogleSQL (formerly known as StandardSQL).

bigquery google linter sql

Last synced: 02 Nov 2024

https://github.com/hackersandslackers/bigquery-sqlalchemy-tutorial

📊 ➡️ 💾 ETL script to migrate data from BigQuery to SQL.

bigquery bigquery-sqlalchemy-tutorial databases etl mysql postgres python sql sqlalchemy tutorial

Last synced: 09 Nov 2024

https://github.com/digitalghost-dev/stock-data-pipeline

Visualizing S&P 500 data on a webpage with Python.

bigquery google-cloud-platform python

Last synced: 06 Nov 2024

https://github.com/naseemkullah/gcp-accountant

A tool to identify high cost resources in GCP at a granular level

bigquery cost cost-engineering cost-resources gcp gcp-accountant

Last synced: 09 Nov 2024

https://github.com/shinichi-takii/vscode-language-sql-bigquery

Syntax highlighting and code snippets for BigQuery SQL in Visual Studio Code

bigquery grammar snippets sql syntax-highlighting vscode vscode-extension

Last synced: 31 Oct 2024

https://github.com/ottogroup/bquest

Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.

bigquery google-big-query google-cloud integration testing

Last synced: 18 Nov 2024

https://github.com/medjed/embulk-input-bigquery

BigQuery input plugin for Embulk loads records from BigQuery

bigquery embulk

Last synced: 12 Oct 2024

https://github.com/yoheimuta/dbq

CLI tool to easily decorate BigQuery table names

bigquery bq cli golang table-decorator

Last synced: 13 Oct 2024

https://github.com/mesmacosta/bq-fake-pii-table-creator

Library for creating BigQuery tables with fake PII data

bigquery fake-data faker governance-dapps metadata piidata piii

Last synced: 11 Nov 2024

https://github.com/ksalama/datalab-notebooks

This repository includes end-to-end labs on how to use GCP for applied data science

bigquery cloudml dataflow datalab gcp

Last synced: 04 Dec 2024

https://github.com/fivetran/zetasql-npm

npm package for ZetaSQL library

bigquery grpc sql zetasql

Last synced: 12 Dec 2024

https://github.com/modataconsulting/dbt_ga4_project

This project uses Google Analytics 4 BigQuery Exports as its source data, and offers useful base transformations to provide report-ready dimension & fact models that can be used for reporting purposes, blending with other data, and/or feature engineering for ML models.

bigquery bq data-build-tool dbt ga4 google-analytics-4 sql

Last synced: 12 Oct 2024

https://github.com/googlecloudplatform/datacatalog-tag-history

Historical metadata of your data warehouse is a treasure trove to discover not just insights about changing data patterns, but also quality and user behaviour. This solution creates Data Catalog Tags history in BigQuery since Data Catalog keeps only the latest version of metadata for fast searchability.

analytics bigquery data-catalog data-governance metadata-management

Last synced: 22 Jan 2025

https://github.com/googlecloudplatform/cloud-composer-mssql-dataflow-bigquery

This repository contains an example of how to leverage Cloud Composer and Cloud Dataflow to move data from a Microsoft SQL Server to BigQuery. The diagrams below demonstrate the workflow pipeline.

airflow bigquery cloud-composer dataflow microsoft-sql-server

Last synced: 07 Oct 2024

https://github.com/livebook-dev/req_bigquery

Conveniences for querying Google BigQuery with Req

bigquery req

Last synced: 11 Nov 2024

https://github.com/data-tools/big-data-types

A library to transform Scala product types and schemas from different systems into other schemas. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. E.g., a Spark schema can be transformed into a BigQuery table.

apache-spark bigquery bigquery-tables cassandra circe database-types scala schemas spark typeclass typeclass-derivation typesafe

Last synced: 12 Oct 2024

https://github.com/oliveroneill/bigqueryswift

BigQuery client for Swift

bigquery google-cloud-platform swift

Last synced: 11 Oct 2024

https://github.com/sungchun12/schedule-python-script-using-google-cloud

🕓 Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job

appengine-python bigquery chicago-traffic cron google-cloud python-script

Last synced: 28 Oct 2024

https://github.com/openbridge/ob_datastash

Stream your CSV files to an HTTP API

aws bigquery csv csv-files logstash parquet redshift

Last synced: 14 Nov 2024

https://github.com/badal-io/gcp-airflow-foundations

Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse

airflow apache-airflow bigquery dags data-engineering data-pipeline etl-pipeline

Last synced: 29 Oct 2024

https://github.com/toddbirchard/ghost-webhook-api

📑 🎛️ API to automate optimizations for self-hosted blogging platforms.

api automation bigquery blogging ghost github-api google-cloud-storage python webhook-api

Last synced: 16 Nov 2024

https://github.com/google-marketing-solutions/cwv_from_ga4_exports

Simple solution to make reporting on CWVs from BQ simpler to set up.

analytics bigquery google google-cloud-platform

Last synced: 05 Dec 2024

https://github.com/googlecloudplatform/google-cloud-abap

ABAP SDK for Google Cloud and BigQuery Connector for SAP enable customers to easily consume Google Products and Services natively from their SAP Landscape.

abap abap-development abapsdk abapsdkforgcp bigquery google-cloud-platform google-generative-ai google-maps-api vertex-ai

Last synced: 07 Oct 2024

https://github.com/orisano/bqspec

SQL testing tool for Google BigQuery.

bigquery cli python test yaml

Last synced: 09 Nov 2024

https://github.com/vickyjkwan/sqlanalyzer

A SQL parser and analyzer for sql flavors including MySQL, PostgreSQL, BigQuery Standard SQL, Presto SQL and Hive SQL.

athena bigquery hiveql metastore presto sqlparser standardsql

Last synced: 12 Oct 2024

https://github.com/splitmedialabslimited/supermigration

A CLI tool to perform migrations on BigQuery tables

bigquery bigquery-schema gcp node nodejs

Last synced: 06 Dec 2024

https://github.com/kesin11/ts-junit2json

Convert JUnit XML format to JSON with TypeScript

bigquery junit-xml

Last synced: 10 Nov 2024

https://github.com/manuelguerra1987/data-engineering-zoomcamp-notes

Notes and material from 2025 Data Engineering Zoomcamp by Datatalks.Club

airflow bigquery data-engineering docker kubernetes

Last synced: 16 Jan 2025

https://github.com/jashparekh/bigquery-action

This GitHub Action can be used to deploy table and view schemas to BigQuery.

actions bigquery gbq github-actions google google-bigquery google-cloud-platform hacktoberfest

Last synced: 23 Oct 2024

https://github.com/pcorbel/metaquery

An API to analyze BigQuery metadata

bigquery golang gorm vue-router vuejs vuetifyjs vuex

Last synced: 10 Nov 2024

https://github.com/hackersandslackers/bigquery-python-tutorial

📊 🐍 Create tables in Google BigQuery, auto-generate their schemas, and retrieve said schemas.

bigquery data-warehouse gcs google-bigquery google-cloud google-cloud-sdk google-cloud-storage python tutorial

Last synced: 09 Nov 2024

https://github.com/stape-io/request-to-gcs-function

Google Cloud Function that saves everything received in a request to Google Cloud Storage

bigquery gtm gtm-server-side stape

Last synced: 11 Dec 2024

https://github.com/tufin/espresso

A framework for writing testable BigQuery queries

bigquery sql testing

Last synced: 23 Jan 2025

https://github.com/urish/nn-function-generator

Experimenting with automatic generation of TS function bodies using ANN models

bigquery tensorflow tsquery typescript

Last synced: 12 Nov 2024

https://github.com/minodisk/zoq

Convert Zod to BigQuery Schema

bigquery bigquery-schema bigquery-schema-converter zod

Last synced: 19 Oct 2024

https://github.com/janaom/gcp-de-project-streaming-pubsub-beam-dataflow

This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.

apache-beam bigquery dataflow de-project gcp pubsub streaming-data

Last synced: 20 Jan 2025

https://github.com/janaom/gcp-de-project-uber-etl-pipeline

Technologies used: GCS, Compute Engine, Mage, BigQuery, Looker, Python

bigquery gcp looker mage

Last synced: 20 Jan 2025

https://github.com/ksalama/data2cooc2emb2ann

Learning embeddings from item co-occurrence statistics, and building an approx. nearest neighbour index

apache-beam bigquery dataflow embeddings machine-learning python3 tensorflow

Last synced: 04 Dec 2024

https://github.com/viant/bigquery

BigQuery database/sql golang driver

bigquery driver golang sql

Last synced: 18 Nov 2024

https://github.com/snithish/tpc-di_benchmark

Benchmark for Airflow with BigQuery as the data warehouse using TPC-DI

airflow benchmark bigquery tpc-di

Last synced: 17 Nov 2024

https://github.com/nodefluent/purpur

💠 kafka-connectors as a service | ETL 💜

bigquery connectors etl gcloud kafka kafka-connect mysql nodejs redis saas

Last synced: 23 Jan 2025

https://github.com/dp6/raft-suite-hub

The Hub is the solution responsible for centralizing data consolidation in BigQuery, the tool chosen to serve as the data warehouse for raft-suite.

bigquery data data-quality google-cloud google-cloud-functions hacktoberfest

Last synced: 04 Dec 2024

https://github.com/tomayac/http-archive-progressive-web-apps

Different approaches to estimate the number of Progressive Web Apps in the HTTP Archive

bigquery httparchive

Last synced: 16 Oct 2024

https://github.com/edgarrmondragon/meltano-dogfood

Personal dogfood Meltano project

bigquery dbt dogfood elt evidence-dev meltano

Last synced: 15 Oct 2024

https://github.com/mchmarny/pubsub-to-bigquery-pump

Simple utility combining Cloud Run and Stackdriver metrics to drain JSON messages from PubSub topic into BigQuery table

bigquery cloudrun events golang metrics pubsub stackdriver

Last synced: 18 Oct 2024

https://github.com/memsjava/bigquery-helper

A helper package for Google BigQuery operations

bigquery google pandas-dataframe

Last synced: 14 Oct 2024

https://github.com/christippett/bigquery-geo-router

Calculate routes from long/lat coordinates in BigQuery using OpenStreetMap/OSRM

bigquery geospatial google openstreetmap osrm

Last synced: 19 Nov 2024

https://github.com/wintermi/imdb-dataform

An example Dataform project to load and transform the publicly available dataset from IMDB.

bigquery dataform google-cloud google-cloud-platform

Last synced: 09 Nov 2024

https://github.com/armanbilge/gcp4s

Cross-platform JVM/JS Google Cloud Platform integrations for fs2 and friends

bigquery google-cloud scalajs

Last synced: 12 Oct 2024

https://github.com/snithish/tpc-ds_big-query

Scripts to execute TPC-DS on BigQuery

benchmark bigquery tpc-ds-benchmark tpc-ds-queries

Last synced: 17 Nov 2024

https://github.com/tamanobi/bq-query-unittest

A tool for testing BigQuery query logic while keeping the amount of data scanned to a minimum

bigquery sql unittest

Last synced: 19 Nov 2024

https://github.com/gr8distance/blanton

BigQuery API wrapped by Elixir

bigquery bigquery-schema elixir

Last synced: 29 Oct 2024

https://github.com/kitagry/bqls

WIP: BigQuery language server

bigquery language-server

Last synced: 02 Nov 2024

https://github.com/tobked/fetch-apache-ga-stats

Repository to make "snapshots" of GitHub Action queue for later analysis

bigquery gcp github github-actions

Last synced: 15 Oct 2024

https://github.com/sigpwned/litecene

A simple cross-data store full-text search language for Java 8+

bigquery full-text-search java query-language search

Last synced: 14 Oct 2024

https://github.com/aliasoblomov/bigquery-ga4-queries

List of all queries for Google Analytics 4 data export in BigQuery

bigquery ga4 gcp sql

Last synced: 13 Dec 2024

https://github.com/wayfair-incubator/gbq

Python wrapper for interacting with Google BigQuery.

bigquery gbq google google-bigquery google-cloud-platform hacktoberfest python

Last synced: 12 Oct 2024

https://github.com/corneliusweig/krew-index-tracker

Saves download statistics of `krew.dev` plugins to BigQuery

bigquery history krew krew-index statistics

Last synced: 18 Oct 2024

https://github.com/k1low/tbls-meta

tbls-meta is an external subcommand of tbls for applying metadata managed by tbls to the datasource.

bigquery data-catalog-management

Last synced: 12 Oct 2024

https://github.com/kellyjadams/run-sql-in-python

Scripts to connect Python to BigQuery or a PostgreSQL database.

bigquery postgresql python

Last synced: 13 Oct 2024

https://github.com/dataform-co/bigquery-ml-pipeline

An example of a machine learning pipeline on BigQuery ML using Dataform

bigquery bigquery-ml dataform machine-learning-pip sql

Last synced: 12 Jan 2025

https://github.com/fairscript/interact

A database interaction library for node.js/JavaScript/TypeScript that uses code reflection to maximize type safety and minimize friction. Supports PostgreSQL, Google BigQuery and SQLite.

bigquery data database linq orm postgresql reflection sql sqlite typesafe

Last synced: 12 Dec 2024

https://github.com/doitintl/terraform-bq-scheduled-queries

This is a demo project to use Terraform to manage BigQuery scheduled queries with Cloud Build CI/CD

bigquery cicd cloudbuild terraform

Last synced: 12 Nov 2024

https://github.com/shnewto/bqjson

bqjson - Serialize/Deserialize BigQuery TableResults to/from JSON

bigquery java json maven serde serde-json serialization serializer tableresult testing tests

Last synced: 27 Oct 2024

https://github.com/cata-network/cadence-docs

Cadence documentation, Chinese version

bigquery

Last synced: 07 Nov 2024

https://github.com/ergut/mcp-bigquery-server

A Model Context Protocol (MCP) server that provides secure, read-only access to BigQuery datasets. Enables Large Language Models (LLMs) to safely query and analyze data through a standardized interface.

bigquery google-cloud mcp mcp-servers model-context-protocol sql

Last synced: 13 Dec 2024

https://github.com/webocs/mining-github-microservices

GitHub mining replication package for the article "Microservices in the Wild: the Github Landscape". It is a short Node program that takes a prefiltered set of GitHub repositories (filtered with Google BigQuery) and uses the GitHub API to find the ones that have X number of stars.

bigdata bigquery microservices node

Last synced: 16 Jan 2025

https://github.com/hsbc/bqtools

The code repo for bqtools-json, a package for managing BigQuery using JSON exemplar data structures, and home of the bqsync utility.

backup bigquery google-cloud

Last synced: 23 Nov 2024

https://github.com/pierrec1024/airflow-provider-bigquery-reservation

Airflow provider for bigquery reservation operators.

airflow bigquery reservation

Last synced: 12 Oct 2024

https://github.com/trocco-io/embulk-output-bigquery_java

A faster, Java-flavored Embulk output plugin to load/insert data into Google BigQuery

bigquery embulk etl java

Last synced: 12 Nov 2024

https://github.com/jordicenzano/brighcove-live-ssai-ccu

POC (proof of concept) to show a possible way to calculate the real time CCU (concurrent viewers) for any Brightcove live stream with SSAI

analytics bigquery brightcove gcp-ap gcp-appengine-flex gcp-cloud-functions hls live streaming video

Last synced: 06 Jan 2025

https://github.com/fivetran/zetasql-npm-examples

This repo contains examples of usage of zetasql-npm library

bigquery grpc sql zetasql

Last synced: 12 Dec 2024

https://github.com/wintermi/movielens-dataform

An example Dataform project which will use the publicly available Movielens dataset to demonstrate how to upload your product catalog and user events into either the Google Cloud Retail API or Google Cloud Discovery Engine and train a personalised product recommendation model.

bigquery dataform google-cloud google-cloud-platform vertex-ai

Last synced: 09 Nov 2024

https://github.com/e-nikitin/laravel-bigquery

Laravel BigQuery Wrapper

bigquery laravel-bigquery

Last synced: 25 Nov 2024

https://github.com/terashim/dataform-google-analytics-4-example

A transformation pipeline for Google Analytics 4 export data, built with Dataform

bigquery dataform google-analytics

Last synced: 30 Nov 2024

https://github.com/blockchain-etl/iotex-etl

ETL (extract, transform and load) tools for ingesting IoTeX blockchain data to Google BigQuery and Pub/Sub

bigquery blockchain-data iotex sql

Last synced: 21 Jan 2025

https://github.com/trk54ylmz/spark-bigquery

Google BigQuery support for Spark SQL

bigquery spark

Last synced: 18 Nov 2024

https://github.com/adam-cowley/neo4j-bigquery

Yo dawg, I heard you like queries so we put some BigQuery in your query so you can query BigQuery from your query

bigquery cypher neo4j neo4j-procedures

Last synced: 18 Dec 2024

https://github.com/phstudy/postgresql-zetasketch

ZetaSketch HLL++ functions for PostgreSQL

bigquery hll java postgresql postgresql-extension zetasketch

Last synced: 21 Jan 2025

https://github.com/ymyzk/prom2bq

Copy data from Prometheus to BigQuery

bigquery go prometheus

Last synced: 27 Oct 2024

https://github.com/viant/bqwt

BigQuery Windowed Tables

bigquery etl

Last synced: 07 Dec 2024

https://github.com/bzzt/alchemy_table

Opinionated framework for working with Bigtable and BigQuery

bigquery bigtable database elixir gcp googlecloud googlecloudplatform

Last synced: 19 Oct 2024

https://github.com/sukanyabag/gcp-ai-notebooks

This repository contains all practice notebooks with which I performed hands-on labs in Google Cloud Training Program's "Cloud ML-AI Track"

bigquery cloudml-samples data-science dataprep tensorflow-tutorials

Last synced: 21 Dec 2024

https://github.com/takegue/bqmake

BigQuery Powered Data Build Suite.

bigquery sql

Last synced: 12 Oct 2024

https://github.com/badal-io/dataflow-timeseries-iot-gas-demo

Dataflow code for integration with GCP Core IoT and FogLamp

bigquery dataflow foglamp

Last synced: 11 Nov 2024

https://github.com/nodeart/koatuu-to-bigquery

Load KOATUU from https://data.gov.ua/dataset/dc081fb0-f504-4696-916c-a5b24312ab6e to Google BigQuery in denormalized form

bigquery google-bigquey koatuu

Last synced: 18 Nov 2024

https://github.com/wintermi/bqe-dataform

A Dataform project which aggregates BigQuery system metadata for the purpose of analysing the slot usage and storage within an organization by project.

bigquery dataform google-cloud google-cloud-platform

Last synced: 09 Nov 2024
