An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, which makes it easy to read data from external sources.

📖 A highly rated, canonical book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read is the inside story told by BigQuery’s founding product manager in an article celebrating the product’s 10th anniversary.

https://github.com/keminghe/medo

An automated, cloud-agnostic platform that unifies enterprise data silos into actionable insights while optimizing cross-cloud costs and compliance.

adk bigquery gcp nl2sql

Last synced: 31 Jul 2025

https://github.com/wintermi/tmdb-dataform

An example Dataform project to load and transform the publicly available dataset from The Movie Database into a format which could be imported into Vertex AI Search for Media, allowing you to build a search engine for movies.

bigquery dataform google-cloud google-cloud-platform

Last synced: 07 Feb 2026

https://github.com/brews/bucket2bq

Create an inventory of the objects in a GCS bucket, with metadata, and upload it to BigQuery

bigquery gcp golang google-cloud-storage

Last synced: 24 Jan 2026

https://github.com/tknishh/case-study-ueats-ghub-sql

Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub

assignment bigquery case-study-analysis loop product-analyst sql

Last synced: 23 Aug 2025

https://github.com/mehmoodulhaq570/bigquery_machine_learning_project

Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.

bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python

Last synced: 07 May 2026

https://github.com/evry-ace/statsbot

Slack Bot to forward message statistics to BigQuery

bigquery slack slack-bot slackbot

Last synced: 31 Oct 2025

https://github.com/takegue/bigquery-porter

BigQuery Deployment and Metadata Management tool

bigquery

Last synced: 12 Feb 2026

https://github.com/xlfe/pyjdbq

The easiest way to ship journald logs to Google BigQuery

bigquery journald journald-logs logging security

Last synced: 25 Aug 2025

https://github.com/rezuankassim/bqanalytic

Laravel package for using analytics data imported into BigQuery from Firebase Analytics

bigquery firebase-analytics laravel

Last synced: 01 Feb 2026

https://github.com/jugnuarora/france-courses-enrollments

Data pipeline for course enrollments in France. Every month, providers report the enrollments in their programs. The idea is to retrieve the listed courses and the monthly enrollments, then look at enrollment trends and compare training providers across different courses.

bigquery data-analytics data-engineering data-ingestion-and-infrastructure data-pipeline dbt gcp gcs kestra-workflows looker-studio

Last synced: 27 Apr 2026

https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage

Export data from Google BigQuery through Google Storage to Yandex Cloud Object Storage.

bigquery object-storage python3 yandex-cloud yandexcloud

Last synced: 05 May 2026

https://github.com/lpraat/inbq

A library for parsing BigQuery queries and extracting schema-aware, column-level lineage.

bigquery data-lineage parser sql

Last synced: 26 Apr 2026

https://github.com/jasontanx/gsheet-to-bq-ingestion

Data ingestion from Google Sheet to BigQuery

bigquery data-engineering data-ingestion gsheets

Last synced: 02 May 2026

https://github.com/ovotech/bigquery-metrics-exporter

A Golang application to export table level metrics from BigQuery into Datadog.

bigquery company-ovo datadog

Last synced: 06 Feb 2026

https://github.com/leandronasx/agro-data

Final project for SoulCode Academy's data analyst and dashboard training program.

bigquery data-analysis gcp looker pandas powerbi python

Last synced: 18 Jul 2025

https://github.com/t3n/gtmetrix-bq

A script that runs browser tests of specified URLs through GTmetrix and saves the metrics in BigQuery.

bigquery gtmetrix

Last synced: 15 May 2026

https://github.com/oalles/avro-schema-from-bq-table

Get an Avro schema from a Google Cloud BigQuery table

avro avro-schema bigquery google-cloud java spring spring-boot

Last synced: 04 May 2026

https://github.com/pcorbel/go-bigquery-acl

Simply apply ACL on BigQuery resources

acl bigquery config golang security

Last synced: 19 Apr 2026

https://github.com/jehiah/socrata_to_bigquery

A tool to copy public data to BigQuery

bigquery opendata socrata

Last synced: 05 Mar 2026

https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl

Google Cloud Function that performs an ETL job: it collects Stripe API data and transforms it so that it can be imported into BigQuery.

bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api

Last synced: 18 Apr 2026

https://github.com/deepraj1729/gcp-cloud-billing-api

Cloud Billing - Cost Monitoring and Alerting API for Google Cloud (Billing Exports)

bigquery fastapi gcp python redis

Last synced: 10 Apr 2026

https://github.com/triglav-dataflow/triglav-agent-bigquery

BigQuery agent for Triglav, data-driven workflow tool

bigquery ruby triglav-agent

Last synced: 14 Feb 2026

https://github.com/taquynhnga2001/proptech-dagster

Build an ELT pipeline with Dagster and dbt that schedules the loading of HDB resale transactions in Singapore into a Google BigQuery data warehouse, then create a Power BI dashboard to enhance insight exploration.

bigquery dagster data-integration data-orchestration data-warehouse dbt elt etl powerbi python

Last synced: 14 Feb 2026

https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game

This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.

ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql

Last synced: 19 May 2026

https://github.com/justinbeckwith/bisquick

🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.

bigquery dotnet github

Last synced: 28 Apr 2026

https://github.com/loinguyen3108/sportify-music-analysis

Engineered the streaming crawler pipeline using Kafka to extract, transform, and load Spotify data into PostgreSQL and ClickHouse for real-time analytics. Additionally, developed an automated batching pipeline using Airflow and Spark to efficiently ETL crawled data into BigQuery.

airflow bigquery clickhouse kafka pyspark spotify

Last synced: 09 Apr 2025

https://github.com/tosh2230/pubsub-dataflow-bigquery

Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.

bigquery dataflow gcp google-cloud google-cloud-platform pubsub

Last synced: 15 May 2026

https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery

BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360

bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark

Last synced: 15 May 2026

https://github.com/romange/puma

BigQuery-like engine for processing structured JSON-like records

bigquery cpp11 engine

Last synced: 18 May 2026

https://github.com/varun-khorgade/weatherflow-etl-data-pipeline

ETL pipeline to fetch, clean, and load weather datasets for structured analysis.

bigquery data-engineering etl-pipeline pandas postresql psycopg2 sql

Last synced: 16 May 2026

https://github.com/nghiant3110/google_analytic_4

This is a DA project based on the GA4 sample dataset on BigQuery

bigquery google-analytics looker-studio sql

Last synced: 11 Apr 2025

https://github.com/Miguelapp10/ETL_OperadorLogistico

Extract data from the SimpliRoute, AndesExpress, and Urbano APIs for a specific date range and process it for analysis and storage in Google BigQuery

api-client bigquery pandas python

Last synced: 11 Jul 2025

https://github.com/metrics-pli/bigquery-export

Exports collected metrics to Google BigQuery

bigquery datastudio lighthouse metrics metrics-pli performance pupeteer

Last synced: 16 May 2026

https://github.com/rohitsanj/superset-dbt-demo

This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.

apache-superset bigquery dbt superset

Last synced: 17 May 2026

https://github.com/mchmarny/xstreams

Stream processing using Cloud PubSub and Dataflow SQL in BigQuery

bigquery dataflow gce gcp golang pubsub stream

Last synced: 17 May 2026

https://github.com/misicode/Kaggle-Intro_to_SQL

Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.

bigquery kaggle kaggle-intro-to-sql sql

Last synced: 10 Mar 2025

https://github.com/thedumbterminal/bigquery-js-udf-example

Example of JavaScript UDFs for use with Google BigQuery

bigquery google javascript

Last synced: 18 May 2026

https://github.com/icarusso/bigqueryexporter

Export query data from Google BigQuery to a local machine

bigquery csv export python

Last synced: 11 Jul 2025

https://github.com/danlessa/meta_qa

A practical one-liner metalanguage for describing common sense in a machine-friendly way.

bigquery metalanguage

Last synced: 03 May 2026

https://github.com/landerox/cloud-landerox-data

Reference architecture baseline for GCP data platforms (Apache Beam, BigQuery, Cloud Functions, Pub/Sub). Hybrid warehouse/lakehouse with batch + streaming, Medallion layering. Consumed by private runtime repos.

apache-beam batch-processing bigquery cloud-functions cloud-storage data-engineering data-platform dataform gcp google-cloud-dataflow iceberg lakehouse medallion-architecture opentelemetry pubsub python reference-architecture slsa streaming supply-chain-security

Last synced: 21 May 2026

https://github.com/landerox/cloud-landerox-infra

GCP Terraform baseline and reference architecture — multi-environment CI/CD, defense-in-depth (validations + Conftest + Sigstore plan attestation), Workload Identity Federation, BigQuery medallion, recipes per module. OpenSSF Best Practices silver.

artifact-registry bigquery checkov cicd cloud-run cloud-scheduler conftest devsecops gcp iam infrastructure-as-code openssf reference-architecture secret-manager sigstore slsa terraform terraform-modules workload-identity-federation

Last synced: 21 May 2026

https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform

This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.

bigquery data data-analysis data-modeling live-streaming sql

Last synced: 23 Jun 2025

https://github.com/gdbecker/dbtlabslearning

Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.

analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing

Last synced: 02 Jan 2026

https://github.com/antoinegiraud/dataform_hypermarche

SQL repo orchestrated by Dataform for BigQuery

bigquery dataform

Last synced: 12 Sep 2025

https://github.com/ajaxbarcelonacruyff/gcp_cost

Monitoring Google Cloud costs with Looker Studio.

bigquery googlecloud googlecloudplatform lookerstudio

Last synced: 14 May 2025

https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source

Creating GA4 session references in BigQuery.

bigquery ga4 googleanalytics

Last synced: 14 May 2025

https://github.com/loozhengyuan/datafeeds-sql

SQL code snippets for deriving dimensions and metrics from Adobe Analytics Data Feeds

adobe adobe-analytics adobe-analytics-data-feeds bigquery sql

Last synced: 03 Feb 2026

https://github.com/essien1990/etl_pipeline_airflow

Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse

airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3

Last synced: 03 Jan 2026

https://github.com/ostrokach/uniparc_xml_parser

UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).

bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences

Last synced: 03 Jan 2026

https://github.com/jjviscomi/bqemulator

Local emulator for Google BigQuery. DuckDB-backed, SQLGlot-powered. Drop-in replacement for the real service in dev, CI, and offline replicas.

bigquery bq-cli duckdb emulator fastapi pytest-plugin python sqlglot testing

Last synced: 23 May 2026

https://github.com/kyoshidajp/bqcop

Save your BigQuery cost.

bigquery golang

Last synced: 17 May 2026

https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart

Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.

bigquery dbt e-commerce quickstarts

Last synced: 30 Jul 2025

https://github.com/datacody/dbt-jaffle-shop

A hands-on project built to deepen understanding of dbt modeling, testing, and documentation. Based on the Jaffle Shop dataset, the project showcases best practices in transforming and validating source data for business analytics using the modern data stack.

analytics bigquery data-eng data-modeling dbt etl-pipeline sql transformation

Last synced: 02 Aug 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 11 Feb 2026

https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin

A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.

big-data-analytics bigquery datawarehouse googlelooker sql

Last synced: 03 Jan 2026

https://github.com/wintermi/bqdo

bqdo is a CLI for executing BigQuery SQL as part of a pipeline.

bigquery google-cloud google-cloud-platform

Last synced: 18 May 2026

https://github.com/chandanpasunoori/event-sync

Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement functionality.

bigquery golang-tools google-cloud-platform pubsub

Last synced: 21 Aug 2025

https://github.com/kabeera1007/bike_data_play

End to End Data Engineering project with multiple ETL & ELT pipelines.

airflow anaconda bigquery cloud dbt docker gcs python spark terraform

Last synced: 10 Apr 2026

https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery

This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery

bigquery bigquery-dataset bigquery-table querying sql sql-query

Last synced: 24 Aug 2025

https://github.com/kusmn/tableau-googletrends-canada-analytics

Visualizes Canada’s top Google search terms (2020–2024) using BigQuery and Tableau to explore regional and temporal trends.

bigquery data-visualization tableau

Last synced: 26 Aug 2025

https://github.com/prajakta1321/san-francisco-bike-share-analysis-using-bigquery-and-lookerstudio

This project describes an analysis of the San Francisco bike share dataset using SQL in BigQuery and Looker Studio

bigquery lookerstudio sql

Last synced: 13 Jul 2025

https://github.com/getconversio/go-utils

A collection of utility functionality for Go

amqp bigquery geoip golang logrus openexchangerates utilities

Last synced: 19 Apr 2026

https://github.com/ritu456286/smartstockai

SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.

bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api

Last synced: 18 Apr 2026

https://github.com/tomgorb/project-template-for-production

Project template to (help) put a machine/deep learning algorithm into production

airflow bigquery gcp

Last synced: 15 May 2026

https://github.com/antjes88/asset-valuation-ingestion

The solution involves the DW Ingestion Layer, where data from CSV files is loaded into BigQuery

bigquery csv python terraform

Last synced: 18 Jan 2026

https://github.com/thunchanokbow/audiblebook-revenue

Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand

apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3

Last synced: 11 Apr 2026

https://github.com/cartodb/carto-auth

Python library to authenticate with CARTO

auth bigquery carto carto-dw oauth

Last synced: 10 Mar 2026

https://github.com/pvoo/bigquery-mcp

Practical MCP server for quickly navigating BigQuery datasets and tables. Suitable for larger projects with many datasets/tables, optimized to keep LLM context small while staying fast and safe.

bigquery claude-code cursor fastmcp mcp mcp-server mcp-servers mcp-tools windsurf

Last synced: 11 Apr 2026

https://github.com/yu-iskw/terraform-google-copy-bq-datasets

A terraform module to copy BigQuery datasets across regions

bigquery data-engineering google-cloud terraform

Last synced: 19 Jan 2026

https://github.com/dreamdata-io/free-email-providers

A list of free email domain providers so that you can easily spot business users from consumers!

bigquery dataset email-parsing email-verification

Last synced: 18 Jan 2026

https://github.com/anilkhichar/bq-table-copy-automation

Copy a table from one dataset to another in Google BigQuery using a Bash script

automation bash bash-script big-query bigquery bigquery-cp gcp google

Last synced: 18 Apr 2026

https://github.com/chelseammatta/nopd-cad-data-analysis

Analysis of 911 call data from New Orleans' 3rd & 4th police districts (2019-2022) using BigQuery

911-calls 911-data bigquery cad-data crime-analysis data-analysis emergency-response new-orleans public-safety sql

Last synced: 01 Jul 2025

https://github.com/oguzgn/fully-automated-performance-marketing-dashboard

This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.

bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql

Last synced: 24 Mar 2025

https://github.com/esanchezros/bigquery-maven-plugin

Maven plugin for managing BigQuery datasets, tables and views

bigquery java maven maven-plugin

Last synced: 03 Oct 2025

https://github.com/thunchanokbow/inventory-amazon

Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.

azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3

Last synced: 12 Apr 2026

https://github.com/greenpeace/gpes-old-en-petitions-api-emulator

Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.

bigquery mysql petitions sqlite3

Last synced: 11 May 2025

https://github.com/secmon-lab/overseer

A security log analysis tool for data lakes that combines SQL queries and Rego policies

bigquery detection open-policy-agent security-monitoring sql

Last synced: 11 Mar 2026

https://github.com/samedhi/gaend

Convert GAE Models into endpoints

bigquery elasticsearch google-app-engine restful taskqueue

Last synced: 03 May 2026

https://github.com/mlabarrere/pygquery

🐷 Multithread your data with Google BigQuery

bigquery dataframe google-bigquery multithreading pandas python

Last synced: 04 Feb 2026
