BigQuery
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources without first loading it.
📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article marking the service’s 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-15 00:03:38 UTC
https://github.com/brews/bucket2bq
Create an inventory of the objects in a GCS bucket, with metadata, and upload it to BigQuery
bigquery gcp golang google-cloud-storage
Last synced: 24 Jan 2026
https://github.com/achint08/tech-diffusion
Patents data analysis on PySpark
bert big-data bigquery google-patents-dataset machine-learning nlp pagera patents-analysis pyspark
Last synced: 15 Feb 2026
https://github.com/taquynhnga2001/proptech-dagster
Build an ELT pipeline with dagster and dbt to schedule loading HDB resale transactions in Singapore into a Google BigQuery data warehouse, then create a Power BI dashboard to enhance insight exploration.
bigquery dagster data-integration data-orchestration data-warehouse dbt elt etl powerbi python
Last synced: 14 Feb 2026
https://github.com/mehmoodulhaq570/bigquery_machine_learning_project
Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.
bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python
Last synced: 07 May 2026
https://github.com/vertexclique/olayufku
Schema Registry for BigQuery
bigquery bigquery-schema migration schema-migrations schema-registry
Last synced: 17 Apr 2026
https://github.com/evry-ace/statsbot
Slack Bot to forward message statistics to BigQuery
bigquery slack slack-bot slackbot
Last synced: 31 Oct 2025
https://github.com/lpraat/inbq
A library for parsing BigQuery queries and extracting schema-aware, column-level lineage.
bigquery data-lineage parser sql
Last synced: 26 Apr 2026
https://github.com/pirate-emperor/k2bq
K2BQ is a dataflow pipeline that streams data from Kafka to BigQuery. It uses Google Cloud’s managed Kafka, Dataflow for processing, and BigQuery for real-time analytics, offering scalable, automated data integration for fast insights.
bigquery cloud-computing cloud-infrastructure data-integration data-streaming dataflow google-cloud infrastructure-as-code kafka python realtime-analytics terraform
Last synced: 28 Jan 2026
https://github.com/leandronasx/agro-data
Final project of the data analyst and dashboard training program at SoulCode Academy.
bigquery data-analysis gcp looker pandas powerbi python
Last synced: 18 Jul 2025
https://github.com/triglav-dataflow/triglav-agent-bigquery
BigQuery agent for Triglav, data-driven workflow tool
Last synced: 14 Feb 2026
https://github.com/jasontanx/gsheet-to-bq-ingestion
Data ingestion from Google Sheet to BigQuery
bigquery data-engineering data-ingestion gsheets
Last synced: 02 May 2026
https://github.com/keminghe/medo
An automated, cloud-agnostic platform that unifies enterprise data silos into actionable insights while optimizing cross-cloud costs and compliance.
Last synced: 31 Jul 2025
https://github.com/farbodahm/delephon
Native desktop BigQuery client built with Go, supporting multiple projects, schema explorer, fuzzy search, history, favorite queries, AI assistant
Last synced: 03 Apr 2026
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 12 Apr 2025
https://github.com/toddbirchard/bigquery-to-sql
📊 ➡️ 💾 Lightweight ETL script to migrate data from BigQuery to SQL.
bigquery etl google-cloud python sql sqlalchemy
Last synced: 09 Feb 2026
https://github.com/tuanai-vireox/gcp-udfs-example
Google BigQuery Javascript UDF Function Examples
bigquery gcp javascript nodejs npm udf
Last synced: 19 Apr 2026
https://github.com/ryanmcdowell/dataflow-bigquery-dynamic-destinations
An example pipeline for dynamically routing events from Pub/Sub to different BigQuery tables based on a message attribute.
apache-beam bigquery google-cloud-dataflow google-cloud-platform
Last synced: 09 Sep 2025
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 25 Aug 2025
https://github.com/jasontanx/data-engineer-project-1
End-to-end data engineering project
airline bigquery data-engineering etl-pipeline looker-studio mage-ai
Last synced: 27 Apr 2026
https://github.com/takegue/bigquery-porter
BigQuery Deployment and Metadata Management tool
Last synced: 12 Feb 2026
https://github.com/phenrickson/bgg-data-warehouse
ETL process for BGG cloud data warehouse
bigquery data-engineering elt-pipeline python
Last synced: 15 Apr 2026
https://github.com/tknishh/case-study-ueats-ghub-sql
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assignment bigquery case-study-analysis loop product-analyst sql
Last synced: 23 Aug 2025
https://github.com/masmovil/rx-gcloud-connectors
bigquery datastore firestore gcloud gcloud-sdk pubsub reactive rxjava2 vertx
Last synced: 20 Apr 2026
https://github.com/paty-oliveira/dbt-playground
Repository for data modelling with dbt
analytics-engineering bigquery dbt docker-compose jinja postgresql python sql
Last synced: 10 Feb 2026
https://github.com/akihokurino/recruitment-server-gae
Job listings API server: a Go application on Google App Engine (2nd generation). It uses Twirp for the API interface, sops with KMS to secure the environment, Cloud Build for CI/CD, and Algolia as the search engine, and it syncs Datastore to BigQuery.
algolia bigquery boom cloud-scheduler cloudbuild datastore firebase-auth gae gcp golang grpc kms realtime-database twirp
Last synced: 10 May 2026
https://github.com/ovotech/bigquery-metrics-exporter
A Golang application to export table level metrics from BigQuery into Datadog.
Last synced: 06 Feb 2026
https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game
This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.
ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql
Last synced: 19 May 2026
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images and saves resulting files into GCS bucket and data into BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 23 Mar 2026
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 27 Apr 2026
https://github.com/tupizz/fiap_pnad-covid-19
This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.
analysis bigquery pyspark python
Last synced: 27 Apr 2026
https://github.com/hariprasath-v/mh_google_cloud_bigquery_ltv_prediction_challenge
Build a model that can predict customers' Long Term Value (LTV).
bigquery colab-notebook klib machine-learning matplotlib numpy pandas python python3 seaborn
Last synced: 09 Apr 2026
https://github.com/jjviscomi/bqemulator
Local emulator for Google BigQuery. DuckDB-backed, SQLGlot-powered. Drop-in replacement for the real service in dev, CI, and offline replicas.
bigquery bq-cli duckdb emulator fastapi pytest-plugin python sqlglot testing
Last synced: 23 May 2026
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 24 Feb 2026
https://github.com/chukwuemekaaham/uber-gcp-etl-project
Data Engineering Zoomcamp Final Project
bigquery cloud-storage csv docker-compose gcp jupyter-notebook looker-studio mageai python spark spreadsheets terraform
Last synced: 13 Apr 2026
https://github.com/95xin/data-engineering-project---automatic-batch-data-processing
Data Engineering Project - Automated Batch Data Processing
airflow bigquery data-engineering data-pipeline data-schema elt postgresql-database pyspark python
Last synced: 28 Apr 2026
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 21 Aug 2025
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 28 Apr 2026
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 24 Feb 2026
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 12 Apr 2026
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job that subscribes to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 13 May 2026
https://github.com/pkpkpk/gcp
Clojure bindings for select GCP SDKs
bigquery cloudstorage gcp gemini google-cloud-platform pubsub vertexai
Last synced: 28 Apr 2026
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 10 Mar 2025
https://github.com/hyangminj/ddl2data
Turn any SQL schema into realistic test data — instantly
bigquery cli data-engineering date-testing dynamodb etl faker postgresql python rds synthetic-data test-data-generation
Last synced: 18 May 2026
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery.
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 24 Aug 2025
https://github.com/nghiant3110/google_analytic_4
This is a data analysis project based on the GA4 sample dataset on BigQuery.
bigquery google-analytics looker-studio sql
Last synced: 11 Apr 2025
https://github.com/edonosotti/gmail-accounting-automation
Automate accounting from invoices in Gmail, using Apps Script, Google Sheets and optionally BigQuery.
accounting apps-script automation bigquery expense-tracker expenses finance finance-automation finances gmail google google-api google-apis google-sheets
Last synced: 29 Apr 2026
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 03 Jan 2026
https://github.com/loozhengyuan/datafeeds-sql
SQL code snippets for deriving dimensions and metrics from Adobe Analytics Data Feeds
adobe adobe-analytics adobe-analytics-data-feeds bigquery sql
Last synced: 03 Feb 2026
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 11 May 2025
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 28 Apr 2026
https://github.com/varun-khorgade/weatherflow-etl-data-pipeline
ETL pipeline to fetch, clean, and load weather datasets for structured analysis.
bigquery data-engineering etl-pipeline pandas postresql psycopg2 sql
Last synced: 16 May 2026
https://github.com/brunopata/adventureworks-sql-analysis
SQL-driven analysis of sales, customer behavior, time trends and regional performance using the AdventureWorks dataset. Built using Google BigQuery and SQL to uncover key business insights. Data is structured through clean queries and views designed to support product, customer and geographic decisions.
bigquery business-intelligence data-analytics data-engineering etl google-cloud sales-analysis sql
Last synced: 30 Apr 2026
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 14 May 2025
https://github.com/aleenprd/docbt
Documentation Build Tool - Generate YAML documentation for dbt models with optional AI assistance. Built with Streamlit for an intuitive and familiar web interface.
ai analytics-engineering bigquery data data-modeling data-science dbt docker llm lmstudio ollama openai snowflake sql streamlit
Last synced: 11 Nov 2025
https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator
Running BigQuery Query from Airflow using BigQueryExecuteOperator
Last synced: 10 May 2026
https://github.com/getconversio/go-utils
A collection of utility functionality for Go
amqp bigquery geoip golang logrus openexchangerates utilities
Last synced: 19 Apr 2026
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 18 Apr 2026
https://github.com/khdevops/spotify_data_pipeline
Spotify Top 50 Data Pipeline
apache-airflow bash bigquery docker git google-cloud-platform pandas python sql terraform
Last synced: 28 Oct 2025
https://github.com/redis-developer/demo-redis-bigquery
This app uses Redis and BigQuery. Data is prefetched from BigQuery and queried using Redis Search and JSON.
bigquery demo express formula1 google-cloud javascript node-redis react redis
Last synced: 13 Apr 2026
https://github.com/kusmn/tableau-googletrends-canada-analytics
Visualizes Canada’s top Google search terms (2020–2024) using BigQuery and Tableau to explore regional and temporal trends.
bigquery data-visualization tableau
Last synced: 26 Aug 2025
https://github.com/i62navpm/vue-front-app
Front app + Google cloud tools
appengine bigquery firebase firestore-database google-cloud google-functions google-storage puppeteer
Last synced: 01 May 2026
https://github.com/lokimcpuniverse/gcp-mcp-server
MCP server for Google Cloud Platform - Complete GCP services integration for GenAI
ai-agents bigquery cloud gcp genai google-cloud infrastructure mcp model-context-protocol vertex-ai
Last synced: 02 May 2026
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 14 May 2025
https://github.com/salrashid123/gcp_cloud_status_dataset
BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)
bigquery gcp google-cloud google-cloud-platform
Last synced: 20 Apr 2026
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 15 Mar 2026
https://github.com/danlessa/meta_qa
A practical one-liner metalanguage for describing common sense in a machine-friendly way.
Last synced: 03 May 2026
https://github.com/secmon-lab/overseer
A security log analysis tool for data lake with combination of SQL query and Rego policy
bigquery detection open-policy-agent security-monitoring sql
Last synced: 11 Mar 2026
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 03 Jan 2026
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 06 Feb 2026
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 26 Jan 2026
https://github.com/nais/bqrator
Operator for creating BigQuery datasets
bigquery bigquery-operator kubernetes kubernetes-operator nais-features
Last synced: 03 May 2026
https://github.com/topefolorunso/musicaly-project
An end-to-end data pipeline that ingests simulated music stream data, structures, cleans and models the raw data, and visualizes clean data.
airflow bigquery data-pipeline dbt google-cloud-platform kafka python spark-streaming
Last synced: 04 May 2026
https://github.com/camilajaviera91/mock-data-factory
Generate large-scale synthetic datasets using SQL and BigQuery.
bigquery dbt dotenv exceute-batch faker load-dotenv os postgresql psycopg2 psycopg2-extras random sql yml
Last synced: 04 May 2026
https://github.com/chelseammatta/nopd-cad-data-analysis
Analysis of 911 call data from New Orleans' 3rd & 4th police districts (2019-2022) using BigQuery
911-calls 911-data bigquery cad-data crime-analysis data-analysis emergency-response new-orleans public-safety sql
Last synced: 01 Jul 2025
https://github.com/tamanyan/digdag-embulk-server
Digdag Server for building Data Lake
bigquery digdag docker docker-compose embulk etl
Last synced: 04 May 2026
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 17 May 2026
https://github.com/miguelapp10/etl_operadorlogistico
Extract data from the SimpliRoute, AndesExpress, and Urbano APIs for a specific date range and process it for analysis and storage in Google BigQuery.
api-client bigquery pandas python
Last synced: 13 Apr 2026
https://github.com/rbmuller/scherlok
A detective for your data. Zero-config data quality monitoring — works with dbt, Postgres, BigQuery, Snowflake. No YAML.
anomaly-detection bigquery cli data-engineering data-observability data-quality dbt etl monitoring open-source postgres postgresql python snowflake
Last synced: 15 May 2026
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 17 Feb 2026
https://github.com/thumbtack/becquerel
Gateway server that provides an OData interface to BigQuery and Elasticsearch
bigquery elasticsearch odata play play-scala scala sql
Last synced: 05 Feb 2026
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline to move an uploaded flat file from GCS, mask PII, store it in BigQuery, and create a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 07 Feb 2026
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 01 Aug 2025
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 03 May 2026
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 06 May 2026
https://github.com/pvoo/bigquery-mcp
Practical MCP server for quickly navigating BigQuery datasets and tables. Suitable for larger projects with many datasets/tables, optimized to keep LLM context small while staying fast and safe.
bigquery claude-code cursor fastmcp mcp mcp-server mcp-servers mcp-tools windsurf
Last synced: 11 Apr 2026
https://github.com/icarusso/bigqueryexporter
Export query data from google bigquery to local machine
Last synced: 11 Jul 2025
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 19 Jan 2026
https://github.com/undisputed-jay/etl-on-gcp-with-apache-airflow
In this project, files were ingested into Google Cloud Storage and then moved to BigQuery, where queries were run and the results moved back to Google Cloud Storage.
apache-airflow bigquery data-engineering data-warehouse docker etl-pipeline google-cloud-platform
Last synced: 06 May 2026
https://github.com/rembertdesigns/smart-vinyl-catalog
AI-powered vinyl cataloging and music discovery platform leveraging BigQuery’s generative AI. Processes mixed-format data to deliver personalized recommendations, collection analytics, and intelligent search. Created for the Kaggle BigQuery AI Challenge to showcase real-world, scalable AI solutions for music lovers.
ai bigquery data-science data-visualization generative-ai hackathon kaggle kaggle-competition machine-learning music-analytics music-recommendation-algorithm python recommender-system vinyl
Last synced: 07 May 2026