BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data directly from external sources.
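As a rough illustration of the SQL-plus-client-library workflow described above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package is installed and that Application Default Credentials are configured. The public sample table and the helper names `build_top_names_query` and `run_query` are assumptions chosen for illustration and are not part of the original page.

```python
"""Sketch: run a GoogleSQL query through the BigQuery Python client library."""

# Assumption: a public sample table, chosen only for illustration.
PUBLIC_TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_query(limit: int = 5) -> str:
    """Return a SQL string that ranks first names by total count."""
    limit = int(limit)  # only an integer ever reaches the SQL text
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{PUBLIC_TABLE}`\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str, client=None) -> list:
    """Run `sql` and return the result rows as plain dictionaries."""
    if client is None:
        # Imported lazily so the query builder works without the package.
        from google.cloud import bigquery

        client = bigquery.Client()  # uses Application Default Credentials
    # query() starts the job; result() waits for it and yields Row objects.
    return [dict(row.items()) for row in client.query(sql).result()]
```

With credentials in place, `run_query(build_top_names_query(3))` would submit the query as a job and return up to three rows. Passing a `client` explicitly makes it possible to substitute a stub in tests, so no real project is needed to exercise the code.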
📖 A highly rated, comprehensive reference book on the subject is “Google BigQuery: The Definitive Guide”. Another worthwhile read is the article by BigQuery’s founding product manager, which tells the inside story of the product on its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-15 00:03:38 UTC
https://github.com/bankyadam/not-so-bigquery
An emulator for Google BigQuery that can be run locally, backed by PostgreSQL.
bigquery development devtool emulator sql
Last synced: 03 Oct 2025
https://github.com/cvs-health/coldstart
A package for automatic data collection and feature engineering
bigquery feature-engineering python sql sqlalchemy
Last synced: 03 Oct 2025
https://github.com/mercari/dataflowtemplates
Convenient Dataflow pipelines for transforming data between cloud data sources
apache-beam bigquery dataflow dataflow-templates spanner
Last synced: 25 Oct 2025
https://github.com/googlecloudplatform/deviceconnect
https://deviceconnect.readthedocs.io/
Last synced: 20 Oct 2025
https://github.com/mlr-org/mlr3db
Data Backends to let mlr3 work transparently with (remote) data bases
bigquery data-backend database duckdb machine-learning mariadb mlr3 mysql odbc postgresql r r-package spark sqlite
Last synced: 28 Feb 2026
https://github.com/wittline/pydag
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
big-data bigquery cloud dag data-engineering data-pipeline dataengineering dataproc dataproc-cluster directed-acyclic-graph google-cloud google-cloud-platform parallel-processing task-scheduler task-scheduling workflow-engine
Last synced: 13 Apr 2025
https://github.com/kitta65/bq-extension-vscode
Visual Studio Code extension for GoogleSQL
bigquery visual-studio-code vscode
Last synced: 22 Feb 2026
https://github.com/googlecloudplatform/bq-utilization-alerts
A serverless bot which periodically checks configured BigQuery capacity commitments, reservations and assignments against actual slot consumption of running jobs and reports findings to Slack/Google Chat.
bigquery bot chat-ops cloud-run cloud-scheduler google-chat google-cloud serverless slack slots
Last synced: 20 Oct 2025
https://github.com/banditml/faucetml
High speed mini-batch data reading & preprocessing from BigQuery.
bigquery feature-engineering features machine-learning ml preprocessing pytorch
Last synced: 11 Feb 2026
https://github.com/digitalghost-dev/stock-data-pipeline
Code Repository for my 1st Data Project.
bigquery google-cloud-platform python
Last synced: 06 Apr 2025
https://github.com/canner/vulcan-sql-examples
Curated VulcanSQL showcases
analytics api-builder bigquery data data-lake data-warehouse database duckdb examples postgresql reporting restful-api sql vulcan-sql vulcansql
Last synced: 19 Jul 2025
https://github.com/googlecloudplatform/bigquery-dlp-remote-function
Use Remote Functions to tokenize data with DLP in BigQuery using SQL
bigquery cloud-run data-loss-prevention dlp google-cloud
Last synced: 20 Oct 2025
https://github.com/shinichi-takii/vscode-language-sql-bigquery
Syntax highlighting and code snippets for BigQuery SQL in Visual Studio Code
bigquery grammar snippets sql syntax-highlighting vscode vscode-extension
Last synced: 11 Apr 2025
https://github.com/InosRahul/f1-data-pipeline
F1 Data Pipeline
bigquery data-engineering-pipeline dbt gcs looker prefect python terraform
Last synced: 05 May 2025
https://github.com/captaincodeman/datastore-mapper
Appengine Datastore Mapper in Go
appengine bigquery cloud-storage datastore datastore-entities datastore-mapper go map-reduce shards
Last synced: 26 Apr 2025
https://github.com/nownabe/go-bqloader
bqloader is a simple ETL framework to load data from Cloud Storage into BigQuery.
bigquery etl golang google-cloud google-cloud-functions google-cloud-storage
Last synced: 04 Aug 2025
https://github.com/flamingo-run/gcp-pilot
api bigquery calendar cloud-scheduler cloud-tasks gcs google google-cloud-platform pubsub sheets speech
Last synced: 17 Jan 2026
https://github.com/miraisolutions/sparkbq
Sparklyr extension package to connect to Google BigQuery
Last synced: 04 Sep 2025
https://github.com/kitta65/prettier-plugin-bq
Prettier plugin for GoogleSQL
Last synced: 22 Feb 2026
https://github.com/ronoaldo/aetools
Utilities to build and manage Google App Engine apps
Last synced: 04 Oct 2025
https://github.com/miraisolutions/spark-bigquery
Google BigQuery data source for Apache Spark
bigquery google-dataproc spark spark-datasource
Last synced: 04 Sep 2025
https://github.com/modataconsulting/dbt_ga4_project
This project uses Google Analytics 4 BigQuery Exports as its source data, and offers useful base transformations to provide report-ready dimension & fact models that can be used for reporting purposes, blending with other data, and/or feature engineering for ML models.
bigquery bq data-build-tool dbt ga4 google-analytics-4 sql
Last synced: 10 Apr 2025
https://github.com/hackersandslackers/bigquery-sqlalchemy-tutorial
📊 ➡️ 💾 ETL script to migrate data from BigQuery to SQL.
bigquery bigquery-sqlalchemy-tutorial databases etl mysql postgres python sql sqlalchemy tutorial
Last synced: 24 Aug 2025
https://github.com/ginokent/bqschema-gen-go
BigQuery table schema Go struct generator
bigquery bigquery-schema gcp gcp-bigquery go golang
Last synced: 07 May 2025
https://github.com/ocadaruma/scalikejdbc-bigquery
ScalikeJDBC extension for Google BigQuery
Last synced: 10 Apr 2025
https://github.com/ottogroup/bquest
Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.
bigquery google-big-query google-cloud integration testing
Last synced: 14 Aug 2025
https://github.com/googlecloudplatform/cloud-composer-mssql-dataflow-bigquery
This repository contains an example of how to leverage Cloud Composer and Cloud Dataflow to move data from a Microsoft SQL Server to BigQuery. The diagrams below demonstrate the workflow pipeline.
airflow bigquery cloud-composer dataflow microsoft-sql-server
Last synced: 08 Jul 2025
https://github.com/nodefluent/bigquery-kafka-connect
☁️ Node.js Kafka Connect connector for Google BigQuery
big-data bigquery connect etl google-cloud kafka kafka-connect nodejs
Last synced: 26 Apr 2025
https://github.com/hirosassa/bqvalid
SQL linter tool for BigQuery GoogleSQL (formerly known as StandardSQL).
Last synced: 18 Jan 2026
https://github.com/naseemkullah/gcp-accountant
A tool to identify high cost resources in GCP at a granular level
bigquery cost cost-engineering cost-resources gcp gcp-accountant
Last synced: 30 Apr 2025
https://github.com/prologuetech/laravel-big
Google BigQuery for Laravel
bigquery google google-bigquery laravel laravel-5-package php
Last synced: 11 Apr 2025
https://github.com/zkan/dtc-data-engineering-zoomcamp-project
DataTalks.Club's Data Engineering Zoomcamp Project
apache-airflow bigquery data-engineering data-studio dbt docker kafka minio python stomp terraform
Last synced: 19 Aug 2025
https://github.com/googlecloudplatform/google-cloud-abap
ABAP SDK for Google Cloud and BigQuery Connector for SAP enable customers to easily consume Google Products and Services natively from their SAP Landscape.
abap abap-development abapsdk abapsdkforgcp bigquery google-cloud-platform google-generative-ai google-maps-api vertex-ai
Last synced: 17 Jun 2025
https://github.com/omeryasirkucuk/amx
AI-driven CLI for documenting database schemas. DB + docs + codebase agents, 10 backends, BYO LLM, human-in-the-loop review.
agentic-ai ai-agents bigquery cli data-catalog data-engineering database-documentation databricks human-in-the-loop llm metadata postgre python snowflake
Last synced: 07 Jun 2026
https://github.com/medjed/embulk-input-bigquery
BigQuery input plugin for Embulk loads records from BigQuery
Last synced: 30 Oct 2025
https://github.com/ayoisio/variant-agents
Variant Agents: Multi-Agent Genomic Analysis
adk bigquery clinvar gemini gke gnomad google-cloud multi-agent-systems variant-analysis vep
Last synced: 14 May 2026
https://github.com/unytics/catalog_builder
Data Catalogs Made Easy
bigquery data-catalog data-discovery databricks dbt redshift snowflake
Last synced: 12 Apr 2025
https://github.com/mesmacosta/bq-fake-pii-table-creator
Library for creating BigQuery tables with fake PII data
bigquery fake-data faker governance-dapps metadata piidata piii
Last synced: 30 Apr 2025
https://github.com/livebook-dev/req_bigquery
Conveniences for querying Google BigQuery with Req
Last synced: 27 Apr 2025
https://github.com/sungchun12/schedule-python-script-using-google-cloud
🕓 Schedules a Python script to append data to BigQuery using Google Cloud's App Engine with a cron job
appengine-python bigquery chicago-traffic cron google-cloud python-script
Last synced: 01 Sep 2025
https://github.com/blockchain-etl/tezos-etl
Python scripts for ETL (extract, transform and load) jobs for Tezos blocks, balance updates, and operations
bigquery blockchain cryptocurrency csv sql tezos
Last synced: 26 Oct 2025
https://github.com/data-tools/big-data-types
A library to transform Scala product types and schemas from different systems into other schemas. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. E.g., a Spark Schema can be transformed into a BigQuery table.
apache-spark bigquery bigquery-tables cassandra circe database-types scala schemas spark typeclass typeclass-derivation typesafe
Last synced: 30 Oct 2025
https://github.com/yoheimuta/dbq
CLI tool to easily Decorate BigQuery table name
bigquery bq cli golang table-decorator
Last synced: 07 Mar 2026
https://github.com/terashim/dataform-google-analytics-4-example
Transformation pipeline for Google Analytics 4 export data using Dataform
bigquery dataform google-analytics
Last synced: 08 Oct 2025
https://github.com/k1low/setup-tbls
GitHub Action for tbls
bigquery continuous-integration database-document database-schema documentation-tool dynamodb er-diagram excel mariadb markdown mermaid mysql plantuml postgresql redshift snowflake spanner sqlite sqlserver
Last synced: 16 Apr 2025
https://github.com/googlecloudplatform/datacatalog-tag-history
Historical metadata of your data warehouse is a treasure trove to discover not just insights about changing data patterns, but also quality and user behaviour. This solution creates Data Catalog Tags history in BigQuery since Data Catalog keeps only the latest version of metadata for fast searchability.
analytics bigquery data-catalog data-governance metadata-management
Last synced: 03 Oct 2025
https://github.com/aeonasoft/audioflow
Open Source Audio News Subscription Service (Google Trends, Hacker News & more).
airflow bbc-news beautifulsoup bigquery celery docker fastapi gmail-smtp google-ai-studio google-trends hacker-news jinja2 postgresql pydantic redis requests sqlalchemy sqlite
Last synced: 01 Sep 2025
https://github.com/kesin11/ts-junit2json
Convert JUnit XML format to JSON with TypeScript
Last synced: 25 Apr 2025
https://github.com/oliveroneill/bigqueryswift
BigQuery client for Swift
bigquery google-cloud-platform swift
Last synced: 26 Oct 2025
https://github.com/future-architect/gbilling-plot
Creates a graphed invoice for Google Cloud Platform, showing the billing amount per GCP project.
bigquery billing cloud-scheduler gcp-billing go golang slack
Last synced: 17 Oct 2025
https://github.com/toddbirchard/ghost-webhook-api
📑 🎛️ API to automate optimizations for self-hosted blogging platforms.
api automation bigquery blogging ghost github-api google-cloud-storage python webhook-api
Last synced: 07 Mar 2026
https://github.com/victorcouste/google-cloudfunctions-dataprep
Google Cloud Functions examples for Google Cloud Dataprep
api api-rest bigquery cloud-functions cloudfunctions-dataprep dataprep dataprep-job google-bigquery google-cloud-dataprep google-sheet trifacta
Last synced: 20 Jul 2025
https://github.com/badal-io/gcp-airflow-foundations
Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse
airflow apache-airflow bigquery dags data-engineering data-pipeline etl-pipeline
Last synced: 24 Mar 2025
https://github.com/noahgift/pragmaticai-gcp
Pragmatic AI solutions on GCP
bigquery colaboratory gcp google jupyter-notebook pragai pragmaticai-gcp python sheets
Last synced: 28 Oct 2025
https://github.com/sweetpand/py_scripts_bots
Moderate bots for re-crawling social media.
bigquery bot bots crawling instagram-bot practice-programming python regex scrapy scripts social-networks tweetbot whatsapp-bot youtube-bot
Last synced: 28 Apr 2025
https://github.com/mondo-mob/mondokit
Simplify building NodeJS applications on cloud platforms
bigquery cloud cloud-storage cloud-tasks database-migrations datastore datastore-backups express firebase-auth firestore firestore-database firestore-database-backup firestore-migrations gcp gcs google-cloud google-cloud-platform google-cloud-tasks
Last synced: 10 Oct 2025
https://github.com/vickyjkwan/sqlanalyzer
A SQL parser and analyzer for SQL flavors including MySQL, PostgreSQL, BigQuery Standard SQL, Presto SQL and Hive SQL.
athena bigquery hiveql metastore presto sqlparser standardsql
Last synced: 09 Apr 2025
https://github.com/vinted/flink-big-query-connector
Flink connector for BigQuery
bigquery flink flink-connector flink-connector-bigquery streaming
Last synced: 04 Apr 2026
https://github.com/wintermi/imdb-dataform
An example Dataform project to load and transform the publicly available dataset from IMDB.
bigquery dataform google-cloud google-cloud-platform
Last synced: 05 May 2025
https://github.com/kestra-io/plugin-gcp
bigquery firestore gcp google-cloud google-cloud-platform google-cloud-storage kestra plugin vertex-ai
Last synced: 12 Mar 2026
https://github.com/google-marketing-solutions/cwv_from_ga4_exports
Simple solution to make reporting on CWVs from BQ simpler to set up.
analytics bigquery google google-cloud-platform
Last synced: 01 Aug 2025
https://github.com/splitmedialabslimited/supermigration
A CLI tool to perform migrations on BigQuery tables
bigquery bigquery-schema gcp node nodejs
Last synced: 01 Aug 2025
https://github.com/naustica/openalex
Repository containing scripts for importing OpenAlex snapshots into BigQuery
bigquery openalex python scholarly-metadata
Last synced: 30 Apr 2025
https://github.com/janaom/gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
apache-beam bigquery dataflow de-project gcp pubsub streaming-data
Last synced: 18 Oct 2025
https://github.com/minodisk/zoq
Convert Zod to BigQuery Schema
bigquery bigquery-schema bigquery-schema-converter zod
Last synced: 22 Apr 2025
https://github.com/keito5656/firebase-authentication-to-bigquery-export
An automatic tool for copying and converting Firebase Authentication data to BigQuery.
bigquery firebase-auth typescript
Last synced: 16 Jan 2026
https://github.com/pcorbel/metaquery
An API to analyze BigQuery metadata
bigquery golang gorm vue-router vuejs vuetifyjs vuex
Last synced: 25 Apr 2025
https://github.com/jashparekh/bigquery-action
This GitHub Action can be used to deploy table and view schemas to BigQuery.
actions bigquery gbq github-actions google google-bigquery google-cloud-platform hacktoberfest
Last synced: 07 May 2025
https://github.com/hackersandslackers/bigquery-python-tutorial
📊 🐍 Create tables in Google BigQuery, auto-generate their schemas, and retrieve said schemas.
bigquery data-warehouse gcs google-bigquery google-cloud google-cloud-sdk google-cloud-storage python tutorial
Last synced: 28 Apr 2025
https://github.com/rajaprerak/twitteranalysis
Twitter sentiment analysis of trending movies and songs.
bigquery bootstrap css dataflow datastudio gae gcp google-app-engine google-cloud-platform html pubsub python sentiment-analysis spotipy tmdb-api tweepy twitter twitter-sentiment-analysis
Last synced: 07 Apr 2025
https://github.com/christippett/bigquery-geo-router
Calculate routes from long/lat coordinates in BigQuery using OpenStreetMap/OSRM
bigquery geospatial google openstreetmap osrm
Last synced: 30 Oct 2025
https://github.com/manuelguerra1987/data-engineering-zoomcamp-notes
Notes and material from the 2025 Data Engineering Zoomcamp by DataTalks.Club
airflow bigquery data-engineering docker kubernetes
Last synced: 23 Aug 2025
https://github.com/stape-io/request-to-gcs-function
Google Cloud Function that saves everything received in a request to Google Cloud Storage
bigquery gtm gtm-server-side stape
Last synced: 14 Apr 2025
https://github.com/kununu/mysql-to-bigquery-schema-converter
Python lib and cli tool to convert MySQL schemas into BigQuery schemas
bigquery bigquery-schema converter converter-app converter-library des mysql mysql-schema
Last synced: 15 Apr 2025
https://github.com/janaom/gcp-de-project-uber-etl-pipeline
Technologies used: GCS, Compute Engine, Mage, BigQuery, Looker, Python
Last synced: 12 Apr 2025
https://github.com/snithish/tpc-di_benchmark
Benchmark for Airflow with BigQuery as the Data Warehouse using TPC-DI
airflow benchmark bigquery tpc-di
Last synced: 07 May 2025
https://github.com/wayfair-incubator/bigquery-buildkite-plugin
Buildkite Plugin to create/update structures on BigQuery
bigquery buildkite buildkite-plugin gbq google google-bigquery google-cloud-platform hacktoberfest
Last synced: 21 Aug 2025
https://github.com/tufin/espresso
A framework for writing testable BigQuery queries
Last synced: 04 Oct 2025
https://github.com/nathadriele/data-engineering-zoomcamp
The Data Engineering Zoomcamp covers essential skills in containerization, workflow orchestration, data warehousing, analytics engineering, batch, and streaming processing. It includes tools like Docker, Terraform, BigQuery, dbt, Spark, Kafka, Kestra, Postgres, Google Data Studio, and Metabase.
bigquery containerization data-engineering dbt docker google-data-studio kafka kestra metabase orchestration postgresql spark streaming terraform warehousing workflow-automation
Last synced: 21 Mar 2025
https://github.com/ExpediaGroup/circus-train-bigquery
Circus Train plugin which replicates BigQuery tables to Hive
bigquery circus-train google-cloud hive replication
Last synced: 24 Mar 2025
https://github.com/edgarrmondragon/meltano-dogfood
Personal dogfood Meltano project
bigquery dbt dogfood elt evidence-dev meltano
Last synced: 14 Apr 2025
https://github.com/urish/nn-function-generator
Experimenting with automatic generation of TS function bodies using ANN models
bigquery tensorflow tsquery typescript
Last synced: 08 Sep 2025
https://github.com/wintermi/bqe-dataform
A Dataform project which aggregates BigQuery system metadata for the purpose of analysing the slot usage and storage within an organization by project.
bigquery dataform google-cloud google-cloud-platform
Last synced: 05 May 2025
https://github.com/ksalama/data2cooc2emb2ann
Learning embeddings from item co-occurrence statistics, and building an approx. nearest neighbour index
apache-beam bigquery dataflow embeddings machine-learning python3 tensorflow
Last synced: 13 Jun 2025
https://github.com/gbotemib/gharchive_de_project
An end-to-end data engineering project on GitHub activity data
bigquery dbtcloud docker gcp gcs-bucket looker-studio prefect spark terraform
Last synced: 27 Feb 2025
https://github.com/tomayac/http-archive-progressive-web-apps
Different approaches to estimate the number of Progressive Web Apps in the HTTP Archive
Last synced: 15 Apr 2025
https://github.com/sigpwned/litecene
A simple cross-data store full-text search language for Java 8+
bigquery full-text-search java query-language search
Last synced: 12 Apr 2025
https://github.com/snithish/tpc-ds_big-query
Scripts to execute TPC-DS on BigQuery
benchmark bigquery tpc-ds-benchmark tpc-ds-queries
Last synced: 07 May 2025