BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so data held in external sources can be queried without first loading it into BigQuery.
📖 A highly rated, comprehensive reference on BigQuery is the book « Google BigQuery: The Definitive Guide ». Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-15 00:03:38 UTC
https://github.com/keminghe/medo
An automated, cloud-agnostic platform that unifies enterprise data silos into actionable insights while optimizing cross-cloud costs and compliance.
Last synced: 31 Jul 2025
https://github.com/wintermi/tmdb-dataform
An example Dataform project to load and transform the publicly available dataset from The Movie Database into a format which could be imported into Vertex AI Search for Media, allowing you to build a search engine for movies.
bigquery dataform google-cloud google-cloud-platform
Last synced: 07 Feb 2026
https://github.com/googleapis/google-cloud-cpp-bigquery
C++ Client Library for Google Cloud BigQuery
bigquery cloud cpp cpp17 google google-cloud-bigquery google-cloud-platform
Last synced: 23 Aug 2025
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 12 Apr 2025
https://github.com/brews/bucket2bq
Create an inventory of the objects in a GCS bucket, with metadata, and upload it to BigQuery
bigquery gcp golang google-cloud-storage
Last synced: 24 Jan 2026
https://github.com/tknishh/case-study-ueats-ghub-sql
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assignment bigquery case-study-analysis loop product-analyst sql
Last synced: 23 Aug 2025
https://github.com/mehmoodulhaq570/bigquery_machine_learning_project
Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.
bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python
Last synced: 07 May 2026
https://github.com/evry-ace/statsbot
Slack Bot to forward message statistics to BigQuery
bigquery slack slack-bot slackbot
Last synced: 31 Oct 2025
https://github.com/takegue/bigquery-porter
BigQuery Deployment and Metadata Management tool
Last synced: 12 Feb 2026
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 25 Aug 2025
https://github.com/prodriguezdefino/apache-beam-streaming-tests
A testing suite for Dataflow streaming pipelines
aggregation bigquery bigtable dataflow gcp kafka pubsub pubsublite streaming
Last synced: 27 Oct 2025
https://github.com/rezuankassim/bqanalytic
Laravel package for using analytics data imported into BigQuery from Firebase Analytics
bigquery firebase-analytics laravel
Last synced: 01 Feb 2026
https://github.com/vertexclique/olayufku
Schema Registry for BigQuery
bigquery bigquery-schema migration schema-migrations schema-registry
Last synced: 17 Apr 2026
https://github.com/jugnuarora/france-courses-enrollments
A data pipeline for course enrollments in France. Every month the providers report the enrollments in their programs. The idea is to collect the listed courses and their enrollments each month, then look at the enrollment trend and compare training providers across different courses.
bigquery data-analytics data-engineering data-ingestion-and-infrastructure data-pipeline dbt gcp gcs kestra-workflows looker-studio
Last synced: 27 Apr 2026
https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage
Export data from Google BigQuery through Google Storage to Yandex Cloud Object Storage.
bigquery object-storage python3 yandex-cloud yandexcloud
Last synced: 05 May 2026
https://github.com/lpraat/inbq
A library for parsing BigQuery queries and extracting schema-aware, column-level lineage.
bigquery data-lineage parser sql
Last synced: 26 Apr 2026
https://github.com/jasontanx/gsheet-to-bq-ingestion
Data ingestion from Google Sheet to BigQuery
bigquery data-engineering data-ingestion gsheets
Last synced: 02 May 2026
https://github.com/ovotech/bigquery-metrics-exporter
A Golang application to export table level metrics from BigQuery into Datadog.
Last synced: 06 Feb 2026
https://github.com/leandronasx/agro-data
Final project for the data analyst and dashboard training course at SoulCode Academy.
bigquery data-analysis gcp looker pandas powerbi python
Last synced: 18 Jul 2025
https://github.com/t3n/gtmetrix-bq
A script that runs browser tests of specified URLs through GTmetrix and saves the metrics in BigQuery.
Last synced: 15 May 2026
https://github.com/jasontanx/data-engineer-project-1
End-to-end data engineering project
airline bigquery data-engineering etl-pipeline looker-studio mage-ai
Last synced: 27 Apr 2026
https://github.com/oalles/avro-schema-from-bq-table
Get an Avro schema from a Google Cloud BigQuery table
avro avro-schema bigquery google-cloud java spring spring-boot
Last synced: 04 May 2026
https://github.com/jehiah/socrata_to_bigquery
A tool to copy public data to BigQuery
Last synced: 05 Mar 2026
https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl
A Google Cloud Function that performs an ETL job: it collects Stripe API data and transforms it so that it can be imported into BigQuery.
bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api
Last synced: 18 Apr 2026
https://github.com/triglav-dataflow/triglav-agent-bigquery
BigQuery agent for Triglav, data-driven workflow tool
Last synced: 14 Feb 2026
https://github.com/taquynhnga2001/proptech-dagster
Build an ELT pipeline with Dagster and dbt that schedules the loading of HDB resale transactions in Singapore into a Google BigQuery data warehouse, then create a Power BI dashboard for exploring insights.
bigquery dagster data-integration data-orchestration data-warehouse dbt elt etl powerbi python
Last synced: 14 Feb 2026
https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game
This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.
ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql
Last synced: 19 May 2026
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 28 Apr 2026
https://github.com/loinguyen3108/sportify-music-analysis
Engineered the streaming crawler pipeline using Kafka to extract, transform, and load Spotify data into PostgreSQL and ClickHouse for real-time analytics. Additionally, developed an automated batching pipeline using Airflow and Spark to efficiently ETL crawled data into BigQuery.
airflow bigquery clickhouse kafka pyspark spotify
Last synced: 09 Apr 2025
https://github.com/tosh2230/pubsub-dataflow-bigquery
Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.
bigquery dataflow gcp google-cloud google-cloud-platform pubsub
Last synced: 15 May 2026
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 15 May 2026
https://github.com/romange/puma
BigQuery-like engine for processing structured JSON-like records
Last synced: 18 May 2026
https://github.com/varun-khorgade/weatherflow-etl-data-pipeline
ETL pipeline to fetch, clean, and load weather datasets for structured analysis.
bigquery data-engineering etl-pipeline pandas postresql psycopg2 sql
Last synced: 16 May 2026
https://github.com/nghiant3110/google_analytic_4
This is a data analysis (DA) project based on the GA4 sample dataset in BigQuery
bigquery google-analytics looker-studio sql
Last synced: 11 Apr 2025
https://github.com/Miguelapp10/ETL_OperadorLogistico
Extracts data from the SimpliRoute, AndesExpress, and Urbano APIs for a specific date range and processes it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 11 Jul 2025
https://github.com/metrics-pli/bigquery-export
Exports collected metrics to Google BigQuery
bigquery datastudio lighthouse metrics metrics-pli performance pupeteer
Last synced: 16 May 2026
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 17 Feb 2026
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 17 May 2026
https://github.com/hyangminj/ddl2data
Turn any SQL schema into realistic test data — instantly
bigquery cli data-engineering date-testing dynamodb etl faker postgresql python rds synthetic-data test-data-generation
Last synced: 18 May 2026
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 10 Mar 2025
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 12 Jul 2025
https://github.com/thedumbterminal/bigquery-js-udf-example
Example of JavaScript UDFs for use with Google BigQuery
Last synced: 18 May 2026
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 11 Jul 2025
https://github.com/danlessa/meta_qa
A practical one-liner metalanguage for describing common sense in a machine-friendly way.
Last synced: 03 May 2026
https://github.com/landerox/cloud-landerox-data
Reference architecture baseline for GCP data platforms (Apache Beam, BigQuery, Cloud Functions, Pub/Sub). Hybrid warehouse/lakehouse with batch + streaming, Medallion layering. Consumed by private runtime repos.
apache-beam batch-processing bigquery cloud-functions cloud-storage data-engineering data-platform dataform gcp google-cloud-dataflow iceberg lakehouse medallion-architecture opentelemetry pubsub python reference-architecture slsa streaming supply-chain-security
Last synced: 21 May 2026
https://github.com/landerox/cloud-landerox-infra
GCP Terraform baseline and reference architecture — multi-environment CI/CD, defense-in-depth (validations + Conftest + Sigstore plan attestation), Workload Identity Federation, BigQuery medallion, recipes per module. OpenSSF Best Practices silver.
artifact-registry bigquery checkov cicd cloud-run cloud-scheduler conftest devsecops gcp iam infrastructure-as-code openssf reference-architecture secret-manager sigstore slsa terraform terraform-modules workload-identity-federation
Last synced: 21 May 2026
https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform
This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.
bigquery data data-analysis data-modeling live-streaming sql
Last synced: 23 Jun 2025
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 02 Jan 2026
https://github.com/shrawans007/google_cyclistic_2023
Google Data Analytics Capstone Case Study (SQL and Tableau)
big-query bigquery coursera-assignment cyclistic cyclistic-bike-share-analysis-case-study cyclistic-bikshare data-analysis data-analysis-project data-analytics data-cleaning data-combination data-exploration data-science google-data-analytics sql tableau tableau-dashboard tableau-public
Last synced: 25 Feb 2025
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 12 Sep 2025
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 14 May 2025
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 14 May 2025
https://github.com/loozhengyuan/datafeeds-sql
SQL code snippets for deriving dimensions and metrics from Adobe Analytics Data Feeds
adobe adobe-analytics adobe-analytics-data-feeds bigquery sql
Last synced: 03 Feb 2026
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines with Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 03 Jan 2026
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 03 Jan 2026
https://github.com/jjviscomi/bqemulator
Local emulator for Google BigQuery. DuckDB-backed, SQLGlot-powered. Drop-in replacement for the real service in dev, CI, and offline replicas.
bigquery bq-cli duckdb emulator fastapi pytest-plugin python sqlglot testing
Last synced: 23 May 2026
https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart
Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.
bigquery dbt e-commerce quickstarts
Last synced: 30 Jul 2025
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 01 Aug 2025
https://github.com/datacody/dbt-jaffle-shop
A hands-on project built to deepen understanding of dbt modeling, testing, and documentation. Based on the Jaffle Shop dataset, the project showcases best practices in transforming and validating source data for business analytics using the modern data stack.
analytics bigquery data-eng data-modeling dbt etl-pipeline sql transformation
Last synced: 02 Aug 2025
https://github.com/hariprasath-v/mh_google_cloud_bigquery_ltv_prediction_challenge
Build a model that can predict customers' Long Term Value (LTV).
bigquery colab-notebook klib machine-learning matplotlib numpy pandas python python3 seaborn
Last synced: 09 Apr 2026
https://github.com/fabioba/netflix-analytics
Analyse personal Netflix usage
airflow astronomer bigquery google-cloud-platform netflix tableau
Last synced: 15 May 2026
https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin
A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.
big-data-analytics bigquery datawarehouse googlelooker sql
Last synced: 03 Jan 2026
https://github.com/wintermi/bqdo
bqdo is a CLI for executing BigQuery SQL as part of a pipeline.
bigquery google-cloud google-cloud-platform
Last synced: 18 May 2026
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 21 Aug 2025
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that uses BigQuery to determine the number of hours worked between two dates, taking business days and specified working hours into account
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 24 Aug 2025
https://github.com/kusmn/tableau-googletrends-canada-analytics
Visualizes Canada’s top Google search terms (2020–2024) using BigQuery and Tableau to explore regional and temporal trends.
bigquery data-visualization tableau
Last synced: 26 Aug 2025
https://github.com/prajakta1321/san-francisco-bike-share-analysis-using-bigquery-and-lookerstudio
This project describes the analysis of the San Francisco dataset using SQL in BigQuery and Looker Studio
Last synced: 13 Jul 2025
https://github.com/getconversio/go-utils
A collection of utility functionality for Go
amqp bigquery geoip golang logrus openexchangerates utilities
Last synced: 19 Apr 2026
https://github.com/ritu456286/smartstockai
SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.
bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api
Last synced: 18 Apr 2026
https://github.com/tomgorb/project-template-for-production
Project template to (help) put a machine/deep learning algorithm into production
Last synced: 15 May 2026
https://github.com/antjes88/asset-valuation-ingestion
The solution involves the DW Ingestion Layer, where data from CSV files is loaded into BigQuery
Last synced: 18 Jan 2026
https://github.com/antjes88/exchange-rates-ingestion
Repo to source Exchange Rates
bigquery cloudfunctions gcp python terraform
Last synced: 18 Jan 2026
https://github.com/thunchanokbow/audiblebook-revenue
Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand
apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3
Last synced: 11 Apr 2026
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 24 Feb 2026
https://github.com/pvoo/bigquery-mcp
Practical MCP server for quickly navigating BigQuery datasets and tables. Suitable for larger projects with many datasets/tables, optimized to keep LLM context small while staying fast and safe.
bigquery claude-code cursor fastmcp mcp mcp-server mcp-servers mcp-tools windsurf
Last synced: 11 Apr 2026
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 19 Jan 2026
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 29 Jan 2026
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 02 Jan 2026
https://github.com/dreamdata-io/free-email-providers
A list of free email domain providers so that you can easily spot business users from consumers!
bigquery dataset email-parsing email-verification
Last synced: 18 Jan 2026
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 18 Apr 2026
https://github.com/chelseammatta/nopd-cad-data-analysis
Analysis of 911 call data from New Orleans' 3rd & 4th police districts (2019-2022) using BigQuery
911-calls 911-data bigquery cad-data crime-analysis data-analysis emergency-response new-orleans public-safety sql
Last synced: 01 Jul 2025
https://github.com/oguzgn/fully-automated-performance-marketing-dashboard
This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.
bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql
Last synced: 24 Mar 2025
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 03 Oct 2025
https://github.com/qualabs/cmcd-toolkit
Tools to collect and analyze CMCD v2 locally and in different cloud providers.
big-data bigquery cmcd dash dashjs gcloud hls hls-live-streaming influxdb mpeg-dash open-source qoe-measurements shaka-player video video-player video-streaming
Last synced: 02 Jul 2025
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 12 Apr 2026
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 11 May 2025
https://github.com/secmon-lab/overseer
A security log analysis tool for data lake with combination of SQL query and Rego policy
bigquery detection open-policy-agent security-monitoring sql
Last synced: 11 Mar 2026
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 03 May 2026
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 04 Feb 2026