BigQuery
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources without loading it first.
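As a rough illustration of the SQL-plus-client-library workflow described above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package is installed and that Application Default Credentials are configured. The table name is an assumption (a public sample dataset) and the helper names `build_top_names_sql` and `run_query` are hypothetical, chosen only for this example.

```python
import re

# Assumed public sample table; replace with your own project.dataset.table.
DEFAULT_TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_sql(table: str = DEFAULT_TABLE, limit: int = 5) -> str:
    """Return a SQL query for the most common names in `table` (hypothetical helper)."""
    # Table names cannot be bound as query parameters, so validate the
    # identifier before interpolating it into the SQL text.
    if not re.fullmatch(r"[A-Za-z0-9_-]+\.[A-Za-z0-9_]+\.[A-Za-z0-9_]+", table):
        raise ValueError(f"unexpected table identifier: {table!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{table}`\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str) -> list:
    """Execute `sql` on BigQuery and return the rows as a list of dicts."""
    # Imported lazily so the SQL builder above works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up Application Default Credentials
    rows = client.query(sql).result()  # blocks until the query job finishes
    return [dict(row.items()) for row in rows]
```

With credentials in place, `run_query(build_top_names_sql(limit=3))` would submit the query and return the result rows. No servers or clusters need to be provisioned beforehand.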
📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. For the inside story, see the article written by BigQuery’s founding product manager to mark its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-22 00:03:35 UTC
https://github.com/f4rkh4d/drift
sql linter and formatter in rust. 7 dialects, 80 rules, schema-aware. single binary. 50-200x sqlfluff. demo at drift.frkhd.com/play/
bigquery cli database dbt formatter linter lsp migration mysql postgres rust schema-aware snowflake sql sql-linter sqlfluff sqlite static-analysis tsql wasm
Last synced: 21 May 2026
https://github.com/anpandu/ps2bq
Stream-insert GCP Pub/Sub messages into a BigQuery table.
Last synced: 12 Feb 2026
https://github.com/sahilgundu/tier1-uk-bank-fx-streaming-gcp
Sanitized case study — Tier-1 UK bank FX streaming on GCP (Pub/Sub → Dataflow → BigQuery, Composer, VPC-SC/CMEK). Patterns only; no client code/data.
architecture bigquery case-study data-engineering dataflow gcp mermaid pubsub streaming
Last synced: 10 Jun 2026
https://github.com/v-cth/database_audit
Audit your data quality in seconds
bigquery database dataengineering snowflake
Last synced: 20 Nov 2025
https://github.com/sahilgundu/tier1-swiss-bank-regulatory-reporting-lakehouse-gcp
GCP-based Regulatory Reporting Lakehouse for a Tier-1 Swiss Bank (Simulated Case Study). Documentation-only repo illustrating a cloud-native data lakehouse architecture for regulatory reporting on Google Cloud Platform (GCS + BigQuery + Dataflow + Composer). Includes ADRs, runbooks, and compliance data contracts.
adr bfsi bigquery composer data-engineering data-pipeline dataflow gcp lakehouse pubsub regulatory-reporting runbook
Last synced: 16 May 2026
https://github.com/markjamesbutler/dbt-fundamentals-bigquery
Implementation of dbt fundamentals training course material using BigQuery.
bigquery dbt dbt-fundamentals fundamentals jinja2 practice-tasks sql
Last synced: 27 May 2026
https://github.com/oleksiilatypov/google_cloud
AI & Data, Google Cloud Skills Boost
bigquery document-ai ml vertexai
Last synced: 23 Apr 2026
https://github.com/mdornseif/datastore-to-bigquery
The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.
backup bigquery bigquery-backup cloud datastore google
Last synced: 02 Jan 2026
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 01 May 2026
https://github.com/vidyadnina/cyclistic-sql-tableau-project
Trip data analysis for a bike-sharing service company using SQL and Tableau.
bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql
Last synced: 02 Jan 2026
https://github.com/pratshrestha/cochin-traders---sql--sales-analysis
Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include
bigquery customer-engagement employee-performance inventory-management sales-trends sql
Last synced: 02 Jan 2026
https://github.com/moeabbas6/bq_data_loader
A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.
Last synced: 24 Mar 2025
https://github.com/wooyakob/music-recommendation-engine
Using Gemini API to generate personalized music recommendations.
ai bigquery gemini-api google-cloud-platform
Last synced: 23 Mar 2025
https://github.com/gabrieladados/analise-ecommerce
SQL Analysis for E-commerce: Growth Strategies to Boost Sales
bigquery data-analysis ecommerce sql
Last synced: 31 Mar 2025
https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql
Used BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.
bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql
Last synced: 19 May 2026
https://github.com/bsrikanth24/etl-pipeline-project-sales
This project implements a pipeline ETL to process fictitious sales data.
bigquery pandas-dataframe python
Last synced: 03 May 2026
https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql
Used BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.
bigquery database mysql-server sql
Last synced: 11 Apr 2026
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 05 Jan 2026
https://github.com/lucashomuniz/Project-04
STATISTICAL ANALYSIS FOR DEMAND PLANNING IN POWERBI
bigquery data-analysis data-structures data-visualization database google-cloud-platform powerbi powerbi-visuals sql sql-query
Last synced: 11 Oct 2025
https://github.com/hardik-agrl/ecommerce-user-conversion-dashboard
This project analyzes user behavior in an ecommerce setting using BigQuery and Looker Studio. It tracks offer notification clicks, session activity, and purchase completions to identify how many users convert within the same session after engaging with an offer.
bigquery dashboard-visualization looker-studio sql
Last synced: 02 Jul 2025
https://github.com/rolandbende/python-bigquery-migrations
The Python bigquery-migrations package makes it easy to create and manipulate BigQuery databases.
bigquery google migration-automation migration-scripts migration-tool migrations python
Last synced: 02 Feb 2026
https://github.com/thenazar9/user-behavior-email-campaign-analysis-sql
Analysis of user behavior and email campaign performance using BigQuery and Looker Studio, focusing on account creation trends, email engagement, and user segmentation.
analytics bigquery data-analysis data-visualization etl looker-studio sql structured-query-language
Last synced: 16 Oct 2025
https://github.com/ruohansong/sql_practice
SQL queries from the LeetCode SQL 50 set and some BigQuery practice exercises
Last synced: 02 Jul 2025
https://github.com/siobhan-doherty/ag_challenge
airflow bigquery csv-files data-engineering etl google-cloud-platform python sql
Last synced: 01 May 2026
https://github.com/francois-lenne/elt-mp4-quiberon
The goal of this project is to retrieve video from the municipality of Quiberon and detect whether or not a person appears in it.
bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data
Last synced: 11 Apr 2026
https://github.com/antahantah/simple-airflow-with-postgre
Capstone Project 3: Data Pipeline with Apache Airflow - From PostgreSQL to Google BigQuery
airflow bigquery capstone capstone-project data-engineering docker extract-load postgresql python
Last synced: 14 Apr 2026
https://github.com/hckhanh/pg2bigquery
A CLI tool to convert queries from PostgreSQL to BigQuery
big bigquery google pg pgsql postgres postgres-tool postgresql postgresql-database postgressql query query-parser querybuilder sql sql-toolkit sql-tools tool toolbox toolkit utility
Last synced: 06 Jan 2026
https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
Last synced: 25 Apr 2026
https://github.com/oguzgn/data-science-for-business-imp
A case study for business improvement
ab-testing bigquery data-science data-visualization debugging looker marketing-analytics sheets
Last synced: 15 Mar 2025
https://github.com/zenklinov/correlation-nybikers-with-weather-using-bigquery
Last synced: 02 Jan 2026
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 02 Jan 2026
https://github.com/andre-gitdev/stocks-functions
This project is for EDA related to stock trading.
alpaca alpaca-trading-api bigquery google-cloud portfolio-optimization robinhood-api robinhood-portfolio stock-analysis stock-data stock-price-prediction stocks-api stocks-trading
Last synced: 02 Jan 2026
https://github.com/robinnoiret/internship-zendesk_reporting__migration
This project involves developing a Python script to import CSV exports from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.
bigquery connector export-csvfile json zendesk
Last synced: 16 Mar 2025
https://github.com/janaom/gcp-de-project-elt-dbt-cloud-run-composer-dataplex
A Google Cloud ELT pipeline project with dbt, Cloud Run, Composer, BigQuery, and Dataplex. Includes event-driven automation, data quality checks, and complete setup guidance.
bigquery cloud-run cloud-run-functions dataplex dbt elt-pipeline google-cloud
Last synced: 28 May 2026
https://github.com/data-analitycs-pos-tech-fiap/covid19_ibge
This repository contains the data analysis carried out for the Tech Challenge of FIAP's postgraduate program in Data Analytics. The project aims to explore and interpret relevant data on the behavior of the population during the COVID-19 pandemic, using IBGE's PNAD-COVID19 dataset. The analysis is intended to support decision-making
bigquery excel google-bigquery google-cloud-platform powerbi python sql
Last synced: 08 Apr 2025
https://github.com/datasherlock/bqml-demo
BigQuery ML demo showcasing data prep, logistic regression training, evaluation, and explainable predictions for account fraud detection on Cloud Spanner data.
bigdata bigquery bigquery-ml sql
Last synced: 21 Jun 2026
https://github.com/jugnuarora/datatalks-de-zoomcamp-project
Data pipeline for course enrollments in France. Every month, providers report the enrollments in their programs. The idea is to collect the course listings and the monthly enrollments, then examine enrollment trends and compare training providers across courses.
bigquery data-analytics data-engineering data-ingestion-and-infrastructure data-pipeline dbt gcp gcs kestra-workflows looker-studio
Last synced: 31 Mar 2025
https://github.com/ngangawairimu/sales-analysis-and-customer-insights
This project features SQL queries for detailed customer and sales analysis: Customer Analysis and Sales Reporting
bigquery bigquery-dataset excel sql
Last synced: 23 Mar 2025
https://github.com/zenklinov/predicting_visitor_purchases_using_google_cloud_bigquery-
This repository contains a project for predicting Apple Inc.'s stock prices using a Long Short-Term Memory (LSTM) neural network. The model is optimized using Keras Tuner, a library for hyperparameter tuning in deep learning models. The dataset used for training and testing the model is sourced from Yahoo Finance.
bigquery bigquery-ml classification
Last synced: 11 Feb 2026
https://github.com/jasontanx/mas-international-arrivals
Code repository about international arrivals into Malaysia
bigquery data-analytics data-engineering etl-pipeline international-arrivals
Last synced: 02 Jan 2026
https://github.com/quipper/send-ci-result-to-bigquery-action
Send test results to BigQuery in GitHub Actions
bigquery github-actions google-bigquery junit-xml
Last synced: 01 May 2026
https://github.com/tuancamtbtx/dataform-utils
BigQuery Dataform JavaScript Utils Package - Support Ads, Query Common, ...
bigquery dataform datawarehouse
Last synced: 15 May 2025
https://github.com/kmohamedalie/bigquery-intro
Coursera BigQuery Introduction using Covid19 dataset
bigquery coursera covid-19 datavisualization looker-studio sql
Last synced: 02 Jan 2026
https://github.com/jwcheonx/export-bq-tables
A Bash script for exporting all tables from a BigQuery dataset to local storage
bash bigquery cloudshell tools
Last synced: 12 Apr 2026
https://github.com/owox/sgtm-owox-ga4-bigquery
OWOX BI Streaming is an advanced tracking solution for getting the most from the Google Analytics 4 setup already installed on your website
Last synced: 07 Apr 2025
https://github.com/andrii04/ga4-gcs-to-bigquery-etl
Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.
automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql
Last synced: 18 May 2026
https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq
Cloud Function that subscribes to a given topic and inserts each message into a BigQuery table.
bigquery cloud-functions pubsub terraform-module
Last synced: 16 May 2026
https://github.com/lu-sketch/google-big-query-sql---credit-risk-analysis
BigQuery SQL Credit Risk Analysis
big-data bigquery credit-risk sql
Last synced: 25 May 2026
https://github.com/itsubaki/hermes-lambda
Transfers AWS cost data to BigQuery
Last synced: 01 Apr 2025
https://github.com/ansh-info/stockpulse
Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.
api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit
Last synced: 10 Apr 2026
https://github.com/ahmadluay9/hotel-mcp-bigquery-postgresql
AI-powered Hotel Management Assistant built with Streamlit and Google's ADK. Connects to PostgreSQL for operations and BigQuery for analytics.
adk adk-python agentic-ai bigquery mcp postgresql streamlit
Last synced: 10 Apr 2026
https://github.com/greatwoman23/car_insurance_analysis
The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.
bigquery data data-science dataanalytics insurance-claims looker-studio tableau
Last synced: 03 Feb 2026
https://github.com/isaacmg/mimic_iv_bq_queries
Queries needed to recreate time series features for model training
Last synced: 14 Mar 2025
https://github.com/noridj4/langchain-runnables
🚀 Explore LangChain Runnables to easily compose and manage components for efficient and flexible execution within the LangChain ecosystem.
ai bigquery chatbot chatgpt chroma docker fastapi langchain llama llm llm-agents local-first offline-first openai prompt-engineering rag text-to-sql vector-database
Last synced: 07 Apr 2026
https://github.com/misaober/datove_inzenyrstvi_projekt
Data Engineering in Practice course (Czechitas, 36 hours): building a custom project on real data, including scripts for creating the L1, L2, and L3 layers, a data model, and the project's architecture design.
Last synced: 14 May 2026
https://github.com/iht/bigquery-dataflow-cdc-example
A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates
apache-beam bigquery cdc dataflow google-cloud pubsub
Last synced: 04 Jan 2026
https://github.com/sangnandar/insert-unique-record
A Cloud Functions script that inserts only unique records into BigQuery.
bigquery digital-marketing-analytics google-cloud-functions
Last synced: 03 May 2026
https://github.com/davelester/gharchive-bigquery-examples
Examples Using BigQuery to Analyze GH Archive Data
Last synced: 27 Mar 2025
https://github.com/data-platform-hq/terraform-google-bigquery
Terraform module for managing Google BigQuery datasets
bigquery google-cloud terraform-module
Last synced: 27 Aug 2025
https://github.com/rifqyirfanto21/ecommerce-data-pipeline-airflow-gcp-dbt
End-to-end automated data pipeline using Python, PostgreSQL, Airflow, GCS, BigQuery, and dbt — built to simulate a production-grade analytics workflow. Completed as part of Purwadhika’s Module 3 Data Engineering Program
airflow bigquery dataengineering dbt googlecloudplatform python
Last synced: 21 Apr 2026
https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women
This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.
bigquery excel powerbi spreadsheet sql
Last synced: 20 Aug 2025
https://github.com/armahdavi/bigdata_pyspark_sales_analytics
A summary of my big data code in Python (PySpark) for analyzing retail and Walmart superstore sales data to draw sales insights
big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning
Last synced: 12 Apr 2026
https://github.com/zkan/data-engineering-on-gcp
Data Engineering on Google Cloud Platform (GCP)
bigquery data-engineering data-lake data-pipeline data-warehouse gcs google-cloud-platform machine-learning
Last synced: 19 Aug 2025
https://github.com/pittica/google-bigquery-helpers
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform pittica
Last synced: 06 Jan 2026
https://github.com/chiamakaukwuoma/portfolio
This repository contains various projects I've been privileged to work on outside of work.
aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau
Last synced: 10 Apr 2026
https://github.com/nghiant3110/b2b_crm_3
This is a DA project based on the B2B Sales CRM dataset from Maven Analytics
bigquery google-sheets looker-studio sql
Last synced: 18 Aug 2025
https://github.com/simoun-asmar/clinipet_project
BigQuery
bigquery looker-studio lookerstudio sql
Last synced: 03 Mar 2025