BigQuery
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. BigQuery also supports federated queries, which read data directly from external sources.
📖 A highly rated, canonical book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-22 00:03:35 UTC
https://github.com/vaxdata22/city-weather-and-s3file-rds-s3-bigquery-etl-by-airflow-on-ec2
This is my third AWS Cloud ETL project. The pipeline is orchestrated with Apache Airflow on AWS EC2. It demonstrates how to build an ETL pipeline that extracts data into a database in parallel with a load into the same database, joins the tables, copies the joined data to S3, and finally copies the S3 file to the BigQuery data warehouse.
apache-airflow aws-ec2 aws-rds-postgres aws-s3 bigquery business-intelligence dags data-warehousing etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 21 May 2026
https://github.com/night-fury-me/real-time-vehicle-data-processing
A repository that contains an implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.
bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing
Last synced: 02 Jan 2026
https://github.com/hackolade/bigquery
Hackolade (https://hackolade.com) plugin for BigQuery
bigquery bigquery-schema data-modeling data-models entity-relationship-diagram er-diagram nosql nosql-databases schema-design
Last synced: 12 Feb 2026
https://github.com/azapeti/bigquery-python-bash-automation
Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. In this repository I demonstrate how to run BigQuery queries in Python and automate them with bash and crontab to collect historical data.
analytics automation bash bigquery cronjob crontab ga4 python python3
Last synced: 02 Jan 2026
https://github.com/jakwakwa/risk-management-system
modern technologies such as machine learning and AI for our internal risk team. These tools are expected to streamline operations, quickly highlight anomalies, and support more informed decision-making
bigquery bun ml nexjs16 rag react risk-analysis shadcn-ui temporal typescript vertex-ai
Last synced: 22 Jan 2026
https://github.com/nlgtuankiet/bq-noti
BigQuery notification
bigquery bq notification notifier
Last synced: 02 Jan 2026
https://github.com/sintef/bigquery-postgresql-wire-proxy
A PostgreSQL wire protocol proxy server for BigQuery.
Last synced: 05 May 2026
https://github.com/shvetsihorr/sql-projects
SQL and Google BigQuery-Portfolio Projects
azuredatastudio bigquery mssql postgresql sql
Last synced: 15 Mar 2026
https://github.com/abdullahasghar/sql
The repo includes all projects and assessments I have completed with SQL. IDEs used: MS SQL Server, Google BigQuery.
Last synced: 15 Mar 2026
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 02 Jan 2026
https://github.com/kartikeya443/automated-data-pipeline-gcp
This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.
bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql
Last synced: 03 Feb 2026
https://github.com/swatisinghit/e-commerce-trend-analysis-for-target
An exploratory and in-depth study of the E-Commerce sales data for a Brazilian store using SQL.
bigquery data-analysis mysql sql
Last synced: 19 May 2026
https://github.com/chukwuemekaaham/ny_taxi_rides
Analytics engineering using dbt and Google Cloud BigQuery
analytics-engineering bigquery dbt github
Last synced: 19 May 2026
https://github.com/goosethedev/de-zoomcamp-2025
Homeworks for the DataTalksClub's Data Engineering Zoomcamp 2025.
bigquery data-engineering kestra python terraform
Last synced: 29 Apr 2026
https://github.com/knands42/data-ingestion
Data ingestion project to evaluate my Kotlin skills using concurrency
bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow
Last synced: 23 Jan 2026
https://github.com/cyber-programmer/web-traffic-analytics-ml-model
This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, exploratory data analysis.
bigquery logistic-regression machine-learning
Last synced: 06 Feb 2026
https://github.com/jtwebman/bigquery-local
Node.js + DuckDB local emulator for the Google BigQuery REST API. Drop-in for testing, CI, and local dev — with working PATCH.
bigquery duckdb emulator local-development nodejs sql testing typescript
Last synced: 19 May 2026
https://github.com/epomatti/gcp-bigquery
Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream
bigquery cloud-sql datastream gcp
Last synced: 01 Jun 2026
https://github.com/theng23/e-commerce-website-performance-analysis-sql
Using BigQuery with the Google Analytics dataset to analyze an e-commerce website
Last synced: 24 Oct 2025
https://github.com/xennis/particulate-matter-sensor-storage
Store the particulate matter data from a luftdaten.info sensor in BigQuery
bigquery cloud-function luftdaten particulate-matter sensor-data
Last synced: 12 May 2025
https://github.com/manesioz/airflow-without-code
Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"
airflow airflow-plugin bigquery python sql
Last synced: 18 Apr 2026
https://github.com/jasontanx/terraform-practice
Creating datasets and tables in Google BigQuery via Terraform
bigquery iac-terraform infrastructure-as-code terraform
Last synced: 18 May 2026
https://github.com/khangtran85/user-behavior-analysis-for-ecommerce
A SQL project analyzing eCommerce data to uncover insights on traffic, customer behavior, and purchasing patterns. Covers key metrics like visits, transactions, and conversion rates, providing data-driven support for optimizing revenue and user experience.
bigquery ga-session google-analytics-sample publicdata sql
Last synced: 21 Feb 2026
https://github.com/larisanti/transaction-ml
This project demonstrates a sequence of BigQuery ML queries to build and evaluate a logistic regression model that predicts customer transactions based on website traffic data from Google Analytics.
Last synced: 11 May 2025
https://github.com/yohanesnuwara/bigquery-sodirchat
Chat interface to Sodir Norwegian oil database using Google BigQuery and Gemini
bigquery retrieval vector-search
Last synced: 18 May 2026
https://github.com/hcrlau/cyclistic-bike-share-analysis
Google Data Analytics Capstone Project
bigquery cyclistic-bike-share-analysis-case-study data-analysis data-visualization sql tableau
Last synced: 05 Apr 2025
https://github.com/fabioba/sales-analytics
This is an exercise provided by ChatGPT about sales data.
airflow bigquery etl-pipeline googlecloudplatform googlecloudstorage
Last synced: 18 May 2026
https://github.com/andrewm4894/gcp-telemetry-example
Simple HTTP endpoint for telemetry data type events in GCP.
bigquery gcp-cloud-functions gcp-storage python terraform
Last synced: 05 May 2026
https://github.com/vbalalian/littlefield
Combined web-scraping, loading, and reporting tool for Littlefield simulation, built for use with Google Cloud Run functions and Google Cloud Scheduler
bigquery cloud-functions extraction google-cloud-platform littlefield-simulation-game loading python reporting sql webscraping
Last synced: 14 May 2026
https://github.com/mohamedkashifuddin/gcp-ecommerce-data-pipeline
An e-commerce data lakehouse implemented on Google Cloud Platform (GCP). This project features an end-to-end data pipeline, from raw data generation via Cloud Functions, layered processing with PySpark on Dataproc, to structured data warehousing in BigQuery. It's fully orchestrated by Apache Airflow, enabling analytics and BI with Metabase.
airflow bigquery cloud-functions data-pipeline dataproc ecommerce gcp metabase pyspark
Last synced: 18 May 2026
https://github.com/flowerinthenight/bqstream
A simple library to help facilitate streaming to BigQuery.
Last synced: 18 May 2026
https://github.com/smohanta23/uber_data-engineering_etl-project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api
Last synced: 01 Jan 2026
https://github.com/robinnoiret/internship-zendesk_reporting_migration
This project involves developing a Python script to import CSV exports from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.
bigquery connector export-csvfile json zendesk
Last synced: 06 May 2026
https://github.com/vidyadnina/other-sql-projects-and-queries
Other SQL projects and queries.
Last synced: 06 Feb 2026
https://github.com/mysto-007/cyclistic-bike-share-analysis
Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)
bigquery data-analysis excel ms-sql-server sql tableau tableau-public
Last synced: 16 Mar 2026
https://github.com/tomgorb/some-data-monitoring
Fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing
airflow2 bigquery gcp minikube
Last synced: 07 Apr 2026
https://github.com/patriciavalentine/loan-data-queries
In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.
asset-finance bigquery loan-data looker-studio
Last synced: 30 Oct 2025
https://github.com/batou9150/google-cortex-quickstart
Quick start with Google Cloud Cortex Framework
bigquery cloud cortex google salesforce sap
Last synced: 11 Sep 2025
https://github.com/sejalmankar1012/product_data_analyst_assessement
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assessment-project bigquery loop product-analyst sql-query
Last synced: 21 Mar 2025
https://github.com/vigneshSs-07/Cloud-BigQuery-and-SQL---The-Interview-Guide
This deals with SQL commands, interview preparation and query questions and solutions in BigQuery
azuresql bigquery gcp sql sql-query sql-server sqlalchemy
Last synced: 09 May 2025
https://github.com/pawel045/big-tech-stocks
ETL project
big-data bigquery dataengineering etl
Last synced: 17 May 2026
https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform
Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.
Last synced: 21 Jun 2025
https://github.com/blandoncj/terraform-bigquery-gcp
Infrastructure as Code (IaC) for Google Cloud BigQuery using Terraform. Automates dataset and table provisioning with best practices for cloud resource management.
automation bigquery gcp terraform
Last synced: 17 May 2026
https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai
An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI
bigquery gcp tensorflow tfx vertex-ai
Last synced: 06 May 2026
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 06 Feb 2026
https://github.com/venugopal9578/youtube-trending-sql-analysis
SQL project analyzing YouTube trending videos in India using Google BigQuery. Includes ranking, aggregation, and channel performance analysis with visuals.
analytics bigquery dataanalysis freelance portfolio sql youtube
Last synced: 17 May 2026
https://github.com/ivanildobarauna/ivanildobarauna
Special Repository to Make README
ai airflow big-data bigquery data-engineering gcp python
Last synced: 02 Feb 2026
https://github.com/suv05/brazilian-ecommerce-data-analysis
End-to-End Big Data Analytics on Google Cloud Platform
bigquery dataproc kaggle-dataset spark
Last synced: 15 Apr 2026
https://github.com/plishka/blockchain_analysis
Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)
bigquery blockchain data-cleaning scraping-websites sql tableau
Last synced: 25 Feb 2026
https://github.com/lisabensoussan/bigdataminig_finalassignment
This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.
bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization
Last synced: 07 Feb 2026
https://github.com/hitthecodelabs/bigquery_ml
Jupyter notebooks that utilize Google BigQuery's machine learning capabilities.
Last synced: 15 Apr 2026
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 15 Apr 2026
https://github.com/spacepatcher/google-workspace-gmail-collector
👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka
bigquery gmail google-workspace security soc
Last synced: 31 Jan 2026
https://github.com/refrainit/zangetsu-data
A library that supports accessing PostgreSQL, BigQuery, and Google Sheets and retrieving data from them
bigquery pip postgresql python spreadsheet zangetsu
Last synced: 06 May 2026
https://github.com/karencofre/marketing-segmentacion-en-powerbi
Hypothesis testing project in Power BI and Python
bigquery google-colab powerbi python sql statsmodels
Last synced: 31 Jan 2026
https://github.com/kahfisa/business-performance-analytics-kimia-farma
Dashboard Performance Analytics Kimia Farma
Last synced: 10 Sep 2025
https://github.com/thanhloc81/sql-project-bicycles-practise
✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules
adventureworks bicycle bigquery google-cloud sql
Last synced: 01 Feb 2026
https://github.com/hardik-agrl/youtube_trending_pipeline
The project fetches trending YouTube videos using the YouTube Data API, stores the data in Google Cloud Storage, and loads it into a BigQuery table for analysis.
bigquery google-cloud-storage python python-dotenv youtube-api-v3
Last synced: 15 Apr 2026
https://github.com/cmmasaba/ms-ads-integration
Extract ads performance data from Microsoft Ads platform and store in BigQuery
Last synced: 15 Apr 2026
https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study
This capstone project was completed as part of the Google Data Analytics Professional Certificate course.
Last synced: 01 Feb 2026
https://github.com/minhajuddin2510/bigquery_alerts
In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this
alerting bigquery cloudfunctions monitoring-tool slack
Last synced: 06 May 2026
https://github.com/lambdamusic/dimschema
CLI to retrieve SQL schema information about the Dimensions on Google BigQuery dataset.
bigquery dimensions python scholarly-metadata
Last synced: 12 May 2026
https://github.com/paty-oliveira/carris-data-pipeline
Repository for Extraction, Loading and Transformation of Carris data.
apache-airflow bigquery docker docker-compose elt-pipeline
Last synced: 25 Feb 2026
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 17 May 2026
https://github.com/victorelexpe/bq-schema-sync
bigquery gcp google-cloud python schema sync
Last synced: 26 Feb 2026
https://github.com/kajinmo/lightweight-etl-pipeline-to-gcp
An ETL pipeline that extracts data from multiple sources, masks sensitive information, and loads it into Google Storage and Google BigQuery. Designed for environments where Airflow is unavailable. It provides a no-frills, dependency-light way to define, schedule, and monitor ETL workflows using Python libraries.
bigquery etl gcp pipelines pydantic python storage
Last synced: 17 May 2026
https://github.com/ekoepplin/dbt-bigquery-core
How to get data into BigQuery (or DuckDB) and set up dbt tests for SODA cloud monitoring
bigquery data data-quality dbt dlt duckdb gcp soda
Last synced: 06 May 2026
https://github.com/bleakmego/wrenai
WrenAI is an open-source GenBI agent designed for seamless integration and powerful performance. Explore the code on GitHub! 🐙🌟
agent anthropic bedrock bigquery business-intelligence charts duckdb genbi llm openai postgresql rag sql sqlai text-to-chart text-to-sql text2sql vertex
Last synced: 07 Apr 2026
https://github.com/kina2711/datapipeline_omnichanneltobigquery
A data pipeline that fetches, normalizes and sorts Caresoft omnichannel data, then loads it into BigQuery—with Kafka-based schema validation coming soon.
Last synced: 19 May 2026
https://github.com/shakeel-data/amazon-sales-forecasting-python-bigquery-ml
An end-to-end analytics project using Python, SQL, & ML to forecast Amazon sales and segment customers. We build predictive models (LightGBM, Prophet) and clustering (KMeans) to deliver actionable insights for revenue growth and targeted marketing.
bigquery kmeans-clustring lightgbm linear-regression prophet-facebook scikit-learn
Last synced: 09 May 2026
https://github.com/jmfeck/bigquery-local-framework
This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface that makes these local BigQuery interactions more efficient.
bigquery data-engineering ingestion pandas python
Last synced: 06 May 2026
https://github.com/themihirmathur/uber-data-analytics
The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).
bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python
Last synced: 09 Feb 2026
https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-
This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.
bigquery consumer-insights data-analysis database sql target
Last synced: 26 Feb 2026
https://github.com/peippo1/gcp-datawarehouse-terraform
Infrastructure-as-Code (IaC) project that provisions a foundational data warehouse environment on Google Cloud Platform using Terraform. Includes a BigQuery dataset and a Cloud Storage bucket, ready for integration with analytics tools like Dataiku or custom ETL pipelines.
bigquery data-engineering devops gcp infrastructure-as-code terraform
Last synced: 16 Apr 2026
https://github.com/miliar/database-adapters
Prototype for easy data transfer between MySQL, BigQuery and CSV
adapter bigquery csv datapipeline mysql unittest
Last synced: 06 May 2026
https://github.com/vigneshss-07/cloud-bigquery-and-sql---the-interview-guide
This deals with SQL commands, interview preparation and query questions and solutions in BigQuery
azuresql bigquery gcp sql sql-query sql-server sqlalchemy
Last synced: 27 Feb 2026
https://github.com/jhermienpaul/google-data-analytics-program
Hands-on learning materials from the 8-course Google Data Analytics Professional Certificate program, covering foundational data skills, tools, and real-world business problem-solving
bigquery dashboard data-analysis data-analytics data-modeling data-storytelling data-visualization data-wrangling descriptive-analytics diagnostic-analytics etl-pipeline r-programming rstudio sql tableau
Last synced: 13 Jul 2025
https://github.com/oliveroneill/wilt-cloud-functions
Wilt Google Cloud Functions
bigquery google-cloud-functions
Last synced: 12 May 2026
https://github.com/myktorijus/retention-cohort
Extracted cohort data using SQL in BigQuery focusing on weekly retention from week 0 to week 6
bigquery data-analysis data-visualization powerbi sql
Last synced: 13 Jul 2025
https://github.com/ivanildobarauna/pypi-package-stats
Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 27 Feb 2026
https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance
Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.
bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse
Last synced: 12 Feb 2026
https://github.com/jasontanx/ridership-headline-project
This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 09 May 2026
https://github.com/vbalalian/three-gits
Group analytics project for a predictive analytics course. Using the Yelp open dataset to predict restaurant success.
bigquery dbt predictive-analytics python regression sentiment-analysis sklearn sql vader-sentiment-analysis yelp-dataset
Last synced: 13 May 2026
https://github.com/valenthr/purchase_funnel
Google merch store sales analysis
Last synced: 21 Jun 2025
https://github.com/lorinczakos/sql-projects
This is a collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course
bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode
Last synced: 16 May 2026
https://github.com/ket0825/v1-gcp-preview
Manage GCP source for preview services / GCP repo for the Preview service
bigquery cloud-functions cloud-run cloudbuild gcp gcp-batch gcs logging pubsub
Last synced: 13 Feb 2026
https://github.com/rohit196/sql-learning-hub
A comprehensive collection of SQL resources, projects, tutorials, and interview preparation materials
bigquery datawarehouse learning-resources nosql sql sql-projects
Last synced: 17 Jun 2025
https://github.com/kaanevranportfolio/kafka_spark_bigquery_newsstream
Create VMs using Terraform, Install Kafka & Zookeeper on VM using Ansible (GCP)
ansible ansible-playbook ansible-role bigquery bigquery-table gcp kafka python terraform
Last synced: 13 Apr 2025
https://github.com/jey-37/nginx-pipeline
An Apache Beam program that reads nginx access logs from Google Cloud Pub/Sub, parses them, and saves them into BigQuery.
apache-beam bigquery dataflow gcp-pubsub
Last synced: 16 May 2026
https://github.com/janmin123/cyclistic
Capstone project for Google/Coursera Data Analytics Course
analysis bigquery sql tableau visualization
Last synced: 10 Jul 2025
https://github.com/ansh-info/databridge
End-to-end financial data pipeline unifying real-time and batch ingestion with PySpark ETL, BigQuery storage, DBT modeling, Kafka streaming, and Airflow/Docker orchestration.
airflow apache-spark bash big-data bigquery dbt docker docker-compose etl etl-pipeline gcp google kafka kafka-consumer kubernetes orchestration pyspark python3 real-time stock
Last synced: 28 Feb 2026