BigQuery
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated, comprehensive reference book is "Google BigQuery: The Definitive Guide". Another worthwhile read is the inside story told by BigQuery's founding product manager in an article celebrating its 10th anniversary.
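As a rough illustration of the SQL-plus-client-library workflow described above, here is a minimal Python sketch. It assumes the google-cloud-bigquery package is installed and application default credentials are configured. The table is one of Google's public sample datasets, and the function names `build_top_names_sql` and `top_names` are hypothetical helpers made up for this example.

```python
"""Sketch: run a parameterized SQL query with the BigQuery Python client."""

# Public sample table (an assumption for illustration); swap in your own table.
TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_sql(table: str = TABLE) -> str:
    """Return a query for the ten most common names in one state."""
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{table}`\n"
        "WHERE state = @state\n"  # @state is a named query parameter
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        "LIMIT 10"
    )


def top_names(state: str = "TX") -> list:
    """Run the query and return (name, total) pairs.

    Requires google-cloud-bigquery and valid Google Cloud credentials.
    """
    # Imported lazily so this module still loads where the package is absent.
    from google.cloud import bigquery

    client = bigquery.Client()  # project is taken from the environment
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", state),
        ]
    )
    rows = client.query(build_top_names_sql(), job_config=job_config).result()
    return [(row["name"], row["total"]) for row in rows]
```

Calling `top_names("CA")` would submit the query and wait for the result. Passing the state as a query parameter rather than formatting it into the SQL string avoids quoting mistakes and injection issues.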
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2026-06-22 00:03:35 UTC
https://github.com/alessine/zurich_air_quality
End-to-end Data Pipeline built as the Final Project of the Data Engineering Zoomcamp
bigquery data-engineering dataform docker google-cloud-platform kestra looker-studio
Last synced: 01 Mar 2026
https://github.com/istinnew/cook-me-up
[In Progress] Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.
bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning
Last synced: 16 May 2026
https://github.com/cc59chong/end-to-end-cloud-data-pipeline-for-retail-sales-forecasting
Batch Pipeline
bigquery dbt-cloud docker gcs iam jupyter-notebook kestra spark terraform wsl-ubuntu
Last synced: 16 Apr 2026
https://github.com/shakespear567/data_engineering_gcp
Data Engineering Using Google Cloud Platform and Mage
apachebeam bigquery clouddataflow cloudsql data-engineer dataflow dataproc gcp-components google-bigquery google-cloud google-virtualmachine looker spark terraform
Last synced: 07 May 2026
https://github.com/alessine/data-engineering-zoomcamp
Materials from the Data Engineering Zoomcamp 2025
bigquery data-engineering dbt docker kestra spark
Last synced: 16 Apr 2026
https://github.com/izmian/google-business-intelligence_professionalcertificate
Includes my coursework and project for Google Business Intelligence by Google on Coursera
bigquery business-intelligence datavisualization sql tableau
Last synced: 06 May 2025
https://github.com/paulveillard/cybersecurity-analytics
An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.
analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization
Last synced: 28 Mar 2025
https://github.com/walterowisk/sql-learn
SQL Learning
beekeeper bigquery data-analytics dbeaver mysql sql
Last synced: 17 Apr 2026
https://github.com/IvanildoBarauna/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to DataDog for analysis and insights with dashboards, monitors and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 12 Jul 2025
https://github.com/lumapps/tap-bigquery
Fork of z3z1ma/target-bigquery — Singer target for BigQuery supporting storage write, GCS, streaming, and batch load methods, built with the Meltano SDK.
bigquery meltano python singer
Last synced: 02 Apr 2026
https://github.com/marcosach/tp-infra
This repository contains all the files that make up the final practical assignment for the course Infraestructura para la Ciencia de Datos (Infrastructure for Data Science) in the Licenciatura en Ciencia de Datos (Bachelor's in Data Science) at UNSAM.
bigquery buckets datamart datawarehouse etl gcp gcs pipelines python sql
Last synced: 17 Apr 2026
https://github.com/ymyzk/bq-globalip
Record the current global IPv4 address to a BigQuery table.
Last synced: 17 Apr 2026
https://github.com/nghiant3110/google_fiber_bi_5
This is a BI Capstone project based on the Google Fiber dataset from Google BI Course
bigquery google-sheets looker-studio sql
Last synced: 11 Apr 2025
https://github.com/lucashomuniz/project-22
[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio
agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability
Last synced: 20 Mar 2025
https://github.com/metlinskyi/englishdom
Filtered and evaluated EnglishDom.com teachers according to custom criteria.
bigquery google-sheets python webscraping
Last synced: 16 May 2026
https://github.com/fabiopapais/chat-your-data
Chat with your SQL database and make complex queries with natural language using LLMs
bigquery chainlit langchain llm
Last synced: 17 Apr 2026
https://github.com/lawal-hash/olistelt
An end-to-end ELT data pipeline of the Brazilian olist e-commerce dataset using the modern data stack
airflow bigquery dbt dbt-core docker postgresql sql
Last synced: 17 Feb 2026
https://github.com/g-schumacher44/analyst_resource_hub
A collection of guidebooks, quickref, and resources for data analysis
analytics bigquery data lookerstudio machine-learning model python sql yaml-configuration
Last synced: 20 Jun 2026
https://github.com/gabrieladados/people-analytics
People Analytics: Insights for Talent Retention
bigquery figma people-analytics sql tableau
Last synced: 17 Apr 2026
https://github.com/khanovico/energy-data-analysis
A cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I implemented a Docker image for running this app in cross-platform environments.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 17 Feb 2026
https://github.com/chaaalistaa/thelookecommerce---project
Analysis of "TheLook" eCommerce, with goals such as identifying sales trends, understanding customer behaviors, enhancing customer retention, and driving repeat purchases.
big-data-analytics bigquery data-analytics data-visualization looker-studio sql
Last synced: 17 Apr 2026
https://github.com/ajaxbarcelonacruyff/ec_demo
Generate EC demo data (sample data for e-commerce sites)
bigquery ecommerce google-analytics-4
Last synced: 04 Apr 2026
https://github.com/wan-huiyan/gcp-dataform-rest-api-deploy
Claude Code skill: Deploy .sqlx files to Google Cloud Dataform via REST API — full lifecycle with gotcha documentation
automation bigquery ci-cd claude-code claude-code-skill cloud-workflows dataform gcp google-cloud sql
Last synced: 04 Apr 2026
https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks
A collection of R notebooks to analyze data from the Digital Optimization Group Platform
ab-testing bigquery jupyter-notebook performance-analysis r web-analytics
Last synced: 07 May 2026
https://github.com/tuanai-vireox/dataform-utils
Bigquery Dataform Javascript Utils Package - Support Ads, Query Common, ...
bigquery dataform datawarehouse
Last synced: 19 Apr 2026
https://github.com/alex-nettekoven/gcp-kafka-spark-bigquery-etl-newstream
Terraform-Deployed ETL Pipeline (Python → Kafka → Spark → BigQuery) for NewsAPI Data
bigquery etl-pipeline kafka newsapi python spark
Last synced: 08 May 2026
https://github.com/thahab-anal/thahabu-python-data-analysis-bi-portfolio
Portfolio showcase on Data Analysis Learning
bigquery looker-studio pandas-dataframe python
Last synced: 20 Apr 2026
https://github.com/coatless/bigquery-reddit-ask-your-advisor
Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")
ask-your-advisor bigquery r reddit
Last synced: 20 Apr 2026
https://github.com/vaxdata22/city-weather-and-s3file-rds-s3-bigquery-by-airflow-on-ec2
This is my third industry-level ETL project. This data pipeline orchestration uses Apache Airflow on AWS EC2. It demonstrates how to build an ETL data pipeline that would perform data extraction to a database in parallel to a loading process into the same database, join the tables, copy joined data to S3 and finally copy the S3 file to BigQuery DW.
apache-airflow aws-ec2 aws-rds-postgres aws-s3 bigquery business-intelligence dags data-warehousing etl-pipeline openweathermap-api orchestration python3 sql
Last synced: 18 Mar 2025
https://github.com/garbetjie/monolog-bigquery-handler
A simple Monolog handler for writing to BigQuery.
bigquery logging monolog monolog-handler
Last synced: 20 Apr 2026
https://github.com/srgrace/sql_scavenger_hunt
SQL Scavenger Hunt
bigquery jupyter-notebook python sql
Last synced: 20 Apr 2026
https://github.com/zaynabbug/end-to-end-fitness-data-pipeline-on-gcp
A cloud-native data pipeline that ingests, processes, and visualizes real-time and batch fitness data. Built with Pub/Sub, Airflow, BigQuery, dbt, Looker Studio, Terraform, and Docker to automate data workflows and provide actionable insights.
airflow bigquery dbt docker gcp looker-studio pubsub terraform
Last synced: 05 May 2026
https://github.com/rohithay/sql-bench
An elegant CLI toolkit for validating BigQuery queries, comparing schemas, and estimating costs before pushing code to production.
bigquery cli cloud-tools data-engineering developer-tools google-cloud query-validator schema-diff sql sql-lint
Last synced: 22 Apr 2026
https://github.com/tranngoca5039/bigquery-a5y
📊 Streamline your data analysis with bigquery-a5y, a powerful tool for optimizing BigQuery performance and improving query efficiency.
analytics api big-data bigquery cloud-computing data-analysis data-integration data-management data-pipeline data-visualization data-warehouse google-cloud machine-learning serverless sql
Last synced: 05 Jun 2026
https://github.com/edisedis777/bigquery-cost-optimization
GitHub repository showcasing strategies to optimize Google BigQuery (GBQ) costs when dealing with raw data dumps.
bigquery cost gbq google googlebigquery
Last synced: 24 Apr 2026
https://github.com/tosh2230/bigquery-table-history
Diff daily changes by BigQuery INFORMATION_SCHEMA.PARTITIONS records.
Last synced: 24 Apr 2026
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases on different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 24 Apr 2026
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 24 Apr 2026
https://github.com/ackeecz/terraform-gcp-cloud-run_pubsub_to_bq
Cloud Run subscribes itself to a given topic and inserts each message into a BigQuery table.
Last synced: 24 Apr 2026
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 26 Mar 2025
https://github.com/naustica/semantic_scholar_bq
Repository containing scripts for importing Semantic Scholar snapshots into BigQuery
bigquery python scholarly-metadata semantic-scholar
Last synced: 06 Jun 2026
https://github.com/stoqey/rasputia
Rasputia Latimore - The Big Data Bitch 💋
Last synced: 15 May 2026
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 26 Apr 2026
https://github.com/nszoni/dbtgen
dbt: write nothing, generate (almost) everything.
analytics bigquery dbt documentation generative-ai github tooling
Last synced: 08 May 2026
https://github.com/richardbnk/data_tools
Python Library to Accelerate Creation of Data ETL Processes on multiple database systems.
Last synced: 27 Apr 2026
https://github.com/seahrh/nyc-taxi-trips
REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7
bigquery nyc-taxi-dataset play-framework rest-api scala
Last synced: 14 May 2026
https://github.com/rifa8/data-warehouse-submission
Learning about Data Warehouse
bigquery citus columnar data-warehouse datalake gcs-bucket
Last synced: 15 May 2026
https://github.com/kellyjadams/ap-exam-scores
Analyzing AP exam scores for a school.
Last synced: 07 Jun 2026
https://github.com/logicoffee/dbt-bigquery-extras
An experimental dbt package for BigQuery offering several materializations.
Last synced: 13 Jun 2026
https://github.com/vjavallar-ship-it/braze-lifecycle-analytics-automation-pipeline
Braze lifecycle marketing analytics pipeline
a-b-testing analytics automation bigquery braze communication-strategy data-insights data-modeling lifecycle marketing-automation performance-metrics pipeline sql user-segmentation
Last synced: 07 Jun 2026
https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020
Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).
bigquery data data-analysis data-visualization python sql tableau
Last synced: 15 Jun 2026
https://github.com/rrmcguinness/protoc-gen-bq-schema
A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.
bigquery bigquery-schema protobuf
Last synced: 01 May 2026
https://github.com/shaundann/autosight
AutoSight is an AI-powered multi-agent data analysis pipeline built on Google Cloud. From ingesting raw CSVs to generating visualizations and natural language summaries — all results are displayed live in a Streamlit dashboard.
ai-agents automated-data-analysis bigquery data-pipeline gcp google-cloud llm multi-agent-systems python streamlit vertex-ai
Last synced: 09 May 2026
https://github.com/marceloneppel/map-to-bigquery-structs
Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.
Last synced: 28 Apr 2026
https://github.com/davidkhala/gcp-collection
Notebooks for GCP services
bigquery bq databricks datastore firestore google-cloud-platform
Last synced: 28 Apr 2026
https://github.com/martinkalema/bigquery-pubsub
Loading data into BigQuery Table
bigquery data-engineering flat-file kafka
Last synced: 15 May 2026
https://github.com/wayanradit29/tutas-recommender
End-to-end student–tutor recommender system with synthetic data generation, preprocessing, feature engineering in BigQuery, and model deployment on Google Cloud.
ai bigquery cloud-computing data-engineering education google-cloud machine-learning python recommender-system vertex-ai
Last synced: 15 May 2026
https://github.com/adindasarianti/PBI_Rakamin_X_Kimia_Farma
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 07 Sep 2025
https://github.com/vikasgupta1812/google_cloudml_scripts
https://goo.gl/dFjFQf
bigquery google-cloud-ml python-language tensorflow-tutorial
Last synced: 30 Sep 2025
https://github.com/malbiruk/million-songs-pipeline
End-to-end batch pipeline joining audio features, lyrics, and genres from the Million Song Dataset
batch-processing bigquery data-engineering data-pipeline data-warehouse dataproc dbt dezoomcamp gcp million-song-dataset prefect pyspark streamlit terraform
Last synced: 08 Jun 2026
https://github.com/alessio-siciliano/bigquery-advanced-utils
BigQuery-advanced-utils is a lightweight utility library that extends the official Google BigQuery Python client. It simplifies tasks like query management, data processing, and automation. Aimed at developers and data scientists, the project is open to contributions to improve and enhance its functionality.
bigquery datatransfer google-cloud python
Last synced: 28 Apr 2026
https://github.com/mahdiik1/bigquery-retail-analysis
Uses Google BigQuery to query a sample retail dataset. Features example SQL queries and a Python script for large-scale data analysis. Great for learning GCP integration, basic analytics, and high-volume querying.
Last synced: 28 Apr 2026
https://github.com/rafal-kowalski-dev/selling-cars-analize
Hobby project for learning PySpark, AirFlow and BigQuery
airflow bigquery gcp pyspark python sqlalchemy
Last synced: 28 Apr 2026
https://github.com/allanreda/share-of-search-retrieval-and-visualization
Share of search analysis including data retrieval from Google Ads API, storing data in BigQuery and visualizing it in Looker Studio
bigquery google-ads-api looker-studio python share-of-search
Last synced: 28 Apr 2026
https://github.com/jmorl96/bitcoin-ecosystem-insights-data-engineering-project
Data engineering project that collects, processes, and visualizes Bitcoin blockchain and Kraken API data using Google Cloud Platform. Technologies: Terraform, Airflow, DBT, Looker Studio, BigQuery, Python, Cloud Run. The project ensures reproducibility and scalability with infrastructure as code and automated data pipelines.
airflow bigquery cloud-run dbt docker google-cloud-platform looker-studio python terraform
Last synced: 29 Apr 2026
https://github.com/fakhri098/project-sql-bigquery
This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.
Last synced: 08 Jun 2026
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin to post BigQuery query results
Last synced: 10 Apr 2025
https://github.com/mikeghen/metadata
Pulls data from Socrata open data portals
Last synced: 29 Apr 2026
https://github.com/prodriguezdefino/dataflow-cassandra-to-bigquery
Captures data from a Cassandra instance and sends it to BigQuery
Last synced: 29 Apr 2026
https://github.com/alwayssany/bigquery-hackathon
A BigQuery-powered Smart Substitute Recommender that suggests ideal product substitutes based on a deep understanding of product attributes, not just shared tags or categories.
bigquery bigquery-ai bigquery-ml google-cloud google-cloud-platform notebook-jupyter public-dataset python sql vector vector-search
Last synced: 29 Apr 2026
https://github.com/humairarizwan/uber-ride-dataengineering-analysis
This project creates a pipeline to process data and performs data analytics on Uber data.
bigquery dataanalysis dataengineering gcp-project googlestorage looker-studio
Last synced: 29 Apr 2026
https://github.com/sanjay-k08/python-for-gcp-interact-with-google-cloud-using-python
Python For GCP is a project aimed at simplifying the interaction with Google Cloud Platform (GCP) services using Python. This repository provides code examples and scripts that help you manage and automate various GCP resources such as BigQuery, Cloud Storage, BigTable, Compute Engine, and more entirely through Python.
bigdata bigquery cloudstorage computeengine data-pipelines devops gcp gcp-automation python-script terraform-alternative
Last synced: 29 Apr 2026
https://github.com/kaushik-puttaswamy/walmart-sales-data-ingestion-and-transformation-in-bigquery-using-airflow
An ETL pipeline that ingests Walmart sales data from Google Cloud Storage into BigQuery, automates table creation, and performs data transformation using SQL MERGE with Apache Airflow.
airflow-dags bigquery etl-pipeline gcs-bucket google-cloud-platform merge python sql transformation
Last synced: 29 Apr 2026
https://github.com/jorbriib/nodejs-bigquery-connect-api-rest
NodeJS service to connect to BigQuery through API REST
api api-rest bigquery javascript node node-js nodejs npm sql
Last synced: 29 Apr 2026
https://github.com/ruru-lyy/nyc-taxi-service-pipeline
In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.
bigquery data-engineering data-modeling etl-pipeline looker mage-ai python
Last synced: 29 Apr 2026
https://github.com/amirrezaskh/nyc-taxi-dashboard
A comprehensive data analytics platform that processes NYC taxi trip data from Google BigQuery and visualizes insights through an interactive React dashboard. Features real-time heatmaps, temporal analysis, and geographic intelligence across 263 NYC taxi zones.
bigquery dashboard data-analytics data-science data-visualization geospatial leaflet material-ui nyc-taxi plotly react typescript
Last synced: 29 Apr 2026
https://github.com/machinelearningzuu/data-engineering-projects
This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.
airflow bigquery data-engineering data-science data-visualization data-warehouse
Last synced: 30 Apr 2026
https://github.com/vaibhavs10/ml-on-gcp
The repository walks through a Data Scientist focused way of building and deploying Machine Learning models on Google Cloud
aiplatform bigquery googlecloudplatform ml
Last synced: 30 Apr 2026
https://github.com/gayatri1505/real-time-stock-market-data-pipeline-with-google-cloud-platform
This project builds a complete real-time stock market data pipeline on Google Cloud Platform. It ingests intraday stock prices from the Alpha Vantage API, stores and transforms the data using Cloud Functions, Pub/Sub, Cloud Storage, and BigQuery, and performs rich SQL-based analytics to uncover trading patterns, price movements, & volume anomalies.
bigquery data-engineering google-cloud-platform python3 real-time sql time-series visio
Last synced: 30 Apr 2026
https://github.com/crudek-data/bigquery-kaggle-apis
Uses the Kaggle API to download free datasets and the Google BigQuery API to read from and write to a cloud data warehouse
bigquery data-engineering kaggle
Last synced: 10 May 2026
https://github.com/brpy/nyc-trips
Data engineering | Zoomcamp journey on nyc trip data with gcp stack
Last synced: 30 Apr 2026
https://github.com/yiu31802/gcp-project
GCP AppEngine project of Twitter data and some sample code
appengine bigquery gcp google-bigquery google-cloud google-datastore resas twitter twitter-data twitter4j
Last synced: 30 Apr 2026
https://github.com/kaushik-puttaswamy/train-ticket-booking-customer-data-ingestion-via-pub-sub-stream-dataflow-and-bigquery-with-looker
This project demonstrates real-time train ticket booking customer data ingestion and transformation using Pub/Sub, Dataflow, BigQuery, and visualization with Looker. It enables efficient data processing, storage, and analysis for customer insights.
bigquery dataflow etl gcp looker pubsub real-time-analytics
Last synced: 30 Apr 2026
https://github.com/nph1508/exploring_sales_and_product_performance_in_bicycle_manufacturing_sql
Analyzed sales and product performance data from a bicycle manufacturing company using SQL in BigQuery. Focused on identifying trends in product categories, revenue distribution, and monthly performance to guide production and inventory planning.
Last synced: 08 Jun 2026
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
Last synced: 13 Jun 2026
https://github.com/anyesh/gbq-helpers
GBQ related helper functions and snippets.
Last synced: 01 May 2026
https://github.com/ddzikri/analisis-data-kimia-farma
Project Based Internship Kimia Farma Rakamin Academy
Last synced: 18 Mar 2025
https://github.com/nikhilsree5/targetcasestudy
An exploratory and in-depth study of the e-commerce market in Brazil.
bigquery eda sql visualization
Last synced: 15 Mar 2025
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 01 May 2026
https://github.com/nph1508/sql_for_ecommerce_analyzing_sales_customer_behavior_in_bigquery
Designed and executed complex SQL queries on an ecommerce dataset using Google BigQuery to uncover customer behavior patterns, sales performance, and category-level insights. Focused on extracting business value through data exploration and aggregation techniques.
Last synced: 01 Jul 2025
https://github.com/robertofernandezmartinez/logistics-fleet-dbt
🏗️ Modern Analytics Engineering project using dbt and BigQuery to model fleet operations. Implementing a Medallion Architecture, it transforms raw GPS data into a reliable Star Schema. Focuses on resolving data quality issues like sensor noise and duplicates through automated testing and CI/CD to ensure production-grade reporting.
analytics-engineering bigquery data-engineering data-modeling data-pipeline data-quality dbt etl google-cloud-platform logistics-analytics medallion-architecture sql
Last synced: 19 Jun 2026