An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.

📖 A highly rated, canonical book on BigQuery is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article marking the product’s 10th anniversary.

https://github.com/f4rkh4d/drift

sql linter and formatter in rust. 7 dialects, 80 rules, schema-aware. single binary. 50-200x sqlfluff. demo at drift.frkhd.com/play/

bigquery cli database dbt formatter linter lsp migration mysql postgres rust schema-aware snowflake sql sql-linter sqlfluff sqlite static-analysis tsql wasm

Last synced: 21 May 2026

https://github.com/anpandu/ps2bq

Stream-insert GCP Pub/Sub messages into a BigQuery table.

bigquery golang pubsub

Last synced: 12 Feb 2026

https://github.com/sahilgundu/tier1-uk-bank-fx-streaming-gcp

Sanitized case study — Tier-1 UK bank FX streaming on GCP (Pub/Sub → Dataflow → BigQuery, Composer, VPC-SC/CMEK). Patterns only; no client code/data.

architecture bigquery case-study data-engineering dataflow gcp mermaid pubsub streaming

Last synced: 10 Jun 2026

https://github.com/v-cth/database_audit

Audit your data quality in seconds

bigquery database dataengineering snowflake

Last synced: 20 Nov 2025

https://github.com/sahilgundu/tier1-swiss-bank-regulatory-reporting-lakehouse-gcp

GCP-based Regulatory Reporting Lakehouse — Tier-1 Swiss Bank (Simulated Case Study): documentation-only repo illustrating a cloud-native data lakehouse architecture for regulatory reporting on Google Cloud Platform (GCS + BigQuery + Dataflow + Composer). Includes ADRs, runbooks, and compliance data contracts.

adr bfsi bigquery composer data-engineering data-pipeline dataflow gcp lakehouse pubsub regulatory-reporting runbook

Last synced: 16 May 2026

https://github.com/markjamesbutler/dbt-fundamentals-bigquery

Implementation of dbt fundamentals training course material using BigQuery.

bigquery dbt dbt-fundamentals fundamentals jinja2 practice-tasks sql

Last synced: 27 May 2026

https://github.com/oleksiilatypov/google_cloud

AI & Data, Google Cloud Skills Boost

bigquery document-ai ml vertexai

Last synced: 23 Apr 2026

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 02 Jan 2026

https://github.com/lupusruber/music_analytics

This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.

bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming

Last synced: 01 May 2026

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 02 Jan 2026

https://github.com/pratshrestha/cochin-traders---sql--sales-analysis

Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include

bigquery customer-engagement employee-performance inventory-management sales-trends sql

Last synced: 02 Jan 2026

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 24 Mar 2025

https://github.com/wooyakob/music-recommendation-engine

Using Gemini API to generate personalized music recommendations.

ai bigquery gemini-api google-cloud-platform

Last synced: 23 Mar 2025

https://github.com/gabrieladados/analise-ecommerce

SQL Analysis for E-commerce: Growth Strategies to Boost Sales

bigquery data-analysis ecommerce sql

Last synced: 31 Mar 2025

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 19 May 2026

https://github.com/bsrikanth24/etl-pipeline-project-sales

This project implements an ETL pipeline to process fictitious sales data.

bigquery pandas-dataframe python

Last synced: 03 May 2026

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.

bigquery database mysql-server sql

Last synced: 11 Apr 2026

https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml

Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.

airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform

Last synced: 05 Jan 2026

https://github.com/hardik-agrl/ecommerce-user-conversion-dashboard

This project analyzes user behavior in an ecommerce setting using BigQuery and Looker Studio. It tracks offer notification clicks, session activity, and purchase completions to identify how many users convert within the same session after engaging with an offer.

bigquery dashboard-visualization looker-studio sql

Last synced: 02 Jul 2025

https://github.com/rolandbende/python-bigquery-migrations

The Python bigquery-migrations package makes it easy to create and manipulate BigQuery databases.

bigquery google migration-automation migration-scripts migration-tool migrations python

Last synced: 02 Feb 2026

https://github.com/thenazar9/user-behavior-email-campaign-analysis-sql

Analysis of user behavior and email campaign performance using BigQuery and Looker Studio, focusing on account creation trends, email engagement, and user segmentation.

analytics bigquery data-analysis data-visualization etl looker-studio sql structured-query-language

Last synced: 16 Oct 2025

https://github.com/ruohansong/sql_practice

SQL queries from both the LeetCode SQL 50 and some BigQuery practice exercises

bigquery sql sql-query

Last synced: 02 Jul 2025

https://github.com/francois-lenne/elt-mp4-quiberon

The goal of this project is to retrieve the video of the municipality of Quiberon and determine whether or not a person appears in it.

bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data

Last synced: 11 Apr 2026

https://github.com/antahantah/simple-airflow-with-postgre

Capstone Project 3: Data Pipeline with Apache Airflow - From PostgreSQL to Google BigQuery

airflow bigquery capstone capstone-project data-engineering docker extract-load postgresql python

Last synced: 14 Apr 2026

https://gitlab.com/solidninja/albion

A Scala BigQuery client

bigquery scala

Last synced: 04 Feb 2026

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 25 Apr 2026

https://github.com/simhayn/genomics-cannabis-bigquery

BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment

big-data bigquery bioinformatics exploratory-data-analysis genomics python sql

Last synced: 02 Jan 2026

https://github.com/robinnoiret/internship-zendesk_reporting__migration

This project involves developing a Python script to import CSV exports from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.

bigquery connector export-csvfile json zendesk

Last synced: 16 Mar 2025

https://github.com/janaom/gcp-de-project-elt-dbt-cloud-run-composer-dataplex

A Google Cloud ELT pipeline project with dbt, Cloud Run, Composer, BigQuery, and Dataplex. Includes event-driven automation, data quality checks, and complete setup guidance.

bigquery cloud-run cloud-run-functions dataplex dbt elt-pipeline google-cloud

Last synced: 28 May 2026

https://github.com/data-analitycs-pos-tech-fiap/covid19_ibge

This repository contains the data analysis carried out as part of the Tech Challenge of FIAP's postgraduate program in Data Analytics. The project's goal is to explore and interpret relevant data on the behavior of the population during the COVID-19 pandemic, using IBGE's PNAD-COVID19 database. This analysis aims to support decision-making

bigquery excel google-bigquery google-cloud-platform powerbi python sql

Last synced: 08 Apr 2025

https://github.com/datasherlock/bqml-demo

BigQuery ML demo showcasing data prep, logistic regression training, evaluation, and explainable predictions for account fraud detection on Cloud Spanner data.

bigdata bigquery bigquery-ml sql

Last synced: 21 Jun 2026

https://github.com/jugnuarora/datatalks-de-zoomcamp-project

Data pipeline for course enrollments in France. Every month, providers report the enrollments in their programs. The idea is to get the listed courses as well as the monthly enrollments, then look at the enrollment trend and compare training providers across different courses.

bigquery data-analytics data-engineering data-ingestion-and-infrastructure data-pipeline dbt gcp gcs kestra-workflows looker-studio

Last synced: 31 Mar 2025

https://github.com/ngangawairimu/sales-analysis-and-customer-insights

This project features SQL queries for detailed customer and sales analysis: Customer Analysis and Sales Reporting

bigquery bigquery-dataset excel sql

Last synced: 23 Mar 2025

https://github.com/zenklinov/predicting_visitor_purchases_using_google_cloud_bigquery-

This repository contains a project for predicting Apple Inc.'s stock prices using a Long Short-Term Memory (LSTM) neural network. The model is optimized using Keras Tuner, a library for hyperparameter tuning in deep learning models. The dataset used for training and testing the model is sourced from Yahoo Finance.

bigquery bigquery-ml classification

Last synced: 11 Feb 2026

https://github.com/jasontanx/mas-international-arrivals

Code repository about international arrivals into Malaysia

bigquery data-analytics data-engineering etl-pipeline international-arrivals

Last synced: 02 Jan 2026

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 01 May 2026

https://github.com/tuancamtbtx/dataform-utils

BigQuery Dataform JavaScript Utils Package - Support Ads, Query Common, ...

bigquery dataform datawarehouse

Last synced: 15 May 2025

https://github.com/kmohamedalie/bigquery-intro

Coursera BigQuery Introduction using Covid19 dataset

bigquery coursera covid-19 datavisualization looker-studio sql

Last synced: 02 Jan 2026

https://github.com/jwcheonx/export-bq-tables

A Bash script for exporting all tables from a BigQuery dataset to local storage

bash bigquery cloudshell tools

Last synced: 12 Apr 2026

https://github.com/owox/sgtm-owox-ga4-bigquery

OWOX BI Streaming is an advanced tracking solution for getting the most from the Google Analytics 4 setup already installed on your website

analytics bigquery ga4

Last synced: 07 Apr 2025

https://github.com/andrii04/ga4-gcs-to-bigquery-etl

Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.

automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql

Last synced: 18 May 2026

https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq

A Cloud Function that subscribes to a given topic and inserts each message into a BigQuery table.

bigquery cloud-functions pubsub terraform-module

Last synced: 16 May 2026

https://github.com/tianlangstudio/bigquery_proxy

Support CORS and simplify front-end use of #bigquery

actix-web async bigdata bigquery cors gcp log4rs rust

Last synced: 03 May 2026

https://github.com/itsubaki/hermes-lambda

Transfers AWS cost data to BigQuery

aws bigquery

Last synced: 01 Apr 2025

https://github.com/ansh-info/stockpulse

Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.

api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit

Last synced: 10 Apr 2026

https://github.com/ahmadluay9/hotel-mcp-bigquery-postgresql

AI-powered Hotel Management Assistant built with Streamlit and Google's ADK. Connects to PostgreSQL for operations and BigQuery for analytics.

adk adk-python agentic-ai bigquery mcp postgresql streamlit

Last synced: 10 Apr 2026

https://github.com/vanducng/miu-db

A headless database CLI for humans and agents.

bigquery database mysql postgresql snowflake sql sqlite terminal textual tui

Last synced: 08 Jun 2026

https://github.com/greatwoman23/car_insurance_analysis

The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.

bigquery data data-science dataanalytics insurance-claims looker-studio tableau

Last synced: 03 Feb 2026

https://github.com/isaacmg/mimic_iv_bq_queries

Queries needed to recreate time series features for model training

bigquery mimic-iv sql

Last synced: 14 Mar 2025

https://github.com/noridj4/langchain-runnables

🚀 Explore LangChain Runnables to easily compose and manage components for efficient and flexible execution within the LangChain ecosystem.

ai bigquery chatbot chatgpt chroma docker fastapi langchain llama llm llm-agents local-first offline-first openai prompt-engineering rag text-to-sql vector-database

Last synced: 07 Apr 2026

https://github.com/misaober/datove_inzenyrstvi_projekt

Data Engineering in Practice course (Czechitas, 36 hours): building a custom project on real data, including scripts for creating the L1, L2, and L3 layers, a data model, and the project's architecture design.

bigquery python sql

Last synced: 14 May 2026

https://github.com/iht/bigquery-dataflow-cdc-example

A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates

apache-beam bigquery cdc dataflow google-cloud pubsub

Last synced: 04 Jan 2026

https://github.com/sangnandar/insert-unique-record

This is a Cloud Functions script that inserts only unique records into BigQuery.

bigquery digital-marketing-analytics google-cloud-functions

Last synced: 03 May 2026

https://github.com/davelester/gharchive-bigquery-examples

Examples Using BigQuery to Analyze GH Archive Data

bigquery gharchive

Last synced: 27 Mar 2025

https://github.com/data-platform-hq/terraform-google-bigquery

Terraform module for managing Google BigQuery datasets

bigquery google-cloud terraform-module

Last synced: 27 Aug 2025

https://github.com/rifqyirfanto21/ecommerce-data-pipeline-airflow-gcp-dbt

End-to-end automated data pipeline using Python, PostgreSQL, Airflow, GCS, BigQuery, and dbt — built to simulate a production-grade analytics workflow. Completed as part of Purwadhika’s Module 3 Data Engineering Program

airflow bigquery dataengineering dbt googlecloudplatform python

Last synced: 21 Apr 2026

https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women

This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.

bigquery excel powerbi spreadsheet sql

Last synced: 20 Aug 2025

https://github.com/armahdavi/bigdata_pyspark_sales_analytics

A summary of my big data code in Python (PySpark) for analyzing retail and Walmart superstore sales data to draw sales insights

big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning

Last synced: 12 Apr 2026

https://github.com/pittica/google-bigquery-helpers

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform pittica

Last synced: 06 Jan 2026

https://github.com/chiamakaukwuoma/portfolio

This repository contains various projects I've been privileged to work on outside of work.

aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau

Last synced: 10 Apr 2026

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 18 Aug 2025
