An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so it can read data from external sources.
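As a rough illustration of the Python client library mentioned above, the sketch below builds a SQL query and runs it through the `google-cloud-bigquery` package. It rests on several assumptions that are not part of this page: the public sample table `bigquery-public-data.usa_names.usa_1910_2013`, the helper names `build_top_names_query` and `run_top_names` (both hypothetical), and a Google Cloud project with default credentials already configured. It does not cover federated queries.

```python
from typing import Any

# Assumption: a public sample table with name, state and number columns.
TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_query(limit: int = 5) -> str:
    """Return a SQL query for the most common names in one state.

    The state is left as the named query parameter @state, so user input
    is never spliced into the SQL text.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{TABLE}`\n"
        "WHERE state = @state\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def run_top_names(state: str = "TX", limit: int = 5) -> list[dict[str, Any]]:
    """Run the query. Needs google-cloud-bigquery and default credentials."""
    # Imported lazily so the query builder above works without the package.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", state),
        ]
    )
    rows = client.query(build_top_names_query(limit), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Calling `run_top_names("TX", 5)` would send the query to BigQuery and return one dictionary per row. The query builder is kept separate so the SQL can be inspected or tested without any cloud access.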

📖 A highly rated, canonical book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read is the inside story told by BigQuery's founding product manager in an article celebrating its 10th anniversary.

https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate

This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.

bigquery data-science data-visualization spreadsheet sql tableau

Last synced: 06 Feb 2026

https://github.com/push-protocol/push-google-bigquery

The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis

bigquery data push push-notifications web3

Last synced: 26 Mar 2025

https://github.com/zborovskaanna/e-commerce-web-events-analysis

SQL project based on the BigQuery public dataset 'The Look e-Commerce', with a dashboard in Looker Studio

analysis bigquery dashboard data-visualization looker-studio sql

Last synced: 03 Jan 2026

https://github.com/denny-b-justin/purdue

The internship broadly aimed to understand whether topics and events are covered differently across countries and how that coverage affects stock market returns. The provided dataset is a post-processed set of news articles, so it already reflects topic modelling and sentiment analysis.

big-data bigquery finance gdelt-events python

Last synced: 03 Jan 2026

https://github.com/tharun2806/end-to-end-internship-data-analysis

Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard

bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public

Last synced: 01 Apr 2025

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 22 Feb 2026

https://github.com/lucaslopezc/desafio-sql-2

Student Analysis. This repository will help you learn basic analytic functions in SQL using a student database. The exercises include ranking students by study hours and comparing them with per-city averages, using functions such as RANK() and AVG() along with CTEs to build efficient, readable solutions.

bigquery sql

Last synced: 09 Apr 2025

https://github.com/lucaslopezc/desafio-sql

Repository of SQL queries for analyzing student data in make-it-real-tarea-clase.clase_mir. Covers data exploration, filters (WHERE, HAVING), grouping (GROUP BY), ordering (ORDER BY, LIMIT), and conditional constructs (IF, CASE).

bigquery sql

Last synced: 04 Oct 2025

https://github.com/tomy-jr98/air-quality-sql-project

Air pollution analysis using BigQuery and Tableau, with data cleaning, aggregation, and visualization.

air-pollution bigquery data-analysis portfolio sql tableau

Last synced: 25 Jul 2025

https://github.com/kogunlowo123/terraform-gcp-bigquery

Terraform module for Google BigQuery with datasets, tables, materialized views, routines, and data transfer configs

analytics bigquery data-warehouse gcp google-cloud infrastructure-as-code production-ready terraform terraform-module

Last synced: 02 May 2026

https://github.com/quangandrei1003/france_air_pollution_pipeline

End-to-end air pollution data pipeline for French metropolitan cities using Airflow, Python, dbt, BigQuery.

airflow bigquery data data-analytics data-engineering data-modeling data-visualization dbt docker etl pandas python terraform

Last synced: 13 Apr 2026

https://github.com/akihokurino/dbt-sample

dbt sample

bigquery dbt python3

Last synced: 15 Apr 2026

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 16 Feb 2026

https://github.com/riju18/airflow-data-engineering-with-bigquery-and-dbt

Fetch data from a simple CSV file, load it into a GCP BigQuery table, run dbt to automate the DWH, and run Soda to check data quality.

apache-airflow bigquery csv dbt python3 soda

Last synced: 01 May 2026

https://github.com/aazuspan/landsat-bigquery

Summarizing 51 years of Landsat data using Earth Engine and BigQuery

bigquery google-earth-engine landsat

Last synced: 03 Jan 2026

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 03 Jan 2026

https://github.com/yindia/xray

About Xray: Simplify database structure extraction for MySQL, PostgreSQL, BigQuery, Redshift, MsSQL and Snowflake in Go

bigquery mssql mysql postgressql rds redshift snowflake

Last synced: 02 May 2026

https://github.com/sameer6690/data_analytics_bootcamp_hdnb

This is an analytics project on the "Titanic - Machine Learning From Disaster" dataset's train.csv file. I performed data cleaning with MS Excel before using SQL to query results based on the questions provided for the completion of the project. Finally I visualized the data on Google Looker Studio.

bigquery excel looker-studio sql

Last synced: 14 Feb 2026

https://github.com/victorcezeh/end-to-end-elt-pipeline

An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.

airflow bigquery dbt-core docker docker-compose postgresql python

Last synced: 13 Feb 2026

https://github.com/edwinrlambert/cyclistic-bike-share-analysis

This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.

act analyze ask bigquery prepare process share sql

Last synced: 03 Jan 2026

https://github.com/adindasarianti/pbi_rakamin_x_kimia_farma

This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023

bigquery dataanalytics lookerstudio

Last synced: 22 May 2026

https://github.com/mehbas92/sql

SQL Using BigQuery

bigquery sql

Last synced: 06 Apr 2025

https://github.com/marceloneppel/gcs-to-bigquery

WIP: Moving data from GCS to BigQuery.

bigquery gcs scala scio

Last synced: 27 Apr 2026

https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink

Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster

aiven bigquery kafka-connect sink-connector terraform terraform-modules

Last synced: 05 Jul 2025

https://github.com/vedantwalia/google-data-analytics-capstone-case-study

This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone

bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public

Last synced: 02 Jan 2026

https://github.com/nghiant3110/firebase_6

This is a DA project based on the Firebase sample dataset on BigQuery

bigquery firebase looker-studio sql

Last synced: 02 Jan 2026

https://github.com/marcopellegrinoit/web-traffic-time-series-predictions

Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Additional modeling available with ARIMA, LSTM, and Facebook Prophet.

arima bigquery gcp lstm prophet-model time-series vertex-ai

Last synced: 02 Jan 2026

https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query

E-Commerce-Data-Analysis-SQL-Big-Query

bigquery sql

Last synced: 16 Mar 2025

https://github.com/subhamay-bhattacharyya-tf/terraform-google-bigquery-dataset

🏗️ Terraform module to manage BigQuery datasets, including location, access controls, IAM policies, and data governance settings.

bigquery bigquery-dataset terraform-gcp-module terraform-module

Last synced: 02 May 2026

https://github.com/yu-iskw/homebrew-bigquery-to-datastore

A homebrew tap for bigquery-to-datastore

bigquery google-datastore homebrew

Last synced: 10 May 2026

https://github.com/mchmarny/stocker

Using Twitter sentiment and stock market price signal correlation to predict the next day's closing price

bigquery ml prediction regression-models

Last synced: 23 Mar 2026

https://github.com/1adityakadam/uber_data_analytics

End-to-end Google BigQuery + Looker Studio data analytics project transforming NYC taxi data into actionable intelligence

bigquery looker-studio mage-ai-pipeline numpy pandas sql

Last synced: 13 Apr 2026

https://github.com/halomademeapc/bigquerymapping

Source generator for working with BigQuery Storage API

bigquery csharp google

Last synced: 18 Jun 2025

https://github.com/hadiuzzaman524/python-clean-architecture

A scalable COVID-19 ETL pipeline built with Python, Airflow, and BigQuery, following Clean Architecture and Domain-Driven Design principles. Designed for modularity, testability, and production-ready data workflows in a Dockerized environment.

airflow airflow-dags bigquery clean-architecture clean-code data-engineering docker domain-driven-design etl-pipeline postgresql python sqlalchemy

Last synced: 03 May 2026

https://github.com/dobsontom/basket-abandonment

Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.

adobe-campaign bigquery ga4 gcp sql

Last synced: 23 Feb 2026

https://github.com/rifa8/extract-load-demo

Learning Google Cloud Platform (GCP)

airbyte bigquery bucket gcp

Last synced: 03 May 2026

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets users rate a given company product. After the rating, the bot runs sentiment analysis on the user's review and stores the result (together with the review text, user name, and product) in a BigQuery dataset

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 02 Jan 2026

https://github.com/zeinhasan/etl-using-airflow

Extract Transform Load Using Airflow

airflow bigquery etl

Last synced: 02 Jan 2026

https://github.com/aafaf655/de-zoomcamp2025

This repository contains my work completed during the Data Engineering Zoomcamp 2025

bigquery docker gcp kestra python sql terraform

Last synced: 13 Apr 2026

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 22 Feb 2025

https://github.com/lucashomuniz/project-16

[DATA VISUALIZATION] Raw Data to Business Growth by Integrating GCP, SQL and PBI

bigquery dax-languague gcp m-language powerbi powerbi-visuals powerquery sql

Last synced: 30 Mar 2025

https://github.com/firetyrant/sql-portfolio-projects

Documenting my SQL learning journey with hands-on projects focused on data cleaning, analysis, and optimization.

bigquery data-analysis databases etl learning portfolio query-optimization sql

Last synced: 19 Apr 2026

https://github.com/phstudy/zetasketch-bigquery-example

An example demonstrating how to use ZetaSketch with BigQuery

bigquery hll java zetasketch

Last synced: 17 May 2026

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 10 Apr 2026

https://github.com/mutaharshaik/airflow_retail_project

Airflow retail project using pipeline with BigQuery, dbt, Soda

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 02 Jan 2026

https://github.com/francois-lenne/play-bq-gcp

Data pipeline that retrieves data from the PlayStation API and loads it into BigQuery

bigquery cicd data-engineering google-cloud python

Last synced: 21 Apr 2026

https://github.com/samyomb/olist-ecommerce-analytics

Olist e-commerce performance & customer reviews — Python cleaning + BigQuery SQL + Looker Studio dashboard (2017 FY & 2018 YTD) with actionable insights

analytics bigquery brasil customer-experience dashboard data-visualization e-commerce looker-studio olist python review sql

Last synced: 05 Oct 2025

https://github.com/mahmoud2abdallah/improvado-marketing-homework

This Looker Studio dashboard provides a comprehensive analysis of marketing performance for August 2024, transforming raw data into actionable insights for data-driven decision making.

bigquery business-intelligence data-analysis looker-studio marketing

Last synced: 05 Oct 2025

https://github.com/mihir-robotics/datavault-modeling-dbt

Data Vault 2.0 mini-project that leverages dbt and AutomateDV package for creating an end-to-end pipeline from source to business vault.

bigquery data-vault-architecture dbt-core

Last synced: 05 Oct 2025

https://github.com/vishal786-commits/target-businesscasestudy-sql

This project analyzes Target’s e-commerce transactions in Brazil between 2016 and 2018 using SQL. The goal was to explore customer behavior, order patterns, payments, delivery times, and freight costs to generate actionable business insights.

bigquery data-analysis sql

Last synced: 05 Oct 2025

https://github.com/raksha-17/ai_analysis

Derive actionable insights from two datasets — Global AI Impact and AI Job Market Insights — to inform strategy on AI adoption, market dynamics, and talent planning across countries and industries.

bigquery case-study sql tableau-public

Last synced: 05 Oct 2025

https://github.com/ankitwalimbe/ecommerce-funnel-analysis

SQL-based analysis of the Olist e-commerce dataset — building an order funnel (purchase → approval → delivery) with breakdowns by payment type, product category, region, and monthly trend. Includes insights, CSV exports, and Tableau dashboard.

bigquery business-intelligence data-analysis ecommerce funnel-analysis sql tableau-public

Last synced: 05 Oct 2025

https://github.com/oguzgn/real-estate-ads-website-case-study

Real-estate ads case study: BigQuery data quality & ‘strict’ reporting, contract-level (simple) retention analysis, and BigQuery ML renewal prediction with a Looker Studio dashboard.

bigquery bigquery-ml churn-prediction looker-studio retention

Last synced: 05 Oct 2025

https://github.com/egbe34/sql-portfolio

SQL portfolio showcasing business-focused queries for KPIs, retention, churn, RFM, and Pareto analysis. Built with sample commerce data for analytics and BI use cases.

bigquery business-intelligence churn-analysis cohort-analysis data-analysis kpi postgresql rfmsegmentation sql windowfunction

Last synced: 19 May 2026

https://github.com/kahfisa/dashboard-sales-performance

Dashboard Sales Performance

bigquery lookerstudio

Last synced: 05 Oct 2025

https://github.com/matheusfillipe/bqui

TUI for BigQuery - 100% vibed, don't even use

bigquery bubbleteat charm golang google sql terminal tui

Last synced: 03 May 2026

https://github.com/pathilink/ebury_case

Technical case study in Analytics Engineering using BigQuery, focusing on dimensional modeling and SQL queries for payment and client analysis.

bigquery data modeling sql

Last synced: 05 Oct 2025

https://github.com/arthurcornelio88/dataops_pipeline

⚙️ Data Engineering pipeline for the MLOps workflow. Automates ETL and feature engineering with Airflow on GCP/BigQuery.

airflow bigquery data-engineering dataops gcp mlops mlops-workflow

Last synced: 05 Oct 2025

https://github.com/alyllanes/dbt-analytics-projects

A repository for all the dbt modeling projects I'm working on to practice my skills

analytics-engineering bigquery dbt sql

Last synced: 05 Oct 2025

https://github.com/nvpham12/retail-sales-dashboard

This project demonstrates SQL querying and optimization in Google BigQuery to extract a data sample before loading it into Tableau for dashboarding.

bigquery query-optimization sql tableau-public

Last synced: 05 Oct 2025

https://github.com/manishkaa/google_data_analytics_capstone_case_study

This case study is a part of Google Data Analytics Capstone Project

bigquery data-analysis sql tableau

Last synced: 05 Oct 2025

https://github.com/tmohamedashraft/airline-dwh

The repository contains the complete schema design, queries, and documentation for the airline data warehouse, providing a detailed reference for the data warehouse structure and query optimization techniques.

bigquery dwh gcp kimball sql

Last synced: 05 Oct 2025

https://github.com/newtonmunene99/sec-filings

Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery

bigquery cloudstorage gcp golang

Last synced: 16 May 2026

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/oliveroneill/wilt-cloud-functions

Wilt Google Cloud Functions

bigquery google-cloud-functions

Last synced: 12 May 2026

https://github.com/vitaliikalyta96/data-analysis

These projects are about data analysis using multiple tools and techniques to derive insights from various datasets.

a-b-testing amplitude bigquery google-sheets looker-studio postgresql python sql tableau-public tracking-plan

Last synced: 13 Apr 2026

https://github.com/eddieatgoogle/sql-based-genai-data-pipeline

GenAI data pipeline that performs data preparation, management and performance evaluation tasks for RAG systems using SQL as the primary development language. Please feel free to use this as a starting point for your own projects.

bigquery bqml dataform dbt embeddings gemini google-cloud-platform sql vector-search vertex-ai

Last synced: 25 Feb 2025

https://github.com/giorgishengelia/bike-share-analysis-report

Helps develop a marketing strategy, using data analytics, to convert casual riders into members

bigquery sql tableau

Last synced: 07 Oct 2025

https://github.com/rubnsbarbosa/nasa-asteroids-extractor

ETL asteroids data extractor from GCS bucket to BigQuery

bigquery bucket cloud-storage google-cloud nasa-api-neows

Last synced: 02 Jan 2026

https://github.com/kaushik-puttaswamy/flight-booking-airflow-ci-cd

This project automates flight booking data processing using Apache Airflow, PySpark, and GCP with CI/CD integration for efficient deployment across development and production environments. It orchestrates data transformation, storage in BigQuery, and deployment via GitHub Actions.

airflow-dags bigquery ci-cd gcp pyspark python

Last synced: 17 May 2026

https://github.com/juldrixx/bigquery-avro-schema-converter

Website to convert a schema from one format to another between BigQuery and Avro

avro avro-schema bigquery bigquery-schema converter schema

Last synced: 02 Jan 2026

https://github.com/crudek-data/bigquery-kaggle-apis

Uses the Kaggle API to download free datasets and the Google BigQuery API to read from and write to a cloud data warehouse

bigquery data-engineering kaggle

Last synced: 10 May 2026

https://github.com/olahsymbo/analytics-dashboard-service

data analytics dashboard based on python Plotly-Dash library https://plotly.com/dash/

bigquery css dash flask flask-sqlalchemy gunicorn html plotly postgres python

Last synced: 05 Apr 2026

https://github.com/bruno-furtado/gcp-data-architecture

Example of data architecture to make information available for fast consumption and analytical exploration.

bigquery cloudrun cloudsql dataflow dataform dlq gcp google-cloud logging looker-studio pubsub

Last synced: 09 Oct 2025

https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp

This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.

big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql

Last synced: 02 Jan 2026

https://github.com/jasontanx/ridership-headline-project

This end-to-end data engineering / data analytics project covers Malaysian public transport ridership data.

bigquery data-engineering minio-server public-transport-ridership terraform

Last synced: 09 May 2026

https://github.com/night-fury-me/real-time-vehicle-data-processing

A repository that contains an implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.

bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing

Last synced: 02 Jan 2026

https://github.com/vbalalian/three-gits

Group analytics project for a predictive analytics course. Using the Yelp open dataset to predict restaurant success.

bigquery dbt predictive-analytics python regression sentiment-analysis sklearn sql vader-sentiment-analysis yelp-dataset

Last synced: 13 May 2026

https://github.com/dataforge-projects/covid19_ibge

This repository contains the data analysis carried out for the Tech Challenge of FIAP's postgraduate program in Data Analytics. The project aims to explore and interpret relevant data on the population's behavior during the COVID-19 pandemic, using IBGE's PNAD-COVID19 database. This analysis aims to support decision-making

bigquery excel google-bigquery google-cloud-platform powerbi python sql

Last synced: 18 Feb 2026

https://github.com/edumoraes1/spam_count_sfmc

SQL query counting email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 23 Feb 2026

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Jan 2026

https://github.com/thecodersstudio/node-native-test-runner

Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.

bigquery mock nodejs nodejs-test nodenativetestrunner test

Last synced: 05 Feb 2026

https://github.com/azapeti/bigquery-python-bash-automation

Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.

analytics automation bash bigquery cronjob crontab ga4 python python3

Last synced: 02 Jan 2026

https://github.com/lisabensoussan/bigdata_midterm

This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.

bigquery data-cleaning sql unix-command unix-shell

Last synced: 27 Jan 2026
