Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
![](https://explore-feed.github.com/topics/bigquery/bigquery.png)
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so data can be read directly from external sources.
📖 A highly rated book on the subject is “Google BigQuery: The Definitive Guide”, a comprehensive reference. Another worthwhile read is the article by BigQuery's founding product manager, which tells the inside story of the product on its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-10 00:03:25 UTC
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 17 Jan 2025
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 22 Jan 2025
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 24 Jan 2025
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement functionality.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 19 Dec 2024
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 21 Jan 2025
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 10 Feb 2025
https://github.com/googlecloudplatform/dcm2bq
A service for creating a JSON metadata representation for DICOM from multiple input sources and storing it in Google Cloud BigQuery (BQ).
bigquery dicom gcs googlecloud googlecloudplatform googlecloudstorage json
Last synced: 28 Jan 2025
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 21 Jan 2025
https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
airflow bigquery dataengineering etl gcp googlecloudplatform
Last synced: 21 Jan 2025
https://github.com/johannaojeling/go-data-ingestion
Cloud Function for ingesting data from Cloud Storage to BigQuery
bigquery cloud-functions cloud-storage go google-cloud
Last synced: 31 Jan 2025
https://github.com/metrics-pli/bigquery-export
Exports collected metrics to Google Big Query
bigquery datastudio lighthouse metrics metrics-pli performance pupeteer
Last synced: 25 Jan 2025
https://github.com/mahendra077/ml_model_using_gcp_bigquery
ML Using Big Query
Last synced: 01 Feb 2025
https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin
A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.
big-data-analytics bigquery datawarehouse googlelooker sql
Last synced: 23 Jan 2025
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/tupizz/fiap_pnad-covid-19
This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.
analysis bigquery pyspark python
Last synced: 09 Feb 2025
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 23 Jan 2025
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 21 Nov 2024
https://github.com/miguelapp10/api_simpliroute_urbano
Extracts data from the SimpliRoute and Urbano APIs over a specific date range and processes it for analysis and storage in Google BigQuery.
api-client bigquery pandas python
Last synced: 21 Nov 2024
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 09 Jan 2025
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 07 Feb 2025
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 21 Nov 2024
https://github.com/thunchanokbow/audiblebook-revenue
Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand
apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3
Last synced: 09 Jan 2025
https://github.com/nghiant3110/google_analytic_4
This is a data analysis (DA) project based on the GA4 sample dataset on BigQuery
bigquery google-analytics looker-studio sql
Last synced: 24 Dec 2024
https://github.com/miguelapp10/etl_operadorlogistico
Extracts data from the SimpliRoute, AndesExpress, and Urbano APIs over a specific date range and processes it for analysis and storage in Google BigQuery.
api-client bigquery pandas python
Last synced: 20 Dec 2024
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 21 Nov 2024
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 18 Dec 2024
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 20 Dec 2024
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 22 Jan 2025
https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator
Running BigQuery Query from Airflow using BigQueryExecuteOperator
Last synced: 19 Dec 2024
https://github.com/analyticace/data-engineering-projects
Collection of Open Source Data Engineering Projects
aws big-data bigquery data docker engineering etl oracle-database pipeline sql
Last synced: 22 Dec 2024
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 19 Dec 2024
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 25 Dec 2024
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 25 Dec 2024
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline that moves an uploaded flat file from GCS, masks PII, stores the result in BigQuery, and creates a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 21 Jan 2025
https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart
Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.
bigquery dbt e-commerce quickstarts
Last synced: 17 Jan 2025
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery.
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 15 Jan 2025
https://github.com/tosh2230/pubsub-dataflow-bigquery
Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.
bigquery dataflow gcp google-cloud google-cloud-platform pubsub
Last synced: 21 Jan 2025
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
This ML project uses linear regression to predict bike rental trip time for a new prediction system in a new mobile application. The ML datasets were collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 21 Jan 2025
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 22 Dec 2024
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 21 Nov 2024
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 22 Jan 2025
https://github.com/xxmadkillerx10/data-engineering-zoomcamp
The Data Engineering Zoomcamp covers essential skills in containerization, workflow orchestration, data warehousing, analytics engineering, batch, and streaming processing. It includes tools like Docker, Terraform, BigQuery, dbt, Spark, Kafka, Kestra, Postgres, Google Data Studio, and Metabase.
airflow bigquery data-visualization dbt dbt-clickhouse docker-compose etl gcs google-cloud kafka postgresql spark sql streaming
Last synced: 03 Feb 2025
https://github.com/chiamakaukwuoma/portfolio
This repository contains various projects I've been privileged to work on outside of work.
aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau
Last synced: 03 Feb 2025
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 14 Dec 2024
https://github.com/valenthr/purchase_funnel
Google merch store sales analysis
Last synced: 27 Jan 2025
https://github.com/owox/sgtm-owox-ga4-bigquery
OWOX BI Streaming is an advanced tracking solution for getting the most from the existing Google Analytics 4 installation on your website
Last synced: 20 Dec 2024
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 02 Feb 2025
https://github.com/rifa8/data-warehouse-submission
Learning about Data Warehouse
bigquery citus columnar data-warehouse datalake gcs-bucket
Last synced: 27 Jan 2025
https://github.com/rifa8/extract-load-demo
Learning Google Cloud Platform (GCP)
Last synced: 27 Jan 2025
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 10 Feb 2025
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases on different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 10 Feb 2025
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 25 Dec 2024
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 21 Jan 2025
https://github.com/codingsancho/fastapi-bigquery
Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.
bigquery fastapi javascript python react
Last synced: 20 Dec 2024
https://github.com/adindasarianti/rakamin_kf_analytics
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 02 Jan 2025
https://github.com/newtonmunene99/sec-filings
Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery
bigquery cloudstorage gcp golang
Last synced: 21 Jan 2025
https://github.com/isaacmg/mimic_iv_bq_queries
Queries needed to recreate time series features for model training
Last synced: 21 Jan 2025
https://github.com/djdhairya/uber-data-analytics
Mage Vm
aiml api bigdata bigquery deep-learning docker google-maps-api ml python3 sql ssh vmware
Last synced: 07 Jan 2025
https://github.com/sintef/bigquery-postgresql-wire-proxy
A PostgreSQL wire protocol proxy server for BigQuery.
Last synced: 12 Jan 2025
https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage
Export data from Google BigQuery via Google Storage to Yandex Cloud Object Storage.
bigquery object-storage python3 yandex-cloud yandexcloud
Last synced: 29 Dec 2024
https://github.com/ivdatahub/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 21 Nov 2024
https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020
Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).
bigquery data data-analysis data-visualization python sql tableau
Last synced: 29 Dec 2024
https://github.com/iht/bigquery-dataflow-cdc-example
A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates
apache-beam bigquery cdc dataflow google-cloud pubsub
Last synced: 29 Dec 2024
https://github.com/ivanildobarauna/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 29 Dec 2024
https://github.com/sangnandar/insert-unique-record
This is a Cloud Functions script to insert only unique records into BigQuery.
bigquery digital-marketing-analytics google-cloud-functions
Last synced: 29 Dec 2024
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 31 Jan 2025
https://github.com/oliveroneill/wilt-cloud-functions
Wilt Google Cloud Functions
bigquery google-cloud-functions
Last synced: 07 Jan 2025
https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform
This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.
bigquery data data-analysis data-modeling live-streaming sql
Last synced: 27 Jan 2025
https://github.com/shegzimus/de_nasa_neow_pipeline
Airflow powered ETL pipeline for moving Near-Earth-Object data from NASA to Google Cloud
airflow-dag airflow-operator airflow-providers bigquery celery-redis docker docker-compose docker-container google-cloud-platform googlecloudstorage nasa-api
Last synced: 27 Jan 2025
https://github.com/allanreda/share-of-search-retrieval-and-visualization
Share of search analysis including data retrieval from Google Ads API, storing data in BigQuery and visualizing it in Looker Studio
bigquery google-ads-api looker-studio python share-of-search
Last synced: 28 Dec 2024
https://github.com/ackeecz/terraform-gcp-cloud-run_pubsub_to_bq
Cloud Run subscribes itself to a given topic and inserts each message into a BigQuery table.
Last synced: 07 Jan 2025
https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq
Cloud Function subscribes itself to a given topic and inserts each message into a BigQuery table.
bigquery cloud-functions pubsub terraform-module
Last synced: 07 Jan 2025
https://github.com/riju18/airflow-data-engineering-with-bigquery-and-dbt
Fetch data from a simple CSV file, load it into a GCP BigQuery table, run dbt to automate the DWH, and run Soda to check data quality.
apache-airflow bigquery csv dbt python3 soda
Last synced: 28 Jan 2025
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 10 Feb 2025
https://github.com/eddieatgoogle/sql-based-genai-data-pipeline
GenAI data pipeline that performs data preparation, management and performance evaluation tasks for RAG systems using SQL as the primary development language. Please feel free to use this as a starting point for your own projects.
bigquery bqml dataform embeddings gemini google-cloud-platform sql vector-search vertex-ai
Last synced: 08 Jan 2025
https://github.com/oguzgn/data-science-for-business-imp
A case study for business improvement
ab-testing bigquery data-science data-visualization debugging looker marketing-analytics sheets
Last synced: 21 Jan 2025
https://github.com/humairarizwan/uber-ride-dataengineering-analysis
This project creates a pipeline to process data and performs data analytics on Uber data.
bigquery dataanalysis dataengineering gcp-project googlestorage looker-studio
Last synced: 21 Jan 2025
https://github.com/goatcheesesaladwithpeanutoildressing/scio-demo
Playing w/ Scio
Last synced: 08 Jan 2025
https://github.com/sergeimakarovv/wine-recommendation-analytics
Wine recommendation system
airflow bigquery pandas postgresql tableau
Last synced: 08 Jan 2025
https://github.com/kellyjadams/ap-exam-scores
Analyzing AP exam scores for a school.
Last synced: 08 Jan 2025
https://github.com/flowerinthenight/bqstream
A simple library to help facilitate streaming to BigQuery.
Last synced: 08 Jan 2025
https://github.com/paulveillard/cybersecurity-analytics
An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.
analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization
Last synced: 02 Feb 2025
https://github.com/ansh-info/stockpulse
Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.
api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit
Last synced: 27 Dec 2024
https://github.com/simoun-asmar/clinipet_project
BigQuery
bigquery looker-studio lookerstudio sql
Last synced: 13 Jan 2025
https://github.com/hitthecodelabs/bigquery_ml
Jupyter notebooks that utilize Google BigQuery's machine learning capabilities.
Last synced: 04 Feb 2025
https://github.com/phukon/package-insights
PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.
Last synced: 27 Jan 2025
https://github.com/rrmcguinness/protoc-gen-bq-schema
A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.
bigquery bigquery-schema protobuf
Last synced: 13 Jan 2025
https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-
This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.
bigquery consumer-insights data-analysis database sql target
Last synced: 21 Jan 2025
https://github.com/vedantwalia/google-data-analytics-capstone-case-study
This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone
bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public
Last synced: 21 Jan 2025
https://github.com/aazuspan/landsat-bigquery
Summarizing 51 years of Landsat data using Earth Engine and BigQuery
bigquery google-earth-engine landsat
Last synced: 21 Jan 2025