Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article marking its 10th anniversary.
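As an illustration of the SQL-plus-client-library workflow described above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package and Google Cloud credentials are available when `make_client()` is called; the helper names (`build_top_names_sql`, `top_names`, `make_client`) are hypothetical and exist only for this example, and the query targets a table from BigQuery's public datasets.

```python
"""Sketch: run a SQL query through the BigQuery Python client library."""

# A table from BigQuery's public datasets (assumed to be readable by your project).
TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_sql(state: str, limit: int) -> str:
    """Return SQL for the most common names in one US state (hypothetical helper)."""
    # Validate inputs because they are interpolated into the SQL text.
    if len(state) != 2 or not state.isalpha():
        raise ValueError("state must be a two-letter code")
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT name, SUM(number) AS total_people\n"
        f"FROM `{TABLE}`\n"
        f"WHERE state = '{state.upper()}'\n"
        "GROUP BY name\n"
        "ORDER BY total_people DESC\n"
        f"LIMIT {int(limit)}"
    )


def top_names(client, state: str = "TX", limit: int = 5):
    """Run the query and return a list of (name, total_people) pairs.

    `client` is any object with a BigQuery-style `query(sql).result()` API,
    normally a `google.cloud.bigquery.Client`.
    """
    rows = client.query(build_top_names_sql(state, limit)).result()
    return [(row["name"], row["total_people"]) for row in rows]


def make_client():
    """Create a real client; needs google-cloud-bigquery and credentials."""
    # Imported lazily so the sketch can be loaded without the package installed.
    from google.cloud import bigquery

    return bigquery.Client()
```

With credentials configured, `print(top_names(make_client()))` would run the query against BigQuery. Passing the client in as an argument keeps the query logic easy to exercise with a stand-in object during local testing.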
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-05 00:03:39 UTC
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into the Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 21 Jan 2025
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 19 Dec 2024
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 19 Dec 2024
https://github.com/salrashid123/gcp_cloud_status_dataset
BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)
bigquery gcp google-cloud google-cloud-platform
Last synced: 22 Jan 2025
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 23 Jan 2025
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/googlecloudplatform/dcm2bq
A service for creating a JSON metadata representation for DICOM from multiple input sources and storing it in Google Cloud BigQuery (BQ).
bigquery dicom gcs googlecloud googlecloudplatform googlecloudstorage json
Last synced: 28 Jan 2025
https://github.com/shrawans007/google_cyclistic_2023
Google Data Analytics Capstone Case Study (SQL and Tableau)
big-query bigquery coursera-assignment cyclistic cyclistic-bike-share-analysis-case-study cyclistic-bikshare data-analysis data-analysis-project data-analytics data-cleaning data-combination data-exploration data-science google-data-analytics sql tableau tableau-dashboard tableau-public
Last synced: 08 Jan 2025
https://github.com/miguelapp10/etl_operadorlogistico
Extract data from the SimpliRoute, AndesExpress, and Urbano APIs over a specific date range and process it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 20 Dec 2024
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/nghiant3110/google_analytic_4
This is a DA project based on the GA4 sample dataset in BigQuery
bigquery google-analytics looker-studio sql
Last synced: 24 Dec 2024
https://github.com/tosh2230/pubsub-dataflow-bigquery
Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.
bigquery dataflow gcp google-cloud google-cloud-platform pubsub
Last synced: 21 Jan 2025
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 22 Jan 2025
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 18 Dec 2024
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 08 Jan 2025
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 18 Dec 2024
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 22 Jan 2025
https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator
Running BigQuery Query from Airflow using BigQueryExecuteOperator
Last synced: 19 Dec 2024
https://github.com/thunchanokbow/audiblebook-revenue
Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand
apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3
Last synced: 09 Jan 2025
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 09 Jan 2025
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 25 Dec 2024
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 25 Dec 2024
https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart
Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.
bigquery dbt e-commerce quickstarts
Last synced: 17 Jan 2025
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 14 Dec 2024
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 21 Nov 2024
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 15 Jan 2025
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 21 Jan 2025
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 22 Jan 2025
https://github.com/romange/puma
Bigquery-like engine for processing structured json-like records
Last synced: 23 Jan 2025
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 21 Jan 2025
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 21 Nov 2024
https://github.com/nais/bqrator
Operator for creating BigQuery datasets
bigquery bigquery-operator kubernetes kubernetes-operator nais-features
Last synced: 04 Feb 2025
https://github.com/miguelapp10/api_simpliroute_urbano
Extract data from the SimpliRoute and Urbano APIs over a specific date range and process it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 21 Nov 2024
https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
airflow bigquery dataengineering etl gcp googlecloudplatform
Last synced: 21 Jan 2025
https://github.com/analyticace/data-engineering-projects
Collection of Open Source Data Engineering Projects
aws big-data bigquery data docker engineering etl oracle-database pipeline sql
Last synced: 22 Dec 2024
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 11 Jan 2025
https://github.com/tatamiya/new-books-notification
Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and send notifications about them to Slack
bigquery cloudrun-jobs gcs golang slack
Last synced: 12 Jan 2025
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 12 Jan 2025
https://github.com/chukwuemekaaham/uber-gcp-etl-project
Data Engineering Zoomcamp Final Project
bigquery cloud-storage csv docker-compose gcp jupyter-notebook looker-studio mageai python spark spreadsheets terraform
Last synced: 10 Jan 2025
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage for Google Cloud Platform (GCP) trigger resources applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 18 Jan 2025
https://github.com/nghiant3110/b2b_crm_3
This is a DA project based on the B2B Sales CRM dataset from Maven Analytics
bigquery google-sheets looker-studio sql
Last synced: 24 Dec 2024
https://github.com/andrewm4894/gcp-telemetry-example
Simple HTTP endpoint for telemetry data type events in GCP.
bigquery gcp-cloud-functions gcp-storage python terraform
Last synced: 01 Feb 2025
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 18 Dec 2024
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 19 Dec 2024
https://github.com/hcrlau/cyclistic-bike-share-analysis
Google Data Analytics Capstone Project
bigquery cyclistic-bike-share-analysis-case-study data-analysis data-visualization sql tableau
Last synced: 19 Dec 2024
https://github.com/smohanta23/uber_data-engineering_etl-project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api
Last synced: 21 Jan 2025
https://github.com/zkan/data-engineering-on-gcp
Data Engineering on Google Cloud Platform (GCP)
bigquery data-engineering data-lake data-pipeline data-warehouse gcs google-cloud-platform machine-learning
Last synced: 19 Dec 2024
https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
Last synced: 25 Dec 2024
https://github.com/francois-lenne/elt-mp4-quiberon
The goal of this project is to retrieve video from the municipality of Quiberon and detect whether or not a person appears in it
bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data
Last synced: 25 Dec 2024
https://github.com/fakhri098/project-sql-bigquery
This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.
Last synced: 17 Jan 2025
https://github.com/celiason/coffee-funnel
webpage for visualizing sales projections of a small coffee business
bigquery prophet sales-analysis streamlit-webapp
Last synced: 26 Dec 2024
https://github.com/yeha98555/google-maps-analysis-pipeline
Taiwan Travel Attractions Analysis Data Pipeline
airflow bigquery cloudfunctions docker gcp gcs googlemaps googlesheets python terraform
Last synced: 23 Jan 2025
https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services
bigquery database mysql-server sql
Last synced: 26 Dec 2024
https://github.com/epomatti/gcp-bigquery
Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream
bigquery cloud-sql datastream gcp
Last synced: 17 Jan 2025
https://github.com/spacepatcher/google-workspace-gmail-collector
👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka
bigquery gmail google-workspace security soc
Last synced: 23 Oct 2024
https://github.com/anyesh/gbq-helpers
GBQ related helper functions and snippets.
Last synced: 10 Jan 2025
https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink
Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster
aiven bigquery kafka-connect sink-connector terraform terraform-modules
Last synced: 17 Jan 2025
https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis
This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.
bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis
Last synced: 03 Jan 2025
https://github.com/shikanime/seeker
Data platform based on BigQuery
bigquery dataform google-cloud
Last synced: 04 Jan 2025
https://github.com/marceloneppel/map-to-bigquery-structs
Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.
Last synced: 30 Jan 2025
https://github.com/lu-sketch/google-big-query-sql---credit-risk-analysis
Big Query SQL Credit Risk Analysis
big-data bigquery credit-risk sql
Last synced: 18 Jan 2025
https://github.com/yasarsultan/taxi-trip-analysis
The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.
airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark
Last synced: 11 Jan 2025
https://github.com/denisogr/kaggle-notebook-to-production
This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.
bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark
Last synced: 12 Jan 2025
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
analytics bigquery dbt engineering sql
Last synced: 12 Jan 2025
https://github.com/shvetsihorr/sql-projects
SQL and Google BigQuery-Portfolio Projects
azuredatastudio bigquery mssql postgresql sql
Last synced: 18 Jan 2025
https://github.com/bedirk/sql-projects-studies
My Projects and Studies by using SQL
azuredatastudio bigquery jupyter-notebook kaggle mssqlserver sql
Last synced: 18 Jan 2025
https://github.com/rolandbende/python-bigquery-migrations
Python bigquery-migrations package is for creating and manipulating BigQuery databases easily.
bigquery google migration-automation migration-scripts migration-tool migrations python
Last synced: 24 Jan 2025
https://github.com/ddzikri/analisis-data-kimia-farma
Project Based Internship Kimia Farma Rakamin Academy
Last synced: 24 Jan 2025
https://github.com/stoqey/rasputia
Rasputia Latimore - The Big Data Bitch 💋
Last synced: 19 Jan 2025
https://github.com/vikasgupta1812/google_cloudml_scripts
https://goo.gl/dFjFQf
bigquery google-cloud-ml python-language tensorflow-tutorial
Last synced: 20 Jan 2025
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 26 Jan 2025
https://github.com/jasontanx/ridership-headline-project
This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 01 Feb 2025
https://github.com/hariprasath-v/mh_google_cloud_bigquery_ltv_prediction_challenge
Build a model that can predict customers' Long Term Value (LTV).
bigquery colab-notebook klib machine-learning matplotlib numpy pandas python python3 seaborn
Last synced: 13 Jan 2025
https://github.com/richardbnk/data_tools
Python Library to Accelerate Creation of Data ETL Processes on multiple database systems.
Last synced: 02 Feb 2025
https://github.com/knands42/data-ingestion
Data Ingestion project to evaluate my Kotlin skill using concurrency
bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow
Last synced: 25 Jan 2025
https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game
This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.
ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql
Last synced: 26 Jan 2025
https://github.com/alessio-siciliano/google-cloud-python-class-wrapper
An example of several classes written in Python to interact with GCP
bigquery datatransfer gcp google-cloud
Last synced: 26 Jan 2025
https://github.com/sejalmankar1012/product_data_analyst_assessement
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assessment-project bigquery loop product-analyst sql-query
Last synced: 26 Jan 2025
https://github.com/lucashomuniz/project-22
[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio
agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability
Last synced: 25 Jan 2025
https://github.com/googleapis/google-cloud-cpp-bigquery
C++ Client Library for Google Cloud BigQuery
bigquery cloud cpp cpp17 google google-cloud-bigquery google-cloud-platform
Last synced: 01 Feb 2025
https://github.com/minhajuddin2510/bigquery_alerts
In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this
alerting bigquery cloudfunctions monitoring-tool slack
Last synced: 31 Jan 2025
https://github.com/jasontanx/terraform-practice
Creating datasets and tables in Google BigQuery via Terraform
bigquery iac-terraform infrastructure-as-code terraform
Last synced: 01 Feb 2025
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 18 Dec 2024
https://github.com/sintef/bigquery-postgresql-wire-proxy
A PostgreSQL wire protocol proxy server for BigQuery.
Last synced: 12 Jan 2025
https://github.com/alessio-siciliano/bigquery-advanced-utils
BigQuery-advanced-utils is a lightweight utility library that extends the official Google BigQuery Python client. It simplifies tasks like query management, data processing, and automation. Aimed at developers and data scientists, the project is open to contributions to improve and enhance its functionality.
bigquery datatransfer google-cloud python
Last synced: 01 Feb 2025
https://github.com/ansh-info/stockpulse
Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.
api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit
Last synced: 27 Dec 2024
https://github.com/sangnandar/insert-unique-record
This is a Cloud Functions script to insert only unique records into BigQuery.
bigquery digital-marketing-analytics google-cloud-functions
Last synced: 29 Dec 2024
https://github.com/ivanildobarauna/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 29 Dec 2024
https://github.com/iht/bigquery-dataflow-cdc-example
A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates
apache-beam bigquery cdc dataflow google-cloud pubsub
Last synced: 29 Dec 2024
https://github.com/oliveroneill/wilt-cloud-functions
Wilt Google Cloud Functions
bigquery google-cloud-functions
Last synced: 07 Jan 2025
https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform
This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.
bigquery data data-analysis data-modeling live-streaming sql
Last synced: 27 Jan 2025