Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
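As a rough illustration of the Python client library mentioned above, here is a minimal sketch that runs a parameterized SQL query against a BigQuery public dataset. It assumes the `google-cloud-bigquery` package is installed and that Google Cloud credentials are configured. The helper names `build_top_names_query` and `run_top_names` are hypothetical, and the `usa_names` public table is used only as an example.

```python
"""Sketch: querying a BigQuery public dataset from Python (illustrative only)."""

# Public sample table, chosen here purely as an example.
PUBLIC_TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_top_names_query(limit: int = 10) -> str:
    """Return SQL for the most common names in one state.

    The state is passed as the named query parameter @state; the limit is
    validated and embedded directly in the SQL text.
    """
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{PUBLIC_TABLE}`\n"
        "WHERE state = @state\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def run_top_names(state: str = "TX", limit: int = 10):
    """Run the query. Requires google-cloud-bigquery and GCP credentials."""
    # Imported lazily so build_top_names_query works without the library.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up default credentials and project
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", state),
        ]
    )
    rows = client.query(build_top_names_query(limit), job_config=job_config).result()
    return [(row["name"], row["total"]) for row in rows]
```

Calling `run_top_names("TX", 5)` would submit the query as a job and return up to five `(name, total)` pairs. Passing the state as a query parameter rather than formatting it into the SQL string avoids injection issues.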

📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article marking its 10th anniversary.

https://github.com/tomgorb/project-template-for-production

project template to (help) put a Machine/Deep learning algorithm into production

airflow bigquery gcp

Last synced: 09 Jan 2025

https://github.com/mchmarny/sbomer

Generates daily SBOM and vulnerability reports for container images, saving the resulting files into a GCS bucket and the data into BigQuery tables.

bigquery gcp gcs grype report sbom syft vex vulnerability

Last synced: 31 Dec 2024

https://github.com/pmhalvor/whale-speech

A pipeline to map whale sightings to hydrophone audio

beam bigquery gcs mle model-as-a-service python tensorflow2

Last synced: 20 Dec 2024

https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery

This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery.

bigquery bigquery-dataset bigquery-table querying sql sql-query

Last synced: 15 Jan 2025

https://github.com/mchmarny/xstreams

Stream processing using Cloud PubSub and Dataflow SQL in BigQuery

bigquery dataflow gce gcp golang pubsub stream

Last synced: 31 Dec 2024

https://github.com/mchmarny/automodel

BigQuery automatic model rebuild based on r2 score deviation

bigquery gcp iot ml model

Last synced: 31 Dec 2024

https://github.com/antoinegiraud/dataform_hypermarche

SQL repo orchestrated by Dataform for BigQuery

bigquery dataform

Last synced: 08 Jan 2025

https://github.com/anilkhichar/bq-table-copy-automation

Copy a table from one dataset to another in Google BigQuery using a bash script

automation bash bash-script big-query bigquery bigquery-cp gcp google

Last synced: 29 Dec 2024

https://github.com/johannaojeling/go-data-ingestion

Cloud Function for ingesting data from Cloud Storage to BigQuery

bigquery cloud-functions cloud-storage go google-cloud

Last synced: 31 Jan 2025

https://github.com/analyticace/data-engineering-projects

Collection of Open Source Data Engineering Projects

aws big-data bigquery data docker engineering etl oracle-database pipeline sql

Last synced: 22 Dec 2024

https://github.com/fpopic/bigquery-schema-select

(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.

bigquery bigquery-schema scala sql

Last synced: 21 Jan 2025

https://github.com/alimarzouk/paris-aq

ELTL pipeline to monitor air quality in the Paris Île-de-France area

airflow airquality big-data bigquery dataengineering gcs spark

Last synced: 22 Jan 2025

https://github.com/alterra-greeve/de-capstone

Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer

bigquery cloud-function data-engineering docker googlefirebase looker-studio python

Last synced: 21 Nov 2024

https://github.com/rohitsanj/superset-dbt-demo

This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.

apache-superset bigquery dbt superset

Last synced: 23 Jan 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 21 Jan 2025

https://github.com/miguelapp10/api_simpliroute_urbano

Extracts data from the SimpliRoute and Urbano APIs for a specific date range and processes it for analysis and storage in Google BigQuery

api-client bigquery pandas python

Last synced: 21 Nov 2024

https://github.com/paulpierre/google-bq-export-downloader

Google BigQuery Export Downloader

big-data bigquery dump export gcs

Last synced: 21 Jan 2025

https://github.com/nais/bqrator

Operator for creating BigQuery datasets

bigquery bigquery-operator kubernetes kubernetes-operator nais-features

Last synced: 04 Feb 2025

https://github.com/pedrocarmona/big_query_adapter

An ActiveRecord Google BigQuery adapter

activerecord bigquery gem ruby-on-rails

Last synced: 21 Nov 2024

https://github.com/thunchanokbow/inventory-amazon

Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.

azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3

Last synced: 09 Jan 2025

https://github.com/thunchanokbow/audiblebook-revenue

Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand

apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3

Last synced: 09 Jan 2025

https://github.com/nghiant3110/google_analytic_4

This is a data analysis project based on the GA4 sample dataset in BigQuery

bigquery google-analytics looker-studio sql

Last synced: 24 Dec 2024

https://github.com/kyoshidajp/bqcop

Save your BigQuery cost.

bigquery golang

Last synced: 21 Jan 2025

https://github.com/kellyjadams/bigquery-python-weekly-report

A script to automate a weekly report that runs BigQuery in Python.

bigquery python

Last synced: 22 Jan 2025

https://github.com/mattwelke/packt-book-bot

Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.

bigquery bot ebooks ibm-cloud-functions java openwhisk

Last synced: 18 Dec 2024

https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator

Running BigQuery Query from Airflow using BigQueryExecuteOperator

airflow bigquery sql

Last synced: 19 Dec 2024

https://github.com/mehmoodulhaq570/bigquery_machine_learning_project

Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.

bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python

Last synced: 22 Dec 2024

https://github.com/mchirico/gmail

Inserts Gmail messages into BigQuery, then deletes them.

angular9 bigquery gcp gmail python3

Last synced: 23 Jan 2025

https://github.com/ajaxbarcelonacruyff/gcp_cost

Monitoring Google Cloud costs with Looker Studio.

bigquery googlecloud googlecloudplatform lookerstudio

Last synced: 25 Dec 2024

https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source

Creating GA4 session references in BigQuery.

bigquery ga4 googleanalytics

Last synced: 25 Dec 2024

https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart

Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.

bigquery dbt e-commerce quickstarts

Last synced: 17 Jan 2025

https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq

Dataflow job that subscribes to a PubSub subscription. It takes messages from the subscription and pushes them into a BigQuery table.

bigquery dataflow pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/ritu456286/smartstockai

SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.

bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api

Last synced: 30 Jan 2025

https://github.com/yu-iskw/terraform-google-copy-bq-datasets

A terraform module to copy BigQuery datasets across regions

bigquery data-engineering google-cloud terraform

Last synced: 21 Dec 2024

https://github.com/metrics-pli/bigquery-export

Exports collected metrics to Google BigQuery

bigquery datastudio lighthouse metrics metrics-pli performance pupeteer

Last synced: 25 Jan 2025

https://github.com/romange/puma

Bigquery-like engine for processing structured json-like records

bigquery cpp11 engine

Last synced: 23 Jan 2025

https://github.com/greenpeace/gpes-old-en-petitions-api-emulator

Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.

bigquery mysql petitions sqlite3

Last synced: 17 Nov 2024

https://github.com/oguzgn/fully-automated-performance-marketing-dashboard

This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.

bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql

Last synced: 29 Jan 2025

https://github.com/gabrieladados/people-analytics

People Analytics: Insights for Talent Retention

bigquery figma people-analytics sql tableau

Last synced: 29 Jan 2025

https://github.com/sejalmankar1012/product_data_analyst_assessement

Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub

assessment-project bigquery loop product-analyst sql-query

Last synced: 26 Jan 2025

https://github.com/scraly/bigquery

Google BigQuery AaaS tools, tips and fun

bigquery java

Last synced: 25 Dec 2024

https://github.com/tosh2230/cdc-rds-bq

Change data capture from Amazon RDS to Google BigQuery

bigquery changedatacapture rds

Last synced: 21 Jan 2025

https://github.com/brpy/nyc-trips

Data engineering | Zoomcamp journey on nyc trip data with gcp stack

bigquery dbt gcp pyspark

Last synced: 22 Dec 2024

https://github.com/moeabbas6/dbt_analytics_engine

An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery, and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.

analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing

Last synced: 21 Jan 2025

https://github.com/codingsancho/fastapi-bigquery

Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.

bigquery fastapi javascript python react

Last synced: 20 Dec 2024

https://github.com/adindasarianti/rakamin_kf_analytics

This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023

bigquery dataanalytics lookerstudio

Last synced: 02 Jan 2025

https://github.com/newtonmunene99/sec-filings

Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery

bigquery cloudstorage gcp golang

Last synced: 21 Jan 2025

https://github.com/dobsontom/basket-abandonment

Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.

adobe-campaign bigquery ga4 gcp sql

Last synced: 21 Nov 2024

https://github.com/niteshchawla/nc-sql-business-case

A Leading Retail chain brand and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver.

bigquery retail sql supermarket

Last synced: 21 Jan 2025

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 21 Jan 2025

https://github.com/hrialan/dataform-prune

An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.

automation bigquery data-analytics dataform

Last synced: 21 Nov 2024

https://github.com/sahilmb/employee-churn-da

A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret and Colab

bigquery looker-studio pycaret

Last synced: 21 Nov 2024

https://github.com/ket0825/v1-gcp-preview

GCP repository for the Preview service: manages the GCP source for preview services

bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub

Last synced: 21 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Nov 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 21 Nov 2024

https://github.com/thanhloc81/sql-project-bicycles-practise

✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules

adventureworks bicycle bigquery google-cloud sql

Last synced: 21 Jan 2025

https://github.com/ivdatahub/pypi-package-stats

Project for ingesting PyPI package data from BigQuery and sending it to DataDog for analysis and insights with dashboards, monitors and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 21 Nov 2024

https://github.com/juldrixx/bigquery-avro-schema-converter

Website to convert a schema from one format to another between BigQuery and Avro

avro avro-schema bigquery bigquery-schema converter schema

Last synced: 22 Jan 2025

https://github.com/branb97/jobstreet-data-eng-project

Building a data pipeline to deliver job listing data from Jobstreet for analysis.

airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql

Last synced: 22 Jan 2025

https://github.com/push-protocol/push-google-bigquery

The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis

bigquery data push push-notifications web3

Last synced: 31 Jan 2025

https://github.com/minhajuddin2510/bigquery_alerts

In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this

alerting bigquery cloudfunctions monitoring-tool slack

Last synced: 31 Jan 2025

https://github.com/paulveillard/cybersecurity-analytics

An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.

analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization

Last synced: 02 Feb 2025

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 29 Jan 2025

https://github.com/rrmcguinness/protoc-gen-bq-schema

A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.

bigquery bigquery-schema protobuf

Last synced: 13 Jan 2025

https://github.com/giorgishengelia/bike-share-analysis-report

Uses data analytics to help develop a marketing strategy for converting casual riders into members

bigquery sql tableau

Last synced: 05 Feb 2025

https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery

Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.

bigquery csv-import google-apps-script google-cloud-storage

Last synced: 13 Jan 2025

https://github.com/kellyjadams/ap-exam-scores

Analyzing AP exam scores for a school.

bigquery sql

Last synced: 08 Jan 2025

https://github.com/marceloneppel/gcs-to-bigquery

WIP: Moving data from GCS to BigQuery.

bigquery gcs scala scio

Last synced: 30 Jan 2025

https://github.com/bsrikanth24/etl-pipeline-project-sales

This project implements an ETL pipeline to process fictitious sales data.

bigquery pandas-dataframe python

Last synced: 06 Feb 2025

https://github.com/rafal-kowalski-dev/selling-cars-analize

Hobby project for learning PySpark, AirFlow and BigQuery

airflow bigquery gcp pyspark python sqlalchemy

Last synced: 30 Jan 2025

https://github.com/seahrh/nyc-taxi-trips

REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7

bigquery nyc-taxi-dataset play-framework rest-api scala

Last synced: 03 Feb 2025

https://github.com/siriospa/gcp-helpers-bigquery

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform sirio

Last synced: 12 Oct 2024

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 09 Jan 2025

https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis

This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.

bigquery gcp google-cloud-platform looker-studio sql

Last synced: 21 Nov 2024

https://github.com/denny-b-justin/purdue

The internship broadly aimed to understand whether topics and events are covered differently in different countries and how that coverage affects stock market returns. The provided dataset is a post-processed set of news articles, so it already reflects topic modelling and sentiment analysis.

big-data bigquery finance gdelt-events python

Last synced: 21 Jan 2025

https://github.com/amitkumarj441/mysql2bigquery

A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.

bigquery docker mysql scala

Last synced: 26 Jan 2025

https://github.com/sayed-ashfaq/target-sql

In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.

aggregation bigquery cte joins sql

Last synced: 23 Dec 2024

https://github.com/armahdavi/bigdata_pyspark_sales_analytics

A summary of my big data code in Python and PySpark for analyzing retail and Walmart superstore sales data to draw sales insights

big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning

Last synced: 28 Dec 2024

https://github.com/fsistemas/bigquery-td

ETL to extract data from MySQL, then load and merge it in BigQuery

bigquery etl mysql python sql

Last synced: 03 Jan 2025

https://github.com/alexgenovese/machine-learning-bigquery-gcp

These SQL queries are based on an available ecommerce dataset with millions of Google Analytics records for the Google Merchandise Store, loaded into BigQuery.

bigquery google google-cloud-platform purchase sql visitors

Last synced: 28 Dec 2024

https://github.com/gabrieladados/analise-ecommerce

SQL Analysis for E-commerce: Growth Strategies to Boost Sales

bigquery data-analysis ecommerce sql

Last synced: 06 Feb 2025

https://github.com/ansh-info/stockpulse

Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.

api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit

Last synced: 27 Dec 2024

https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai

An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI

bigquery gcp tensorflow tfx vertex-ai

Last synced: 13 Jan 2025
