Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data directly from external sources.

📖 A highly rated, canonical book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating the service’s 10th anniversary.

https://github.com/miguelapp10/api_simpliroute_urbano

Extract data from the SimpliRoute and Urbano APIs for a specific date range and process it for analysis and storage in Google BigQuery.

api-client bigquery pandas python

Last synced: 21 Nov 2024

https://github.com/pedrocarmona/big_query_adapter

An ActiveRecord Google BigQuery adapter

activerecord bigquery gem ruby-on-rails

Last synced: 21 Nov 2024

https://github.com/thunchanokbow/inventory-amazon

Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.

azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3

Last synced: 09 Jan 2025

https://github.com/thunchanokbow/audiblebook-revenue

Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand

apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3

Last synced: 09 Jan 2025

https://github.com/kellyjadams/bigquery-python-weekly-report

A script to automate a weekly report that runs BigQuery in Python.

bigquery python

Last synced: 22 Jan 2025

https://github.com/cartodb/carto-auth

Python library to authenticate with CARTO

auth bigquery carto carto-dw oauth

Last synced: 12 Oct 2024

https://github.com/icarusso/bigqueryexporter

Export query data from Google BigQuery to a local machine

bigquery csv export python

Last synced: 21 Nov 2024

https://github.com/miguelapp10/etl_operadorlogistico

Extract data from the SimpliRoute, AndesExpress, and Urbano APIs for a specific date range and process it for analysis and storage in Google BigQuery.

api-client bigquery pandas python

Last synced: 20 Dec 2024

https://github.com/paulveillard/cybersecurity-analytics

An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.

analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization

Last synced: 07 Dec 2024

https://github.com/ritu456286/smartstockai

SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.

bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api

Last synced: 30 Jan 2025

https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin

A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.

big-data-analytics bigquery datawarehouse googlelooker sql

Last synced: 23 Jan 2025

https://github.com/yaph/queries

Collection of Data Queries in SPARQL and SQL

bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata

Last synced: 08 Jan 2025

https://github.com/justinbeckwith/bisquick

🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.

bigquery dotnet github

Last synced: 19 Dec 2024

https://github.com/nais/bqrator

Operator for creating BigQuery datasets

bigquery bigquery-operator kubernetes kubernetes-operator nais-features

Last synced: 09 Dec 2024

https://github.com/pmhalvor/whale-speech

A pipeline to map whale sightings to hydrophone audio

beam bigquery gcs mle model-as-a-service python tensorflow2

Last synced: 20 Dec 2024

https://github.com/mchmarny/sbomer

Generates daily SBOM and vulnerability reports for container images and saves resulting files into GCS bucket and data into BigQuery tables.

bigquery gcp gcs grype report sbom syft vex vulnerability

Last synced: 31 Dec 2024

https://github.com/mchmarny/xstreams

Stream processing using Cloud PubSub and Dataflow SQL in BigQuery

bigquery dataflow gce gcp golang pubsub stream

Last synced: 31 Dec 2024

https://github.com/mchmarny/automodel

BigQuery automatic model rebuild based on r2 score deviation

bigquery gcp iot ml model

Last synced: 31 Dec 2024

https://github.com/rohitsanj/superset-dbt-demo

This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.

apache-superset bigquery dbt superset

Last synced: 23 Jan 2025

https://github.com/nghiant3110/google_analytic_4

A data analysis (DA) project based on the GA4 sample dataset in BigQuery

bigquery google-analytics looker-studio sql

Last synced: 24 Dec 2024

https://github.com/mattwelke/packt-book-bot

Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.

bigquery bot ebooks ibm-cloud-functions java openwhisk

Last synced: 18 Dec 2024

https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator

Running BigQuery Query from Airflow using BigQueryExecuteOperator

airflow bigquery sql

Last synced: 19 Dec 2024

https://github.com/ajaxbarcelonacruyff/gcp_cost

Monitoring Google Cloud costs with Looker Studio.

bigquery googlecloud googlecloudplatform lookerstudio

Last synced: 25 Dec 2024

https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source

Creating GA4 session references in BigQuery.

bigquery ga4 googleanalytics

Last synced: 25 Dec 2024

https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart

Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.

bigquery dbt e-commerce quickstarts

Last synced: 17 Jan 2025

https://github.com/romange/puma

BigQuery-like engine for processing structured, JSON-like records

bigquery cpp11 engine

Last synced: 23 Jan 2025

https://github.com/gdbecker/dbtlabslearning

Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.

analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing

Last synced: 22 Jan 2025

https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery

BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360

bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark

Last synced: 11 Jan 2025

https://github.com/tatamiya/new-books-notification

Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and post notifications about them to Slack

bigquery cloudrun-jobs gcs golang slack

Last synced: 12 Jan 2025

https://github.com/samedhi/gaend

Convert GAE Models into endpoints

bigquery elasticsearch google-app-engine restful taskqueue

Last synced: 12 Jan 2025

https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger

Terraform usage on Google Cloud Platform (GCP) for trigger-based resources applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.

bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform

Last synced: 18 Jan 2025

https://github.com/elithrar/finding-bugs-with-bigquery

A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.

big-data bigquery bugs github golang open-source

Last synced: 24 Jan 2025

https://github.com/misicode/Kaggle-Intro_to_SQL

Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.

bigquery kaggle kaggle-intro-to-sql sql

Last synced: 23 Oct 2024

https://github.com/johannaojeling/go-data-ingestion

Cloud Function for ingesting data from Cloud Storage to BigQuery

bigquery cloud-functions cloud-storage go google-cloud

Last synced: 31 Jan 2025

https://github.com/metrics-pli/bigquery-export

Exports collected metrics to Google BigQuery

bigquery datastudio lighthouse metrics metrics-pli performance pupeteer

Last synced: 25 Jan 2025

https://github.com/justinjsd/analytics-engineer-bootcamp

This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.

analytics bigquery dbt engineering sql

Last synced: 05 Nov 2024

https://github.com/vigneshss-07/complete-atoz-sql

This deals with SQL commands, interview preparation and query questions and solutions

azuresql bigquery gcp sql sql-query sql-server sqlalchemy

Last synced: 15 Nov 2024

https://github.com/valenthr/purchase_funnel

Google merch store sales analysis

bigquery product-analysis sql

Last synced: 27 Jan 2025

https://github.com/night-fury-me/real-time-vehicle-data-processing

A repository that contains implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.

bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing

Last synced: 22 Jan 2025

https://github.com/coatless/bigquery-reddit-ask-your-advisor

Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")

ask-your-advisor bigquery r reddit

Last synced: 16 Jan 2025

https://github.com/shikanime/seeker

Data platform based on BigQuery

bigquery dataform google-cloud

Last synced: 04 Jan 2025

https://github.com/jancervenka/bqcli

REPL for BigQuery

bigquery data-science gcp google python

Last synced: 31 Dec 2024

https://github.com/marceloneppel/map-to-bigquery-structs

Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.

bigquery golang map struct

Last synced: 30 Jan 2025

https://github.com/nikhilsree5/targetcasestudy

An exploratory and in-depth study of the e-commerce market in Brazil.

bigquery eda sql visualization

Last synced: 22 Jan 2025

https://github.com/crudek-data/bigquery-kaggle-apis

Uses the Kaggle API to download free datasets and the Google BigQuery API to read from and write to a cloud data warehouse

bigquery data-engineering kaggle

Last synced: 22 Jan 2025

https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq

A Cloud Function that subscribes to a given topic and inserts each message into a BigQuery table.

bigquery cloud-functions pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/owox/sgtm-owox-ga4-bigquery

OWOX BI Streaming is an advanced tracking solution for getting the most from the Google Analytics 4 setup already installed on your website

analytics bigquery ga4

Last synced: 20 Dec 2024

https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks

A collection of R notebooks to analyze data from the Digital Optimization Group Platform

ab-testing bigquery jupyter-notebook performance-analysis r web-analytics

Last synced: 21 Jan 2025

https://github.com/ruru-lyy/nyc-taxi-service-pipeline

In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.

bigquery data-engineering data-modeling etl-pipeline looker mage-ai python

Last synced: 23 Jan 2025

https://github.com/edumoraes1/spam_count_sfmc

SQL query that counts email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 31 Dec 2024

https://github.com/yoshiyukikato/nightharbor-bigquery-reporter

A nightharbor reporter for GCP BigQuery

bigquery lighthouse

Last synced: 23 Jan 2025

https://github.com/spacepatcher/google-workspace-gmail-collector

👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka

bigquery gmail google-workspace security soc

Last synced: 23 Oct 2024

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 17 Nov 2024

https://github.com/alessio-siciliano/bigquery-utils

A utility library that enhances the official BigQuery Python client with additional tools for query management, data processing, and automation, making it easier to work efficiently with Google BigQuery.

bigquery datatransfer google-cloud python

Last synced: 28 Jan 2025

https://github.com/squidmin/bigquery-labs

GCP BigQuery CLI

bigquery gcp java

Last synced: 14 Dec 2024

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 11 Jan 2025

https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query

E-commerce data analysis with SQL in BigQuery

bigquery sql

Last synced: 23 Jan 2025

https://gitlab.com/solidninja/albion

A Scala BigQuery client

bigquery scala

Last synced: 11 Dec 2024

https://github.com/govau/warcraider

Convert WARC files into Avro for big data processing

avro bigquery crawler rust warc

Last synced: 21 Jan 2025

https://github.com/hayashi-yudai/cloudfunc_login

Example of authentication function for login with Cloud Functions and BigQuery

bigquery gcp-cloud-functions golang server

Last synced: 15 Jan 2025

https://github.com/xennis/particulate-matter-sensor-storage

Store the particulate matter data from a luftdaten.info sensor in BigQuery

bigquery cloud-function luftdaten particulate-matter sensor-data

Last synced: 18 Nov 2024

https://github.com/lixx21/airflow-dbt-gcp

A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.

airflow bigquery data-engineer dbt gcp

Last synced: 29 Jan 2025

https://github.com/oguzgn/fully-automated-performance-marketing-dashboard

This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.

bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql

Last synced: 29 Jan 2025

https://github.com/gabrieladados/people-analytics

People Analytics: Insights for Talent Retention

bigquery figma people-analytics sql tableau

Last synced: 29 Jan 2025

https://github.com/rrmcguinness/protoc-gen-bq-schema

A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.

bigquery bigquery-schema protobuf

Last synced: 13 Jan 2025

https://github.com/angulartist/scio-demo

Playing with Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery

Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.

bigquery csv-import google-apps-script google-cloud-storage

Last synced: 13 Jan 2025

https://github.com/brpy/nyc-trips

Data Engineering Zoomcamp journey on NYC trip data with the GCP stack

bigquery dbt gcp pyspark

Last synced: 22 Dec 2024

https://github.com/sintef/bigquery-postgresql-wire-proxy

A PostgreSQL wire protocol proxy server for BigQuery.

bigquery postgresql proxy

Last synced: 12 Jan 2025

https://github.com/yiu31802/gcp-project

GCP AppEngine project of Twitter data and some sample code

appengine bigquery gcp google-bigquery google-cloud google-datastore resas twitter twitter-data twitter4j

Last synced: 07 Dec 2024

https://github.com/rifa8/extract-load-demo

Learning Google Cloud Platform (GCP)

airbyte bigquery bucket gcp

Last synced: 27 Jan 2025

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 21 Jan 2025

https://github.com/dobsontom/basket-abandonment

Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.

adobe-campaign bigquery ga4 gcp sql

Last synced: 21 Nov 2024

https://github.com/niteshchawla/nc-sql-business-case

A Leading Retail chain brand and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver.

bigquery retail sql supermarket

Last synced: 21 Jan 2025

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 21 Jan 2025

https://github.com/hrialan/dataform-prune

An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.

automation bigquery data-analytics dataform

Last synced: 21 Nov 2024

https://github.com/sahilmb/employee-churn-da

A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret, and Colab

bigquery looker-studio pycaret

Last synced: 21 Nov 2024

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 21 Jan 2025

https://github.com/ket0825/v1-gcp-preview

GCP repository for the Preview service; manages GCP source code for preview services

bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub

Last synced: 21 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Nov 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 21 Nov 2024
