Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.

📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article marking the product's 10th anniversary.
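As a rough illustration of the client libraries mentioned above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package is installed and that Application Default Credentials are configured. The helper names `build_top_names_query` and `run_query` are hypothetical, and the public `usa_names` sample table is used only as an example target.

```python
from typing import Any, Dict, List


def build_top_names_query(limit: int = 5) -> str:
    """Build a standard SQL query against a public sample table (illustrative only)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT name, SUM(number) AS total "
        "FROM `bigquery-public-data.usa_names.usa_1910_2013` "
        "GROUP BY name ORDER BY total DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str) -> List[Dict[str, Any]]:
    """Run the SQL through the BigQuery client and return each row as a dict."""
    # Imported lazily so this module still loads where the library is absent.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up Application Default Credentials
    rows = client.query(sql).result()  # blocks until the query job finishes
    return [dict(row.items()) for row in rows]
```

Calling `run_query(build_top_names_query(3))` would submit the query to BigQuery and needs a billing-enabled project; the query builder itself runs offline.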

https://github.com/viant/bqwt

BigQuery Windowed Tables

bigquery etl

Last synced: 07 Dec 2024

https://github.com/nodeart/koatuu-to-bigquery

Load KOATUU from https://data.gov.ua/dataset/dc081fb0-f504-4696-916c-a5b24312ab6e into Google BigQuery in denormalized form

bigquery google-bigquey koatuu

Last synced: 18 Nov 2024

https://github.com/wintermi/bqwrite-test

A command line application designed to provide a method to test the BigQuery Streaming API or BigQuery Storage Write API, allowing you to get a view of the potential throughput available via a given host.

bigquery google-cloud google-cloud-platform

Last synced: 09 Nov 2024

https://github.com/dmytrovoytko/data-engineering-amazon-reviews

Data Engineering project for ZoomCamp '24: JSONL -> PostgreSQL/BigQuery + Metabase + Mage.AI

bash-script bigquery codespaces data-analysis data-visualization etl metabase pipeline python-script

Last synced: 14 Nov 2024

https://github.com/wintermi/fashion-dataform

An example Dataform project to load and transform the publicly available dataset from H&M Group into a format which could be imported into Discovery AI for Retail or Vertex AI Search and Conversation, allowing you to train a retail recommendations model.

bigquery dataform google-cloud google-cloud-platform vertex-ai

Last synced: 09 Nov 2024

https://github.com/wintermi/bq2csv

A command line application designed to provide a simple method to execute a BigQuery SQL script from "stdin", outputting all results to "stdout" in CSV format. A detailed log is output to the console "stderr" providing you with the available execution statistics.

bigquery google-cloud google-cloud-platform

Last synced: 12 Oct 2024

https://github.com/moh-ayman/mongodb-to-bigquery---cloud-func-etl

Google Cloud Function that runs an ETL job to collect MongoDB data and transform it for import into BigQuery.

bigquery etl-pipeline gcp-cloud-functions mongodb pandas-python

Last synced: 15 Nov 2024

https://github.com/42digital/bqtools

Python Tools for BigQuery

bigquery bigquery-schema migrations python

Last synced: 12 Oct 2024

https://github.com/jolares/example-gcp-dataform

Example end-to-end ELT data pipeline using GCP Dataform.

bigquery dataform etl-pipeline

Last synced: 28 Nov 2024

https://github.com/wintermi/bqrunner

A command line application designed to provide a simple method to execute one or more SQL queries against a given dataset in BigQuery. A detailed log is output to the console providing you with the available execution statistics.

bigquery google-cloud google-cloud-platform

Last synced: 24 Oct 2024

https://github.com/indatawetrust/bigquery-api

A quick and easy to use package for Google Cloud BigQuery

bigdata bigquery google-cloud

Last synced: 13 Dec 2024

https://github.com/neo4j-field/bigquery-connector

Bi-directional connectivity between Google BigQuery and Neo4j AuraDS

arrow-flight bigquery neo4j protobuf python spark

Last synced: 23 Dec 2024

https://github.com/gjbae1212/go-bqworker

go-bqworker is an async worker that can bulk insert and update data in BigQuery.

async bigquery bigquery-bulk gcp go golang parallel worker

Last synced: 25 Dec 2024

https://github.com/greenpeace/gpes-bigquery-recipes

Google BigQuery recipes to analyse our data.

bigquery database-management sql

Last synced: 17 Nov 2024

https://github.com/toddbirchard/bigquery-to-sql

📊 ➡️ 💾 Lightweight ETL script to migrate data from BigQuery to SQL.

bigquery etl google-cloud python sql sqlalchemy

Last synced: 16 Jan 2025

https://github.com/stkchan/googleanalytics4-publicdataset-ecommerce-dashboard-powerbi

This dashboard uses Power BI Desktop as a visualization tool by extracting data from Google BigQuery.

analytics bigquery dashboard portfolio portfolio-project powerbi sql

Last synced: 13 Oct 2024

https://github.com/yashika-malhotra/strategic-analysis-of-retail-brand-in-south-america-using-sql

Used BigQuery and MySQL to analyze 100K records for a retail brand in South America, covering sales optimization, trend identification, and customer satisfaction, and to provide insights and recommendations for growing its user base and improving its services.

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 14 Jan 2025

https://github.com/vigneshss-07/google-cloud-professional-data-engineer-acompleteguide

This Repo contains all study, lab and supportive materials for Udemy course on "Google Cloud Professional Data Engineer - A Complete Guide".

big-data bigquery cloud-computing dataengineering elt-pipeline etl-framework gcp-services gcp-storage google-cloud machine-learning

Last synced: 12 Oct 2024

https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl

Google Cloud Function that runs an ETL job to collect Stripe API data and transform it for import into BigQuery.

bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api

Last synced: 15 Jan 2025

https://github.com/jasontanx/gsheet-to-bq-ingestion

Data ingestion from Google Sheet to BigQuery

bigquery data-engineering data-ingestion gsheets

Last synced: 05 Dec 2024

https://github.com/triglav-dataflow/triglav-agent-bigquery

BigQuery agent for Triglav, data-driven workflow tool

bigquery ruby triglav-agent

Last synced: 12 Oct 2024

https://github.com/evry-ace/statsbot

Slack Bot to forward message statistics to BigQuery

bigquery slack slack-bot slackbot

Last synced: 12 Oct 2024

https://github.com/ovotech/bigquery-metrics-exporter

A Golang application to export table level metrics from BigQuery into Datadog.

bigquery company-ovo datadog

Last synced: 10 Dec 2024

https://github.com/jroakes/npath

Exploring path sequences in GA4 BigQuery data

analytics bigquery pathfinding-algorithm

Last synced: 23 Dec 2024

https://github.com/borisgra/fullweb

Full-stack web applications with React, Kotlin/JS, and npm (ag-grid). View and modify any view from any database (PostgreSQL, Google BigQuery).

bigquery compose docker dockerhub fullstack kotlin kotlin-fullstack kotlin-js kotlin-js-react kotlin-jvm kotlin-multiplatform ktor npm-module postgresql

Last synced: 25 Jan 2025

https://github.com/ryanmcdowell/dataflow-bigquery-dynamic-destinations

An example pipeline for dynamically routing events from Pub/Sub to different BigQuery tables based on a message attribute.

apache-beam bigquery google-cloud-dataflow google-cloud-platform

Last synced: 21 Nov 2024

https://github.com/airscholar/dbt-bigquery-crash-course

A deep dive into the powerful combination of DBT and BigQuery, the game-changers in modern data engineering.

bigquery data-engineering dbt google-cloud

Last synced: 14 Nov 2024

https://github.com/dav009/dbtmock

End-to-end unit tests for dbt (data build tool) pipelines

bigquery data-build-tool dbt mock pipelines test testing unittest unittesting

Last synced: 12 Dec 2024

https://github.com/wintermi/tmdb-dataform

An example Dataform project to load and transform the publicly available dataset from The Movie Database into a format which could be imported into Vertex AI Search for Media, allowing you to build a search engine for movies.

bigquery dataform google-cloud google-cloud-platform

Last synced: 21 Nov 2024

https://github.com/t3n/gtmetrix-bq

A script that runs browser tests of specified URLs through GTmetrix and saves the metrics in BigQuery.

bigquery gtmetrix

Last synced: 17 Dec 2024

https://github.com/rezuankassim/bqanalytic

Laravel package for using analytics data imported into BigQuery from Firebase Analytics

bigquery firebase-analytics laravel

Last synced: 18 Dec 2024

https://github.com/brews/bucket2bq

Create an inventory of the objects in a GCS bucket, with metadata, and upload it to BigQuery

bigquery gcp golang google-cloud-storage

Last synced: 20 Oct 2024

https://github.com/pirate-emperor/k2bq

K2BQ is a dataflow pipeline that streams data from Kafka to BigQuery. It uses Google Cloud’s managed Kafka, Dataflow for processing, and BigQuery for real-time analytics, offering scalable, automated data integration for fast insights.

bigquery cloud-computing cloud-infrastructure data-integration data-streaming dataflow google-cloud infrastructure-as-code kafka python realtime-analytics terraform

Last synced: 17 Nov 2024

https://github.com/deepraj1729/gcp-cloud-billing-api

Cloud Billing - Cost Monitoring and Alerting API for Google Cloud (Billing Exports)

bigquery fastapi gcp python redis

Last synced: 11 Jan 2025

https://github.com/poogles/pytest-bq

pytest fixtures for a local bigquery suitable for local development.

bigquery bigquery-emulator pytest

Last synced: 26 Dec 2024

https://github.com/jehiah/socrata_to_bigquery

A tool to copy public data to BigQuery

bigquery opendata socrata

Last synced: 23 Oct 2024

https://github.com/leandronasx/agro-data

Final project for SoulCode Academy's data analyst and dashboard training program.

bigquery data-analysis gcp looker pandas powerbi python

Last synced: 12 Oct 2024

https://github.com/aymane-maghouti/sales-data-pipeline

This ETL (Extract, Transform, Load) project demonstrates the process of extracting data from a SQL Server database, transforming it using Python, orchestrating the data pipeline with Apache Airflow (running in a Docker container), loading the transformed data into Google BigQuery data warehouse, and finally creating a dashboard using Looker Studio.

airflow bigquery etl-pipeline gcp looker-studio orchestrator python sql-server-database

Last synced: 17 Jan 2025

https://github.com/xlfe/pyjdbq

The easiest way to ship journald logs to Google BigQuery

bigquery journald journald-logs logging security

Last synced: 18 Jan 2025

https://github.com/takegue/bigquery-porter

BigQuery Deployment and Metadata Management tool

bigquery

Last synced: 12 Oct 2024

https://github.com/pcorbel/go-bigquery-acl

Simply apply ACL on BigQuery resources

acl bigquery config golang security

Last synced: 19 Jan 2025

https://github.com/akihokurino/recruitment-server-gae

Job listings API server, built as a Go application on Google App Engine (2nd generation). Uses Twirp for the API interface, sops with KMS for a secure environment, Cloud Build for CI/CD, and Algolia as the search engine. Syncs Datastore to BigQuery.

algolia bigquery boom cloud-scheduler cloudbuild datastore firebase-auth gae gcp golang grpc kms realtime-database twirp

Last synced: 05 Dec 2024

https://github.com/leereilly/wee-queries

Query sets for Google Cloud Platform's BigQuery 🔍

bigquery

Last synced: 24 Jan 2025

https://github.com/seftimie/feedyext

Feedy EXT - Brand Sentiment Analysis

bigquery chrome crun gcp genai hackathon llms looker ml ondevice

Last synced: 04 Dec 2024

https://github.com/olajideolagunju/gcp_mage_data_pipeline

An end-to-end data pipeline solution to process and analyze Maintenance Work Orders using Mage, Google BigQuery, Cloud SQL, and Looker Studio. Features a seamless integration of cloud tools for scalable data storage, transformation, and visualization.

automation bigquery cloud cloud-sql compute-engine data data-engineering database database-schema docker-compose excel gcp mage-ai maintenance orchestration python sql virtual-machine visualization-dashboard work-orders

Last synced: 21 Oct 2024

https://github.com/galois1915/google-ml-engineer

This program provides the skills you need to advance your career and provides training to support your preparation for the industry-recognized Google Cloud Professional Machine Learning Engineer certification.

api automl bigquery keras mlops-workflow tensorflow2 vertex-ai

Last synced: 22 Jan 2025

https://github.com/tuancamtbtx/gcp-udfs-example

Google BigQuery JavaScript UDF Examples

bigquery gcp javascript nodejs npm udf

Last synced: 02 Jan 2025

https://github.com/pedrocarmona/big_query_adapter

An ActiveRecord Google BigQuery adapter

activerecord bigquery gem ruby-on-rails

Last synced: 21 Nov 2024

https://github.com/salrashid123/gcp_cloud_status_dataset

BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)

bigquery gcp google-cloud google-cloud-platform

Last synced: 22 Jan 2025

https://github.com/thunchanokbow/inventory-amazon

Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.

azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3

Last synced: 09 Jan 2025

https://github.com/thunchanokbow/audiblebook-revenue

Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand

apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3

Last synced: 09 Jan 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 21 Jan 2025

https://github.com/kellyjadams/bigquery-python-weekly-report

A script to automate a weekly report that runs BigQuery in Python.

bigquery python

Last synced: 22 Jan 2025

https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator

Running BigQuery Query from Airflow using BigQueryExecuteOperator

airflow bigquery sql

Last synced: 19 Dec 2024

https://github.com/ostrokach/uniparc_xml_parser

UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).

bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences

Last synced: 21 Jan 2025

https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq

Dataflow job that subscribes to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.

bigquery dataflow pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/nais/bqrator

Operator for creating BigQuery datasets

bigquery bigquery-operator kubernetes kubernetes-operator nais-features

Last synced: 09 Dec 2024

https://github.com/justinjsd/analytics-engineer-bootcamp

This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.

analytics bigquery dbt engineering sql

Last synced: 05 Nov 2024

https://github.com/metrics-pli/bigquery-export

Exports collected metrics to Google BigQuery

bigquery datastudio lighthouse metrics metrics-pli performance pupeteer

Last synced: 25 Jan 2025

https://github.com/ajaxbarcelonacruyff/gcp_cost

Monitoring Google Cloud costs with Looker Studio.

bigquery googlecloud googlecloudplatform lookerstudio

Last synced: 25 Dec 2024

https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source

Creating GA4 session references in BigQuery.

bigquery ga4 googleanalytics

Last synced: 25 Dec 2024

https://github.com/miguelapp10/etl_operadorlogistico

Extracts data from the SimpliRoute, AndesExpress, and Urbano APIs over a specific date range and processes it for analysis and storage in Google BigQuery

api-client bigquery pandas python

Last synced: 20 Dec 2024

https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart

Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.

bigquery dbt e-commerce quickstarts

Last synced: 17 Jan 2025

https://github.com/elithrar/finding-bugs-with-bigquery

A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.

big-data bigquery bugs github golang open-source

Last synced: 24 Jan 2025

https://github.com/googlecloudplatform/dcm2bq

A service for creating a JSON metadata representation for DICOM from multiple input sources and storing it in Google Cloud BigQuery (BQ).

bigquery dicom gcs googlecloud googlecloudplatform googlecloudstorage json

Last synced: 28 Jan 2025

https://github.com/paulpierre/google-bq-export-downloader

Google BigQuery Export Downloader

big-data bigquery dump export gcs

Last synced: 21 Jan 2025

https://github.com/yu-iskw/terraform-google-copy-bq-datasets

A terraform module to copy BigQuery datasets across regions

bigquery data-engineering google-cloud terraform

Last synced: 21 Dec 2024

https://github.com/m-mizutani/bqs

BigQuery Schema utility in Go

bigquery bigquery-schema go

Last synced: 08 Jan 2025

https://github.com/kyoshidajp/bqcop

Save your BigQuery cost.

bigquery golang

Last synced: 21 Jan 2025

https://github.com/teraearlywine/sample_sql

The following repo contains samples of SQL code that can be referenced by future clients or employers.

bigquery database mysql sql

Last synced: 21 Jan 2025

https://github.com/paulveillard/cybersecurity-analytics

An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.

analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization

Last synced: 07 Dec 2024

https://github.com/romange/puma

BigQuery-like engine for processing structured JSON-like records

bigquery cpp11 engine

Last synced: 23 Jan 2025

https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow

This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.

airflow bigquery dataengineering etl gcp googlecloudplatform

Last synced: 21 Jan 2025

https://github.com/mehmoodulhaq570/bigquery_machine_learning_project

Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.

bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python

Last synced: 22 Dec 2024

https://github.com/icarusso/bigqueryexporter

Export query data from Google BigQuery to a local machine

bigquery csv export python

Last synced: 21 Nov 2024

https://github.com/analyticace/data-engineering-projects

Collection of Open Source Data Engineering Projects

aws big-data bigquery data docker engineering etl oracle-database pipeline sql

Last synced: 22 Dec 2024

https://github.com/ritu456286/smartstockai

SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.

bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api

Last synced: 30 Jan 2025

https://github.com/mchirico/gmail

Inserts Gmail messages into BigQuery, then deletes them.

angular9 bigquery gcp gmail python3

Last synced: 23 Jan 2025
