Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated canonical book on the subject is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another enriching read is the inside story of BigQuery, told by its founding product manager in an article celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-01-27 00:03:06 UTC
https://github.com/hirosassa/bqvalid
SQL linter tool for BigQuery GoogleSQL (formerly known as StandardSQL).
Last synced: 02 Nov 2024
https://github.com/hackersandslackers/bigquery-sqlalchemy-tutorial
📊 ➡️ 💾 ETL script to migrate data from BigQuery to SQL.
bigquery bigquery-sqlalchemy-tutorial databases etl mysql postgres python sql sqlalchemy tutorial
Last synced: 09 Nov 2024
https://github.com/zkan/dtc-data-engineering-zoomcamp-project
DataTalks.Club's Data Engineering Zoomcamp Project
apache-airflow bigquery data-engineering data-studio dbt docker kafka minio python stomp terraform
Last synced: 19 Dec 2024
https://github.com/digitalghost-dev/stock-data-pipeline
Visualizing S&P 500 data on a webpage with Python.
bigquery google-cloud-platform python
Last synced: 06 Nov 2024
https://github.com/naseemkullah/gcp-accountant
A tool to identify high cost resources in GCP at a granular level
bigquery cost cost-engineering cost-resources gcp gcp-accountant
Last synced: 09 Nov 2024
https://github.com/shinichi-takii/vscode-language-sql-bigquery
Syntax highlighting and code snippets for BigQuery SQL in Visual Studio Code
bigquery grammar snippets sql syntax-highlighting vscode vscode-extension
Last synced: 31 Oct 2024
https://github.com/ottogroup/bquest
Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.
bigquery google-big-query google-cloud integration testing
Last synced: 18 Nov 2024
https://github.com/medjed/embulk-input-bigquery
BigQuery input plugin for Embulk that loads records from BigQuery
Last synced: 12 Oct 2024
https://github.com/prologuetech/laravel-big
Google BigQuery for Laravel
bigquery google google-bigquery laravel laravel-5-package php
Last synced: 25 Nov 2024
https://github.com/unytics/catalog_builder
Data Catalogs Made Easy
bigquery data-catalog data-discovery databricks dbt redshift snowflake
Last synced: 07 Nov 2024
https://github.com/yoheimuta/dbq
CLI tool to easily decorate BigQuery table names
bigquery bq cli golang table-decorator
Last synced: 13 Oct 2024
https://github.com/mesmacosta/bq-fake-pii-table-creator
Library for creating BigQuery tables with fake PII data
bigquery fake-data faker governance-dapps metadata piidata piii
Last synced: 11 Nov 2024
https://github.com/modataconsulting/dbt_ga4_project
This project uses Google Analytics 4 BigQuery Exports as its source data, and offers useful base transformations to provide report-ready dimension & fact models that can be used for reporting purposes, blending with other data, and/or feature engineering for ML models.
bigquery bq data-build-tool dbt ga4 google-analytics-4 sql
Last synced: 12 Oct 2024
https://github.com/googlecloudplatform/datacatalog-tag-history
Historical metadata of your data warehouse is a treasure trove to discover not just insights about changing data patterns, but also quality and user behaviour. This solution creates Data Catalog Tags history in BigQuery since Data Catalog keeps only the latest version of metadata for fast searchability.
analytics bigquery data-catalog data-governance metadata-management
Last synced: 22 Jan 2025
https://github.com/googlecloudplatform/cloud-composer-mssql-dataflow-bigquery
This repository contains an example of how to leverage Cloud Composer and Cloud Dataflow to move data from a Microsoft SQL Server to BigQuery. The diagrams below demonstrate the workflow pipeline.
airflow bigquery cloud-composer dataflow microsoft-sql-server
Last synced: 07 Oct 2024
https://github.com/livebook-dev/req_bigquery
Conveniences for querying Google BigQuery with Req
Last synced: 11 Nov 2024
https://github.com/data-tools/big-data-types
A library to transform Scala product types and Schemes from different systems into other Schemes. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. E.g: a Spark Schema can be transformed into a BigQuery table.
apache-spark bigquery bigquery-tables cassandra circe database-types scala schemas spark typeclass typeclass-derivation typesafe
Last synced: 12 Oct 2024
https://github.com/oliveroneill/bigqueryswift
BigQuery client for Swift
bigquery google-cloud-platform swift
Last synced: 11 Oct 2024
https://github.com/sungchun12/schedule-python-script-using-google-cloud
🕓 Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job
appengine-python bigquery chicago-traffic cron google-cloud python-script
Last synced: 28 Oct 2024
https://github.com/badal-io/gcp-airflow-foundations
Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse
airflow apache-airflow bigquery dags data-engineering data-pipeline etl-pipeline
Last synced: 29 Oct 2024
https://github.com/victorcouste/google-cloudfunctions-dataprep
Google Cloud Functions examples for Google Cloud Dataprep
api api-rest bigquery cloud-functions cloudfunctions-dataprep dataprep dataprep-job google-bigquery google-cloud-dataprep google-sheet trifacta
Last synced: 27 Nov 2024
https://github.com/toddbirchard/ghost-webhook-api
📑 🎛️ API to automate optimizations for self-hosted blogging platforms.
api automation bigquery blogging ghost github-api google-cloud-storage python webhook-api
Last synced: 16 Nov 2024
https://github.com/google-marketing-solutions/cwv_from_ga4_exports
Simple solution to make reporting on CWVs from BQ simpler to set up.
analytics bigquery google google-cloud-platform
Last synced: 05 Dec 2024
https://github.com/googlecloudplatform/google-cloud-abap
ABAP SDK for Google Cloud and BigQuery Connector for SAP enable customers to easily consume Google Products and Services natively from their SAP Landscape.
abap abap-development abapsdk abapsdkforgcp bigquery google-cloud-platform google-generative-ai google-maps-api vertex-ai
Last synced: 07 Oct 2024
https://github.com/mondo-mob/mondokit
Simplify building NodeJS applications on cloud platforms
bigquery cloud cloud-storage cloud-tasks database-migrations datastore datastore-backups express firebase-auth firestore firestore-database firestore-database-backup firestore-migrations gcp gcs google-cloud google-cloud-platform google-cloud-tasks
Last synced: 12 Oct 2024
https://github.com/vickyjkwan/sqlanalyzer
A SQL parser and analyzer for SQL flavors including MySQL, PostgreSQL, BigQuery Standard SQL, Presto SQL and Hive SQL.
athena bigquery hiveql metastore presto sqlparser standardsql
Last synced: 12 Oct 2024
https://github.com/splitmedialabslimited/supermigration
A CLI tool to perform migrations on BigQuery tables
bigquery bigquery-schema gcp node nodejs
Last synced: 06 Dec 2024
https://github.com/noahgift/pragmaticai-gcp
Pragmatic AI solutions on GCP
bigquery colaboratory gcp google jupyter-notebook pragai pragmaticai-gcp python sheets
Last synced: 11 Oct 2024
https://github.com/manuelguerra1987/data-engineering-zoomcamp-notes
Notes and material from 2025 Data Engineering Zoomcamp by Datatalks.Club
airflow bigquery data-engineering docker kubernetes
Last synced: 16 Jan 2025
https://github.com/kesin11/ts-junit2json
Convert JUnit XML format to JSON with TypeScript
Last synced: 10 Nov 2024
https://github.com/jashparekh/bigquery-action
This GitHub Action can be used to deploy table/view schemas to BigQuery.
actions bigquery gbq github-actions google google-bigquery google-cloud-platform hacktoberfest
Last synced: 23 Oct 2024
https://github.com/pcorbel/metaquery
An API to analyze BigQuery metadata
bigquery golang gorm vue-router vuejs vuetifyjs vuex
Last synced: 10 Nov 2024
https://github.com/hackersandslackers/bigquery-python-tutorial
📊 🐍 Create tables in Google BigQuery, auto-generate their schemas, and retrieve said schemas.
bigquery data-warehouse gcs google-bigquery google-cloud google-cloud-sdk google-cloud-storage python tutorial
Last synced: 09 Nov 2024
https://github.com/wayfair-incubator/bigquery-buildkite-plugin
Buildkite Plugin to create/update structures on BigQuery
bigquery buildkite buildkite-plugin gbq google google-bigquery google-cloud-platform hacktoberfest
Last synced: 07 Nov 2024
https://github.com/stape-io/request-to-gcs-function
Google Cloud Function that saves everything received in a request to Google Cloud Storage
bigquery gtm gtm-server-side stape
Last synced: 11 Dec 2024
https://github.com/tufin/espresso
A framework for writing testable BigQuery queries
Last synced: 23 Jan 2025
https://github.com/sweetpand/py_scripts_bots
Moderate bots for re-crawling social media.
bigquery bot bots crawling instagram-bot practice-programming python regex scrapy scripts social-networks tweetbot whatsapp-bot youtube-bot
Last synced: 11 Nov 2024
https://github.com/urish/nn-function-generator
Experimenting with automatic generation of TS function bodies using ANN models
bigquery tensorflow tsquery typescript
Last synced: 12 Nov 2024
https://github.com/minodisk/zoq
Convert Zod to BigQuery Schema
bigquery bigquery-schema bigquery-schema-converter zod
Last synced: 19 Oct 2024
https://github.com/janaom/gcp-de-project-streaming-pubsub-beam-dataflow
This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage, streaming, processing, and analysis at scale.
apache-beam bigquery dataflow de-project gcp pubsub streaming-data
Last synced: 20 Jan 2025
https://github.com/janaom/gcp-de-project-uber-etl-pipeline
Technologies used: GCS, Compute Engine, Mage, BigQuery, Looker, Python
Last synced: 20 Jan 2025
https://github.com/ksalama/data2cooc2emb2ann
Learning embeddings from item co-occurrence statistics, and building an approx. nearest neighbour index
apache-beam bigquery dataflow embeddings machine-learning python3 tensorflow
Last synced: 04 Dec 2024
https://github.com/snithish/tpc-di_benchmark
Benchmark for Airflow with BigQuery as the Data Warehouse using TPC-DI
airflow benchmark bigquery tpc-di
Last synced: 17 Nov 2024
https://github.com/k1low/setup-tbls
GitHub Action for tbls
bigquery continuous-integration database-document database-schema documentation-tool dynamodb er-diagram excel mariadb markdown mermaid mysql plantuml postgresql redshift snowflake spanner sqlite sqlserver
Last synced: 17 Oct 2024
https://github.com/nodefluent/purpur
💠 kafka-connectors as a service | ETL 💜
bigquery connectors etl gcloud kafka kafka-connect mysql nodejs redis saas
Last synced: 23 Jan 2025
https://github.com/tomayac/http-archive-progressive-web-apps
Different approaches to estimate the number of Progressive Web Apps in the HTTP Archive
Last synced: 16 Oct 2024
https://github.com/kestra-io/plugin-gcp
bigquery firestore gcp google-cloud google-cloud-platform google-cloud-storage kestra plugin vertex-ai
Last synced: 01 Nov 2024
https://github.com/edgarrmondragon/meltano-dogfood
Personal dogfood Meltano project
bigquery dbt dogfood elt evidence-dev meltano
Last synced: 15 Oct 2024
https://github.com/dp6/raft-suite-hub
The Hub is the solution responsible for centralizing data consolidation in BigQuery, the tool chosen to serve as the data warehouse for raft-suite.
bigquery data data-quality google-cloud google-cloud-functions hacktoberfest
Last synced: 04 Dec 2024
https://github.com/mchmarny/pubsub-to-bigquery-pump
Simple utility combining Cloud Run and Stackdriver metrics to drain JSON messages from PubSub topic into BigQuery table
bigquery cloudrun events golang metrics pubsub stackdriver
Last synced: 18 Oct 2024
https://github.com/memsjava/bigquery-helper
A helper package for Google BigQuery operations
bigquery google pandas-dataframe
Last synced: 14 Oct 2024
https://github.com/christippett/bigquery-geo-router
Calculate routes from long/lat coordinates in BigQuery using OpenStreetMap/OSRM
bigquery geospatial google openstreetmap osrm
Last synced: 19 Nov 2024
https://github.com/wintermi/imdb-dataform
An example Dataform project to load and transform the publicly available dataset from IMDB.
bigquery dataform google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/armanbilge/gcp4s
Cross-platform JVM/JS Google Cloud Platform integrations for fs2 and friends
Last synced: 12 Oct 2024
https://github.com/snithish/tpc-ds_big-query
Scripts to execute TPC-DS on BigQuery
benchmark bigquery tpc-ds-benchmark tpc-ds-queries
Last synced: 17 Nov 2024
https://github.com/rajaprerak/twitteranalysis
Twitter sentiment analysis of trending movies and songs.
bigquery bootstrap css dataflow datastudio gae gcp google-app-engine google-cloud-platform html pubsub python sentiment-analysis spotipy tmdb-api tweepy twitter twitter-sentiment-analysis
Last synced: 06 Nov 2024
https://github.com/gr8distance/blanton
BigQuery API wrapped by Elixir
bigquery bigquery-schema elixir
Last synced: 29 Oct 2024
https://github.com/tamanobi/bq-query-unittest
A tool for testing BigQuery query logic while minimizing the amount of data scanned
Last synced: 19 Nov 2024
https://github.com/tobked/fetch-apache-ga-stats
Repository to make "snapshots" of GitHub Action queue for later analysis
bigquery gcp github github-actions
Last synced: 15 Oct 2024
https://github.com/sigpwned/litecene
A simple cross-data store full-text search language for Java 8+
bigquery full-text-search java query-language search
Last synced: 14 Oct 2024
https://github.com/aliasoblomov/bigquery-ga4-queries
List of all queries for Google Analytics 4 data export in BigQuery
Last synced: 13 Dec 2024
https://github.com/wayfair-incubator/gbq
Python wrapper for interacting with Google BigQuery.
bigquery gbq google google-bigquery google-cloud-platform hacktoberfest python
Last synced: 12 Oct 2024
https://github.com/corneliusweig/krew-index-tracker
Saves download statistics of `krew.dev` plugins to BigQuery
bigquery history krew krew-index statistics
Last synced: 18 Oct 2024
https://github.com/k1low/tbls-meta
tbls-meta is an external subcommand of tbls for applying metadata managed by tbls to the datasource.
bigquery data-catalog-management
Last synced: 12 Oct 2024
https://github.com/kellyjadams/run-sql-in-python
Scripts to connect Python to BigQuery or a PostgreSQL database.
Last synced: 13 Oct 2024
https://github.com/yassun7010/turu-py
Simple Database API for Typed Python
async bigquery pep249 postgres postgresql postgresql-database python snowflake snowflakedb sqlite sqlite-database sqlite3 sqlite3-database typed-python
Last synced: 11 Oct 2024
https://github.com/fairscript/interact
A database interaction library for node.js/JavaScript/TypeScript that uses code reflection to maximize type safety and minimize friction. Supports PostgreSQL, Google BigQuery and SQLite.
bigquery data database linq orm postgresql reflection sql sqlite typesafe
Last synced: 12 Dec 2024
https://github.com/dataform-co/bigquery-ml-pipeline
An example of a machine learning pipeline on BigQuery ML using Dataform
bigquery bigquery-ml dataform machine-learning-pip sql
Last synced: 12 Jan 2025
https://github.com/doitintl/terraform-bq-scheduled-queries
This is a demo project to use Terraform to manage BigQuery scheduled queries with Cloud Build CI/CD
bigquery cicd cloudbuild terraform
Last synced: 12 Nov 2024
https://github.com/shnewto/bqjson
bqjson - Serialize/Deserialize BigQuery TableResults to/from JSON
bigquery java json maven serde serde-json serialization serializer tableresult testing tests
Last synced: 27 Oct 2024
https://github.com/cata-network/cadence-docs
Cadence documentation, Chinese version
Last synced: 07 Nov 2024
https://github.com/webocs/mining-github-microservices
GitHub mining replication package for the article "Microservices in the Wild: the Github Landscape". It's a short Node program that takes a prefiltered set of GitHub repositories (filtered with Google BigQuery) and uses the GitHub API to find the ones that have X number of stars
bigdata bigquery microservices node
Last synced: 16 Jan 2025
https://github.com/hsbc/bqtools
The code repo for bqtools-json, a package for managing BigQuery using JSON exemplar data structures, and home of the bqsync utility.
Last synced: 23 Nov 2024
https://github.com/ergut/mcp-bigquery-server
A Model Context Protocol (MCP) server that provides secure, read-only access to BigQuery datasets. Enables Large Language Models (LLMs) to safely query and analyze data through a standardized interface.
bigquery google-cloud mcp mcp-servers model-context-protocol sql
Last synced: 13 Dec 2024
https://github.com/pierrec1024/airflow-provider-bigquery-reservation
Airflow provider for BigQuery reservation operators.
Last synced: 12 Oct 2024
https://github.com/trocco-io/embulk-output-bigquery_java
A faster, Java-flavored Embulk output plugin to load/insert data into Google BigQuery
Last synced: 12 Nov 2024
https://github.com/jordicenzano/brighcove-live-ssai-ccu
POC (proof of concept) to show a possible way to calculate the real time CCU (concurrent viewers) for any Brightcove live stream with SSAI
analytics bigquery brightcove gcp-ap gcp-appengine-flex gcp-cloud-functions hls live streaming video
Last synced: 06 Jan 2025
https://github.com/wintermi/movielens-dataform
An example Dataform project which will use the publicly available Movielens dataset to demonstrate how to upload your product catalog and user events into either the Google Cloud Retail API or Google Cloud Discovery Engine and train a personalised product recommendation model.
bigquery dataform google-cloud google-cloud-platform vertex-ai
Last synced: 09 Nov 2024
https://github.com/blockchain-etl/iotex-etl
ETL (extract, transform and load) tools for ingesting IoTeX blockchain data to Google BigQuery and Pub/Sub
bigquery blockchain-data iotex sql
Last synced: 21 Jan 2025
https://github.com/terashim/dataform-google-analytics-4-example
A transformation pipeline for Google Analytics 4 export data, built with Dataform
bigquery dataform google-analytics
Last synced: 30 Nov 2024
https://github.com/trk54ylmz/spark-bigquery
Google BigQuery support for Spark SQL
Last synced: 18 Nov 2024
https://github.com/fivetran/zetasql-npm-examples
This repo contains examples of using the zetasql-npm library
Last synced: 12 Dec 2024
https://github.com/adam-cowley/neo4j-bigquery
Yo dawg, I heard you like queries so we put some BigQuery in your query so you can query BigQuery from your query
bigquery cypher neo4j neo4j-procedures
Last synced: 18 Dec 2024
https://github.com/phstudy/postgresql-zetasketch
ZetaSketch HLL++ functions for PostgreSQL
bigquery hll java postgresql postgresql-extension zetasketch
Last synced: 21 Jan 2025
https://github.com/bzzt/alchemy_table
Opinionated framework for working with Bigtable and BigQuery
bigquery bigtable database elixir gcp googlecloud googlecloudplatform
Last synced: 19 Oct 2024
https://github.com/sukanyabag/gcp-ai-notebooks
This repository contains all practice notebooks with which I performed hands-on labs in Google Cloud Training Program's "Cloud ML-AI Track"
bigquery cloudml-samples data-science dataprep tensorflow-tutorials
Last synced: 21 Dec 2024
https://github.com/tirendazacademy/hands-on-data-science-with-gcp
Google BigQuery Tutorial
big-data big-data-analytics bigdata bigquery bigquery-ml bigqueryml cloud-computing data-analysis data-analytics data-engineering data-science dataanalysis dataengineering google-bigquery google-cloud-platform machienlearning machine-learning
Last synced: 01 Jan 2025
https://github.com/badal-io/dataflow-timeseries-iot-gas-demo
Dataflow code for integration with GCP Core IoT and FogLamp
Last synced: 11 Nov 2024
https://github.com/wintermi/bqe-dataform
A Dataform project which aggregates BigQuery system metadata for the purpose of analysing the slot usage and storage within an organization by project.
bigquery dataform google-cloud google-cloud-platform
Last synced: 09 Nov 2024