Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, which lets BigQuery read data from external sources.
📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating the service’s 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2024-11-12 00:03:12 UTC
https://github.com/data-tools/big-data-types
A library to transform Scala product types and schemas from different systems into other schemas. Any implemented type automatically gets methods to convert it into the rest of the types and vice versa. E.g., a Spark schema can be transformed into a BigQuery table.
apache-spark bigquery bigquery-tables cassandra circe database-types scala schemas spark typeclass typeclass-derivation typesafe
Last synced: 12 Oct 2024
https://github.com/sungchun12/schedule-python-script-using-google-cloud
:clock4: Schedules a Python script to append data into BigQuery using Google Cloud's App Engine with a cron job
appengine-python bigquery chicago-traffic cron google-cloud python-script
Last synced: 28 Oct 2024
https://github.com/badal-io/gcp-airflow-foundations
Opinionated framework based on Airflow 2.0 for building pipelines to ingest data into a BigQuery data warehouse
airflow apache-airflow bigquery dags data-engineering data-pipeline etl-pipeline
Last synced: 29 Oct 2024
https://github.com/victorcouste/google-cloudfunctions-dataprep
Google Cloud Functions examples for Google Cloud Dataprep
api api-rest bigquery cloud-functions cloudfunctions-dataprep dataprep dataprep-job google-bigquery google-cloud-dataprep google-sheet trifacta
Last synced: 08 Aug 2024
https://github.com/mondo-mob/mondokit
Simplify building NodeJS applications on cloud platforms
bigquery cloud cloud-storage cloud-tasks database-migrations datastore datastore-backups express firebase-auth firestore firestore-database firestore-database-backup firestore-migrations gcp gcs google-cloud google-cloud-platform google-cloud-tasks
Last synced: 12 Oct 2024
https://github.com/vickyjkwan/sqlanalyzer
A SQL parser and analyzer for sql flavors including MySQL, PostgreSQL, BigQuery Standard SQL, Presto SQL and Hive SQL.
athena bigquery hiveql metastore presto sqlparser standardsql
Last synced: 12 Oct 2024
https://github.com/googlecloudplatform/google-cloud-abap
ABAP SDK for Google Cloud and BigQuery Connector for SAP enable customers to easily consume Google Products and Services natively from their SAP Landscape.
abap abap-development abapsdk abapsdkforgcp bigquery google-cloud-platform google-generative-ai google-maps-api vertex-ai
Last synced: 07 Oct 2024
https://github.com/noahgift/pragmaticai-gcp
Pragmatic AI solutions on GCP
bigquery colaboratory gcp google jupyter-notebook pragai pragmaticai-gcp python sheets
Last synced: 11 Oct 2024
https://github.com/kesin11/ts-junit2json
Convert JUnit XML format to JSON with TypeScript
Last synced: 10 Nov 2024
https://github.com/jashparekh/bigquery-action
This GitHub Action can be used to deploy table and view schemas to BigQuery.
actions bigquery gbq github-actions google google-bigquery google-cloud-platform hacktoberfest
Last synced: 23 Oct 2024
https://github.com/pcorbel/metaquery
An API to analyze BigQuery metadata
bigquery golang gorm vue-router vuejs vuetifyjs vuex
Last synced: 10 Nov 2024
https://github.com/hackersandslackers/bigquery-python-tutorial
:bar_chart: :snake: Create tables in Google BigQuery, auto-generate their schemas, and retrieve said schemas.
bigquery data-warehouse gcs google-bigquery google-cloud google-cloud-sdk google-cloud-storage python tutorial
Last synced: 09 Nov 2024
https://github.com/minodisk/zoq
Convert Zod to BigQuery Schema
bigquery bigquery-schema bigquery-schema-converter zod
Last synced: 19 Oct 2024
https://github.com/tufin/espresso
A framework for writing testable BigQuery queries
Last synced: 29 Sep 2024
https://github.com/k1low/setup-tbls
GitHub Action for tbls
bigquery continuous-integration database-document database-schema documentation-tool dynamodb er-diagram excel mariadb markdown mermaid mysql plantuml postgresql redshift snowflake spanner sqlite sqlserver
Last synced: 17 Oct 2024
https://github.com/urish/nn-function-generator
Experimenting with automatic generation of TS function bodies using ANN models
bigquery tensorflow tsquery typescript
Last synced: 12 Nov 2024
https://github.com/sweetpand/py_scripts_bots
Moderate bots for re-crawling content from social media.
bigquery bot bots crawling instagram-bot practice-programming python regex scrapy scripts social-networks tweetbot whatsapp-bot youtube-bot
Last synced: 11 Nov 2024
https://github.com/wayfair-incubator/bigquery-buildkite-plugin
Buildkite Plugin to create/update structures on BigQuery
bigquery buildkite buildkite-plugin gbq google google-bigquery google-cloud-platform hacktoberfest
Last synced: 07 Nov 2024
https://github.com/mchmarny/pubsub-to-bigquery-pump
Simple utility combining Cloud Run and Stackdriver metrics to drain JSON messages from PubSub topic into BigQuery table
bigquery cloudrun events golang metrics pubsub stackdriver
Last synced: 18 Oct 2024
https://github.com/armanbilge/gcp4s
Cross-platform JVM/JS Google Cloud Platform integrations for fs2 and friends
Last synced: 12 Oct 2024
https://github.com/tomayac/http-archive-progressive-web-apps
Different approaches to estimate the number of Progressive Web Apps in the HTTP Archive
Last synced: 16 Oct 2024
https://github.com/wintermi/imdb-dataform
An example Dataform project to load and transform the publicly available dataset from IMDB.
bigquery dataform google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/nodefluent/purpur
:diamond_shape_with_a_dot_inside: kafka-connectors as a service | ETL :purple_heart:
bigquery connectors etl gcloud kafka kafka-connect mysql nodejs redis saas
Last synced: 29 Sep 2024
https://github.com/kestra-io/plugin-gcp
bigquery firestore gcp google-cloud google-cloud-platform google-cloud-storage kestra plugin vertex-ai
Last synced: 01 Nov 2024
https://github.com/edgarrmondragon/meltano-dogfood
Personal dogfood Meltano project
bigquery dbt dogfood elt evidence-dev meltano
Last synced: 15 Oct 2024
https://github.com/memsjava/bigquery-helper
A helper package for Google BigQuery operations
bigquery google pandas-dataframe
Last synced: 14 Oct 2024
https://github.com/tobked/fetch-apache-ga-stats
Repository to make "snapshots" of GitHub Action queue for later analysis
bigquery gcp github github-actions
Last synced: 15 Oct 2024
https://github.com/wayfair-incubator/gbq
Python wrapper for interacting with Google BigQuery.
bigquery gbq google google-bigquery google-cloud-platform hacktoberfest python
Last synced: 12 Oct 2024
https://github.com/k1low/tbls-meta
tbls-meta is an external subcommand of tbls for applying metadata managed by tbls to the datasource.
bigquery data-catalog-management
Last synced: 12 Oct 2024
https://github.com/corneliusweig/krew-index-tracker
Saves download statistics of `krew.dev` plugins to BigQuery
bigquery history krew krew-index statistics
Last synced: 18 Oct 2024
https://github.com/kellyjadams/run-sql-in-python
Scripts to connect Python to BigQuery or a PostgreSQL database.
Last synced: 13 Oct 2024
https://github.com/gr8distance/blanton
BigQuery API wrapped by Elixir
bigquery bigquery-schema elixir
Last synced: 29 Oct 2024
https://github.com/yassun7010/turu-py
Simple Database API for Typed Python
async bigquery pep249 postgres postgresql postgresql-database python snowflake snowflakedb sqlite sqlite-database sqlite3 sqlite3-database typed-python
Last synced: 11 Oct 2024
https://github.com/sigpwned/litecene
A simple cross-data store full-text search language for Java 8+
bigquery full-text-search java query-language search
Last synced: 14 Oct 2024
https://github.com/rajaprerak/twitteranalysis
Twitter sentiment analysis of trending movies and songs.
bigquery bootstrap css dataflow datastudio gae gcp google-app-engine google-cloud-platform html pubsub python sentiment-analysis spotipy tmdb-api tweepy twitter twitter-sentiment-analysis
Last synced: 06 Nov 2024
https://github.com/doitintl/terraform-bq-scheduled-queries
This is a demo project to use Terraform to manage BigQuery scheduled queries with Cloud Build CI/CD
bigquery cicd cloudbuild terraform
Last synced: 12 Nov 2024
https://github.com/wintermi/movielens-dataform
An example Dataform project which will use the publicly available Movielens dataset to demonstrate how to upload your product catalog and user events into either the Google Cloud Retail API or Google Cloud Discovery Engine and train a personalised product recommendation model.
bigquery dataform google-cloud google-cloud-platform vertex-ai
Last synced: 09 Nov 2024
https://github.com/shnewto/bqjson
bqjson - Serialize/Deserialize BigQuery TableResults to/from JSON
bigquery java json maven serde serde-json serialization serializer tableresult testing tests
Last synced: 27 Oct 2024
https://github.com/trocco-io/embulk-output-bigquery_java
Java flavor faster Embulk output plugin to load/insert data into Google BigQuery
Last synced: 12 Nov 2024
https://github.com/cata-network/cadence-docs
cadence document, Chinese version
Last synced: 07 Nov 2024
https://github.com/pierrec1024/airflow-provider-bigquery-reservation
Airflow provider for bigquery reservation operators.
Last synced: 12 Oct 2024
https://github.com/rittmananalytics/ra_dbt_to_dataform
An open-source tool that partially automates the migration of dbt packages to Dataform
bigquery dataform dbt dbt-core migration-tool
Last synced: 13 Oct 2024
https://github.com/wintermi/bqe-dataform
A Dataform project which aggregates BigQuery system metadata for the purpose of analysing the slot usage and storage within an organization by project.
bigquery dataform google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/badal-io/dataflow-timeseries-iot-gas-demo
Dataflow code for integration with GCP Core IoT and FogLamp
Last synced: 11 Nov 2024
https://github.com/bzzt/alchemy_table
Opinionated framework for working with Bigtable and BigQuery
bigquery bigtable database elixir gcp googlecloud googlecloudplatform
Last synced: 19 Oct 2024
https://github.com/lin-jun-xiang/pyga4
📊Python Google Analytics 4 (GA4) Data Extraction and Analysis Toolkit
bigquery free ga ga4 google-analytics google-analytics-python-api python
Last synced: 27 Oct 2024
https://github.com/sukanyabag/gcp-ai-notebooks
This repository contains all practice notebooks with which I performed hands-on labs in Google Cloud Training Program's "Cloud ML-AI Track"
bigquery cloudml-samples data-science dataprep tensorflow-tutorials
Last synced: 03 Nov 2024
https://github.com/adam-cowley/neo4j-bigquery
Yo dawg, I heard you like queries so we put some BigQuery in your query so you can query BigQuery from your query
bigquery cypher neo4j neo4j-procedures
Last synced: 30 Oct 2024
https://github.com/pualien/py-gcloud-connectors
Utilities to simplify connection with Google APIs
bigquery data-analysis google-analytics google-analytics-4 google-cloud google-cloud-platform google-cloud-storage pandas python
Last synced: 12 Oct 2024
https://github.com/jolares/example-gcp-dataform
Example end-to-end ELT data pipeline using GCP Dataform.
bigquery dataform etl-pipeline
Last synced: 14 Oct 2024
https://github.com/wintermi/bqrunner
A command line application designed to provide a simple method to execute one or more SQL queries against a given dataset in BigQuery. A detailed log is output to the console providing you with the available execution statistics.
bigquery google-cloud google-cloud-platform
Last synced: 24 Oct 2024
https://github.com/wintermi/bq2csv
A command line application designed to provide a simple method to execute a BigQuery SQL script from "stdin", outputting all results to "stdout" in CSV format. A detailed log is output to the console "stderr" providing you with the available execution statistics.
bigquery google-cloud google-cloud-platform
Last synced: 12 Oct 2024
https://github.com/greenpeace/gpes-bigquery-recipes
Google Big Query recipes to Analyse our data.
bigquery database-management sql
Last synced: 03 Aug 2024
https://github.com/gjbae1212/go-bqworker
go-esworker is an async worker that can bulk insert and update data in BigQuery.
async bigquery bigquery-bulk gcp go golang parallel worker
Last synced: 06 Nov 2024
https://github.com/vigneshss-07/google-cloud-professional-data-engineer-acompleteguide
This Repo contains all study, lab and supportive materials for Udemy course on "Google Cloud Professional Data Engineer - A Complete Guide".
big-data bigquery cloud-computing dataengineering elt-pipeline etl-framework gcp-services gcp-storage google-cloud machine-learning
Last synced: 12 Oct 2024
https://github.com/wintermi/bqwrite-test
A command line application designed to provide a method to test the BigQuery Streaming API or BigQuery Storage Write API, allowing you to get a view of the potential throughput available via a given host.
bigquery google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/wintermi/fashion-dataform
An example Dataform project to load and transform the publicly available dataset from H&M Group into a format which could be imported into Discovery AI for Retail or Vertex AI Search and Conversation, allowing you to train a retail recommendations model.
bigquery dataform google-cloud google-cloud-platform vertex-ai
Last synced: 09 Nov 2024
https://github.com/sungchun12/image-labeling-and-translation-data-analysis-google-cloud
:label: Invokes cloud vision and translation APIs with Python
bigquery cloud-translation-api cloud-vision-api colaboratory google-cloud python sql
Last synced: 11 Nov 2024
https://github.com/stkchan/googleanalytics4-publicdataset-ecommerce-dashboard-powerbi
This dashboard uses Power BI Desktop as a visualization tool by extracting data from Google BigQuery.
analytics bigquery dashboard portfolio portfolio-project powerbi sql
Last synced: 13 Oct 2024
https://github.com/42digital/bqtools
Python Tools for BigQuery
bigquery bigquery-schema migrations python
Last synced: 12 Oct 2024
https://github.com/triglav-dataflow/triglav-agent-bigquery
BigQuery agent for Triglav, data-driven workflow tool
Last synced: 12 Oct 2024
https://github.com/takegue/bigquery-porter
BigQuery Deployment and Metadata Management tool
Last synced: 12 Oct 2024
https://github.com/wintermi/tmdb-dataform
An example Dataform project to load and transform the publicly available dataset from The Movie Database into a format which could be imported into Vertex AI Search for Media, allowing you to build a search engine for movies.
bigquery dataform google-cloud google-cloud-platform
Last synced: 12 Oct 2024
https://github.com/leandronasx/agro-data
Final project for the SoulCode Academy data analyst and dashboard training program.
bigquery data-analysis gcp looker pandas powerbi python
Last synced: 12 Oct 2024
https://github.com/jehiah/socrata_to_bigquery
A tool to copy public data to BigQuery
Last synced: 23 Oct 2024
https://github.com/brews/bucket2bq
Create an inventory of objects in a GCS bucket with metadata and upload it to BigQuery
bigquery gcp golang google-cloud-storage
Last synced: 20 Oct 2024
https://github.com/phstudy/postgresql-zetasketch
ZetaSketch HLL++ functions for PostgreSQL
bigquery hll java postgresql postgresql-extension zetasketch
Last synced: 12 Oct 2024
https://github.com/evry-ace/statsbot
Slack Bot to forward message statistics to BigQuery
bigquery slack slack-bot slackbot
Last synced: 12 Oct 2024
https://github.com/rezuankassim/bqanalytic
Laravel package for using analytics data imported into BigQuery from Firebase Analytics
bigquery firebase-analytics laravel
Last synced: 12 Oct 2024
https://github.com/dav009/dbtmock
End-to-end unit tests for dbt (data build tool) pipelines
bigquery data-build-tool dbt mock pipelines test testing unittest unittesting
Last synced: 24 Oct 2024
https://github.com/leereilly/wee-queries
Query sets for Google Cloud Platform's BigQuery :mag:
Last synced: 13 Oct 2024
https://github.com/vertexclique/olayufku
Schema Registry for BigQuery
bigquery bigquery-schema migration schema-migrations schema-registry
Last synced: 14 Oct 2024
https://github.com/olajideolagunju/gcp_mage_data_pipeline
An end-to-end data pipeline solution to process and analyze Maintenance Work Orders using Mage, Google BigQuery, Cloud SQL, and Looker Studio. Features a seamless integration of cloud tools for scalable data storage, transformation, and visualization.
automation bigquery cloud cloud-sql compute-engine data data-engineering database database-schema docker-compose excel gcp mage-ai maintenance orchestration python sql virtual-machine visualization-dashboard work-orders
Last synced: 21 Oct 2024
https://github.com/airscholar/dbt-bigquery-crash-course
A deep dive into the powerful combination of DBT and BigQuery, the game-changers in modern data engineering.
bigquery data-engineering dbt google-cloud
Last synced: 14 Nov 2024
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 12 Oct 2024
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage in the context of Google Cloud Platform (GCP) based triggering of resources, applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 12 Oct 2024
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 12 Oct 2024
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 08 Nov 2024
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 11 Oct 2024
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 12 Oct 2024
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 03 Aug 2024
https://github.com/coatless/bigquery-reddit-ask-your-advisor
Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")
ask-your-advisor bigquery r reddit
Last synced: 11 Oct 2024
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 13 Oct 2024
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 27 Oct 2024
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job subscribed to a PubSub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 10 Nov 2024
https://github.com/miguelapp10/api_simpliroute_urbano
Extract data from the SimpliRoute and Urbano APIs for a specific date range and process it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 12 Oct 2024
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 05 Nov 2024
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 31 Oct 2024
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 13 Oct 2024