Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery lets companies analyze large amounts of data without having to manage infrastructure. Google's documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization's biggest questions with zero infrastructure management. BigQuery's scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated canonical book on it is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another enriching read is the inside story told by the founding product manager of BigQuery in an article celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-05 00:03:39 UTC
https://github.com/badal-io/dataflow-timeseries-iot-gas-demo
Dataflow code for integration with GCP Core IoT and FogLamp
Last synced: 11 Nov 2024
https://github.com/adam-cowley/neo4j-bigquery
Yo dawg, I heard you like queries so we put some BigQuery in your query so you can query BigQuery from your query
bigquery cypher neo4j neo4j-procedures
Last synced: 18 Dec 2024
https://github.com/wintermi/bqe-dataform
A Dataform project which aggregates BigQuery system metadata for the purpose of analysing the slot usage and storage within an organization by project.
bigquery dataform google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/wintermi/bq2csv
A command line application designed to provide a simple method to execute a BigQuery SQL script from "stdin", outputting all results to "stdout" in CSV format. A detailed log is output to the console "stderr" providing you with the available execution statistics.
bigquery google-cloud google-cloud-platform
Last synced: 12 Oct 2024
https://github.com/sigpwned/jdbq
JDBI-inspired Database Access Framework for Java + BigQuery
bigquery data-access-framework data-access-layer data-access-library data-lake java persistence persistence-framework persistence-layer
Last synced: 19 Nov 2024
https://github.com/yashika-malhotra/strategic-analysis-of-retail-brand-in-south-america-using-sql
Uses BigQuery and MySQL to analyze 100K records for a retail brand in South America, covering sales optimization, trend identification, and customer satisfaction, and provides insights and recommendations for growing its user base and improving its services.
bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql
Last synced: 14 Jan 2025
https://github.com/gjbae1212/go-bqworker
go-bqworker is an async worker that bulk-inserts and updates data in BigQuery.
async bigquery bigquery-bulk gcp go golang parallel worker
Last synced: 25 Dec 2024
https://github.com/greenpeace/gpes-bigquery-recipes
Google BigQuery recipes for analysing our data.
bigquery database-management sql
Last synced: 17 Nov 2024
https://github.com/wintermi/bqwrite-test
A command line application designed to provide a method to test the BigQuery Streaming API or BigQuery Storage Write API, allowing you to get a view of the potential throughput available via a given host.
bigquery google-cloud google-cloud-platform
Last synced: 09 Nov 2024
https://github.com/dmytrovoytko/data-engineering-amazon-reviews
Data Engineering project for ZoomCamp '24: JSONL -> PostgreSQL/BigQuery + Metabase + Mage.AI
bash-script bigquery codespaces data-analysis data-visualization etl metabase pipeline python-script
Last synced: 14 Nov 2024
https://github.com/neo4j-field/bigquery-connector
Bi-directional connectivity between Google BigQuery and Neo4j AuraDS
arrow-flight bigquery neo4j protobuf python spark
Last synced: 23 Dec 2024
https://github.com/sungchun12/image-labeling-and-translation-data-analysis-google-cloud
🏷️ Invokes cloud vision and translation APIs with Python
bigquery cloud-translation-api cloud-vision-api colaboratory google-cloud python sql
Last synced: 08 Jan 2025
https://github.com/stkchan/googleanalytics4-publicdataset-ecommerce-dashboard-powerbi
This dashboard uses Power BI Desktop as a visualization tool by extracting data from Google BigQuery.
analytics bigquery dashboard portfolio portfolio-project powerbi sql
Last synced: 13 Oct 2024
https://github.com/jolares/example-gcp-dataform
Example end-to-end ELT data pipeline using GCP Dataform.
bigquery dataform etl-pipeline
Last synced: 28 Nov 2024
https://github.com/pualien/py-gcloud-connectors
Utilities to simplify connection with Google APIs
bigquery data-analysis google-analytics google-analytics-4 google-cloud google-cloud-platform google-cloud-storage pandas python
Last synced: 12 Oct 2024
https://github.com/wintermi/fashion-dataform
An example Dataform project to load and transform the publicly available dataset from H&M Group into a format which could be imported into Discovery AI for Retail or Vertex AI Search and Conversation, allowing you to train a retail recommendations model.
bigquery dataform google-cloud google-cloud-platform vertex-ai
Last synced: 09 Nov 2024
https://github.com/indatawetrust/bigquery-api
A quick and easy to use package for Google Cloud BigQuery
Last synced: 13 Dec 2024
https://github.com/vigneshss-07/google-cloud-professional-data-engineer-acompleteguide
This Repo contains all study, lab and supportive materials for Udemy course on "Google Cloud Professional Data Engineer - A Complete Guide".
big-data bigquery cloud-computing dataengineering elt-pipeline etl-framework gcp-services gcp-storage google-cloud machine-learning
Last synced: 12 Oct 2024
https://github.com/42digital/bqtools
Python Tools for BigQuery
bigquery bigquery-schema migrations python
Last synced: 12 Oct 2024
https://github.com/googlecloudplatform/aicoe
This repository contains an end-to-end walkthrough that uses Google Cloud services to demonstrate Solution Accelerators for a few business domains.
aimlops bigquery dataflow googlecloudplatform mlops security-analytics solution-accelerators vertex-ai
Last synced: 05 Feb 2025
https://github.com/toddbirchard/bigquery-to-sql
📊 ➡️ 💾 Lightweight ETL script to migrate data from BigQuery to SQL.
bigquery etl google-cloud python sql sqlalchemy
Last synced: 16 Jan 2025
https://github.com/wintermi/bqrunner
A command line application designed to provide a simple method to execute one or more SQL queries against a given dataset in BigQuery. A detailed log is output to the console providing you with the available execution statistics.
bigquery google-cloud google-cloud-platform
Last synced: 24 Oct 2024
https://github.com/moh-ayman/mongodb-to-bigquery---cloud-func-etl
Google Cloud Function that runs an ETL job: it collects MongoDB data and transforms it so that it can be imported into BigQuery.
bigquery etl-pipeline gcp-cloud-functions mongodb pandas-python
Last synced: 15 Nov 2024
https://github.com/rezuankassim/bqanalytic
Laravel package for using analytics data imported into BigQuery from Firebase Analytics
bigquery firebase-analytics laravel
Last synced: 18 Dec 2024
https://github.com/airscholar/dbt-bigquery-crash-course
A deep dive into the powerful combination of DBT and BigQuery, the game-changers in modern data engineering.
bigquery data-engineering dbt google-cloud
Last synced: 14 Nov 2024
https://github.com/borisgra/fullweb
Full-stack web applications with React, Kotlin/JS, and npm (ag-grid). View and modify any view from any database (PostgreSQL, Google BigQuery)
bigquery compose docker dockerhub fullstack kotlin kotlin-fullstack kotlin-js kotlin-js-react kotlin-jvm kotlin-multiplatform ktor npm-module postgresql
Last synced: 25 Jan 2025
https://github.com/wintermi/tmdb-dataform
An example Dataform project to load and transform the publicly available dataset from The Movie Database into a format which could be imported into Vertex AI Search for Media, allowing you to build a search engine for movies.
bigquery dataform google-cloud google-cloud-platform
Last synced: 21 Nov 2024
https://github.com/dav009/dbtmock
End-to-end unit tests for dbt (data build tool) pipelines
bigquery data-build-tool dbt mock pipelines test testing unittest unittesting
Last synced: 12 Dec 2024
https://github.com/masmovil/rx-gcloud-connectors
bigquery datastore firestore gcloud gcloud-sdk pubsub reactive rxjava2 vertx
Last synced: 24 Jan 2025
https://github.com/pirate-emperor/k2bq
K2BQ is a dataflow pipeline that streams data from Kafka to BigQuery. It uses Google Cloud’s managed Kafka, Dataflow for processing, and BigQuery for real-time analytics, offering scalable, automated data integration for fast insights.
bigquery cloud-computing cloud-infrastructure data-integration data-streaming dataflow google-cloud infrastructure-as-code kafka python realtime-analytics terraform
Last synced: 17 Nov 2024
https://github.com/vertexclique/olayufku
Schema Registry for BigQuery
bigquery bigquery-schema migration schema-migrations schema-registry
Last synced: 28 Jan 2025
https://github.com/evry-ace/statsbot
Slack Bot to forward message statistics to BigQuery
bigquery slack slack-bot slackbot
Last synced: 12 Oct 2024
https://github.com/triglav-dataflow/triglav-agent-bigquery
BigQuery agent for Triglav, data-driven workflow tool
Last synced: 12 Oct 2024
https://github.com/paty-oliveira/dbt-playground
Repository for data modelling with dbt
analytics-engineering bigquery dbt docker-compose jinja postgresql python sql
Last synced: 06 Dec 2024
https://github.com/achint08/tech-diffusion
Patents data analysis on PySpark
bert big-data bigquery google-patents-dataset machine-learning nlp pagera patents-analysis pyspark
Last synced: 28 Jan 2025
https://github.com/jasontanx/gsheet-to-bq-ingestion
Data ingestion from Google Sheet to BigQuery
bigquery data-engineering data-ingestion gsheets
Last synced: 01 Feb 2025
https://github.com/t3n/gtmetrix-bq
A script that runs browser tests of specified URLs through GTmetrix and saves the metrics in BigQuery.
Last synced: 17 Dec 2024
https://github.com/ryanmcdowell/dataflow-bigquery-dynamic-destinations
An example pipeline for dynamically routing events from Pub/Sub to different BigQuery tables based on a message attribute.
apache-beam bigquery google-cloud-dataflow google-cloud-platform
Last synced: 21 Nov 2024
https://github.com/poogles/pytest-bq
pytest fixtures for a local bigquery suitable for local development.
bigquery bigquery-emulator pytest
Last synced: 26 Dec 2024
https://github.com/jehiah/socrata_to_bigquery
A tool to copy public data to BigQuery
Last synced: 23 Oct 2024
https://github.com/olajideolagunju/gcp_mage_data_pipeline
An end-to-end data pipeline solution to process and analyze Maintenance Work Orders using Mage, Google BigQuery, Cloud SQL, and Looker Studio. Features a seamless integration of cloud tools for scalable data storage, transformation, and visualization.
automation bigquery cloud cloud-sql compute-engine data data-engineering database database-schema docker-compose excel gcp mage-ai maintenance orchestration python sql virtual-machine visualization-dashboard work-orders
Last synced: 21 Oct 2024
https://github.com/galois1915/google-ml-engineer
This program provides the skills you need to advance your career and provides training to support your preparation for the industry-recognized Google Cloud Professional Machine Learning Engineer certification.
api automl bigquery keras mlops-workflow tensorflow2 vertex-ai
Last synced: 22 Jan 2025
https://github.com/leereilly/wee-queries
Query sets for Google Cloud Platform's BigQuery 🔍
Last synced: 24 Jan 2025
https://github.com/ovotech/bigquery-metrics-exporter
A Golang application to export table level metrics from BigQuery into Datadog.
Last synced: 05 Feb 2025
https://github.com/jasontanx/data-engineer-project-1
End-to-end data engineering project
airline bigquery data-engineering etl-pipeline looker-studio mage-ai
Last synced: 01 Feb 2025
https://github.com/leandronasx/agro-data
Final project for the data analyst and dashboard training program at SoulCode Academy.
bigquery data-analysis gcp looker pandas powerbi python
Last synced: 12 Oct 2024
https://github.com/akihokurino/recruitment-server-gae
Job listings API server. A Go application on Google App Engine (second generation). Uses Twirp for the API interface, sops with KMS to secure the environment, Cloud Build for CI/CD, and Algolia as the search engine. Syncs Datastore to BigQuery.
algolia bigquery boom cloud-scheduler cloudbuild datastore firebase-auth gae gcp golang grpc kms realtime-database twirp
Last synced: 01 Feb 2025
https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl
Google Cloud Function that runs an ETL job: it collects Stripe API data and transforms it so that it can be imported into BigQuery.
bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api
Last synced: 15 Jan 2025
https://github.com/aymane-maghouti/sales-data-pipeline
This ETL (Extract, Transform, Load) project demonstrates the process of extracting data from a SQL Server database, transforming it using Python, orchestrating the data pipeline with Apache Airflow (running in a Docker container), loading the transformed data into Google BigQuery data warehouse, and finally creating a dashboard using Looker Studio.
airflow bigquery etl-pipeline gcp looker-studio orchestrator python sql-server-database
Last synced: 17 Jan 2025
https://github.com/jroakes/npath
Exploring path sequences in GA4 BigQuery data
analytics bigquery pathfinding-algorithm
Last synced: 23 Dec 2024
https://github.com/brews/bucket2bq
Create an inventory of objects in a GCS bucket, with metadata, and upload it to BigQuery
bigquery gcp golang google-cloud-storage
Last synced: 20 Oct 2024
https://github.com/takegue/bigquery-porter
BigQuery Deployment and Metadata Management tool
Last synced: 12 Oct 2024
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 18 Jan 2025
https://github.com/tuancamtbtx/gcp-udfs-example
Google BigQuery JavaScript UDF examples
bigquery gcp javascript nodejs npm udf
Last synced: 02 Jan 2025
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 31 Dec 2024
https://github.com/shrawans007/google_cyclistic_2023
Google Data Analytics Capstone Case Study (SQL and Tableau)
big-query bigquery coursera-assignment cyclistic cyclistic-bike-share-analysis-case-study cyclistic-bikshare data-analysis data-analysis-project data-analytics data-cleaning data-combination data-exploration data-science google-data-analytics sql tableau tableau-dashboard tableau-public
Last synced: 08 Jan 2025
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 08 Jan 2025
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 17 Jan 2025
https://github.com/i62navpm/vue-front-app
Front app + Google cloud tools
appengine bigquery firebase firestore-database google-cloud google-functions google-storage puppeteer
Last synced: 17 Jan 2025
https://github.com/ritu456286/smartstockai
SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.
bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api
Last synced: 30 Jan 2025
https://github.com/metrics-pli/bigquery-export
Exports collected metrics to Google BigQuery
bigquery datastudio lighthouse metrics metrics-pli performance pupeteer
Last synced: 25 Jan 2025
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 21 Dec 2024
https://github.com/johannaojeling/go-data-ingestion
Cloud Function for ingesting data from Cloud Storage to BigQuery
bigquery cloud-functions cloud-storage go google-cloud
Last synced: 31 Jan 2025
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 22 Jan 2025
https://github.com/thunchanokbow/audiblebook-revenue
Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand
apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3
Last synced: 09 Jan 2025
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 09 Jan 2025
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 21 Nov 2024
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 25 Dec 2024
https://github.com/salrashid123/gcp_cloud_status_dataset
BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)
bigquery gcp google-cloud google-cloud-platform
Last synced: 22 Jan 2025
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images and saves resulting files into GCS bucket and data into BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 31 Dec 2024
https://github.com/miguelapp10/api_simpliroute_urbano
Extracts data from the SimpliRoute and Urbano APIs for a specific date range and processes it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 21 Nov 2024
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 21 Jan 2025
https://github.com/mahendra077/ml_model_using_gcp_bigquery
ML Using Big Query
Last synced: 01 Feb 2025
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 21 Jan 2025
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job subscribed to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 07 Jan 2025
https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
airflow bigquery dataengineering etl gcp googlecloudplatform
Last synced: 21 Jan 2025
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 21 Nov 2024
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 21 Jan 2025
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline that moves an uploaded flat file from GCS, masks PII, stores the data in BigQuery, and creates a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 21 Jan 2025
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
The ML project uses linear regression to predict the trip time of a bike rental for a new prediction system in a new mobile application. The ML datasets have been collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 21 Jan 2025
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 20 Dec 2024
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 21 Nov 2024
https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator
Running BigQuery Query from Airflow using BigQueryExecuteOperator
Last synced: 19 Dec 2024
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 19 Dec 2024
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 21 Jan 2025
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 19 Dec 2024