Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, which lets BigQuery read data from external sources.
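As a rough illustration of the SQL-plus-client-library workflow described above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package is installed and that Application Default Credentials are configured. The public table `bigquery-public-data.usa_names.usa_1910_2013` and the helper names `build_query` and `top_names` are chosen for illustration only and do not come from this page.

```python
from typing import Any, Dict, List, Optional

# Assumption: a public sample table, used here only for illustration.
TABLE = "bigquery-public-data.usa_names.usa_1910_2013"


def build_query(limit: int = 5) -> str:
    """Build a SQL query that ranks names by total count for one state.

    The state is passed as the named query parameter @state rather than
    being interpolated into the SQL text.
    """
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT name, SUM(number) AS total\n"
        f"FROM `{TABLE}`\n"
        "WHERE state = @state\n"
        "GROUP BY name\n"
        "ORDER BY total DESC\n"
        f"LIMIT {limit}"
    )


def top_names(
    state: str, limit: int = 5, project: Optional[str] = None
) -> List[Dict[str, Any]]:
    """Run the query through the BigQuery Python client (hypothetical helper).

    The client library is imported lazily so that build_query() can be used
    without google-cloud-bigquery being installed.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", state),
        ]
    )
    rows = client.query(build_query(limit), job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

With credentials in place, calling `top_names("TX")` would submit the query and return one dictionary per result row. Keeping the SQL construction separate from the client call makes it possible to inspect or test the query text without touching the service.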
📖 A highly rated book on BigQuery is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-07 00:03:32 UTC
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 08 Jan 2025
https://github.com/ritu456286/smartstockai
SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.
bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api
Last synced: 30 Jan 2025
https://github.com/salrashid123/gcp_cloud_status_dataset
BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)
bigquery gcp google-cloud google-cloud-platform
Last synced: 22 Jan 2025
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 24 Jan 2025
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 21 Jan 2025
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 22 Jan 2025
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images and saves resulting files into GCS bucket and data into BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 31 Dec 2024
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 21 Nov 2024
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 21 Nov 2024
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 29 Dec 2024
https://github.com/googlecloudplatform/dcm2bq
A service for creating a JSON metadata representation for DICOM from multiple input sources and storing it in Google Cloud BigQuery (BQ).
bigquery dicom gcs googlecloud googlecloudplatform googlecloudstorage json
Last synced: 28 Jan 2025
https://github.com/mahendra077/ml_model_using_gcp_bigquery
ML using BigQuery
Last synced: 01 Feb 2025
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 17 Jan 2025
https://github.com/yaph/queries
Collection of Data Queries in SPARQL and SQL
bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata
Last synced: 08 Jan 2025
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 15 Jan 2025
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 22 Jan 2025
https://github.com/miguelapp10/api_simpliroute_urbano
Extracts data from the SimpliRoute and Urbano APIs over a specific date range and processes it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 21 Nov 2024
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 21 Jan 2025
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 22 Dec 2024
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 11 Jan 2025
https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin
A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.
big-data-analytics bigquery datawarehouse googlelooker sql
Last synced: 23 Jan 2025
https://github.com/danlessa/meta_qa
A practical one-liner metalanguage for describing common sense in a machine-friendly way.
Last synced: 08 Feb 2025
https://github.com/mehmoodulhaq570/bigquery_machine_learning_project
Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.
bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python
Last synced: 22 Dec 2024
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/tosh2230/pubsub-dataflow-bigquery
Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.
bigquery dataflow gcp google-cloud google-cloud-platform pubsub
Last synced: 21 Jan 2025
https://github.com/romange/puma
Bigquery-like engine for processing structured json-like records
Last synced: 23 Jan 2025
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 21 Nov 2024
https://github.com/i62navpm/vue-front-app
Front app + Google cloud tools
appengine bigquery firebase firestore-database google-cloud google-functions google-storage puppeteer
Last synced: 17 Jan 2025
https://github.com/thunchanokbow/inventory-amazon
Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.
azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3
Last synced: 09 Jan 2025
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 21 Jan 2025
https://github.com/miguelapp10/etl_operadorlogistico
Extracts data from the SimpliRoute, AndesExpress, and Urbano APIs over a specific date range and processes it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 20 Dec 2024
https://github.com/thunchanokbow/audiblebook-revenue
Manage big data on cloud computing to find a list of best-selling audible books, generate reports and dashboards, and provide products and sales promotions that meet the needs of consumers in Thailand
apache-airflow bigquery cloudcomposer data-visualization datalake datawarehouse googlecloudstorage lookerstudio pandas python3
Last synced: 09 Jan 2025
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 07 Feb 2025
https://github.com/nais/bqrator
Operator for creating BigQuery datasets
bigquery bigquery-operator kubernetes kubernetes-operator nais-features
Last synced: 04 Feb 2025
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 22 Jan 2025
https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart
Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.
bigquery dbt e-commerce quickstarts
Last synced: 17 Jan 2025
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 21 Jan 2025
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 25 Dec 2024
https://github.com/squidmin/java11-spring-gradle-bigquery-reference
Java v11 ⋅ Spring v2 ⋅ Gradle ⋅ BigQuery
bigquery gradle gradle-java java java-gradle java11 java11-spring-boot spring spring-boot-2 spring-mvc spring-rest
Last synced: 22 Jan 2025
https://github.com/ansh-info/stockpulse
Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.
api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit
Last synced: 27 Dec 2024
https://github.com/sintef/bigquery-postgresql-wire-proxy
A PostgreSQL wire protocol proxy server for BigQuery.
Last synced: 12 Jan 2025
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 18 Dec 2024
https://github.com/googleapis/google-cloud-cpp-bigquery
C++ Client Library for Google Cloud BigQuery
bigquery cloud cpp cpp17 google google-cloud-bigquery google-cloud-platform
Last synced: 01 Feb 2025
https://github.com/sejalmankar1012/product_data_analyst_assessement
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assessment-project bigquery loop product-analyst sql-query
Last synced: 26 Jan 2025
https://github.com/alessio-siciliano/google-cloud-python-class-wrapper
An example of several classes written in Python to interact with GCP
bigquery datatransfer gcp google-cloud
Last synced: 26 Jan 2025
https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game
This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.
ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql
Last synced: 26 Jan 2025
https://github.com/richardbnk/data_tools
Python Library to Accelerate Creation of Data ETL Processes on multiple database systems.
Last synced: 02 Feb 2025
https://github.com/hariprasath-v/mh_google_cloud_bigquery_ltv_prediction_challenge
Build a model that can predict customers' Long Term Value (LTV).
bigquery colab-notebook klib machine-learning matplotlib numpy pandas python python3 seaborn
Last synced: 13 Jan 2025
https://github.com/jasontanx/ridership-headline-project
This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 01 Feb 2025
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 26 Jan 2025
https://github.com/spacepatcher/google-workspace-gmail-collector
👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka
bigquery gmail google-workspace security soc
Last synced: 23 Oct 2024
https://github.com/yeha98555/google-maps-analysis-pipeline
Taiwan Travel Attractions Analysis Data Pipeline
airflow bigquery cloudfunctions docker gcp gcs googlemaps googlesheets python terraform
Last synced: 23 Jan 2025
https://github.com/smohanta23/uber_data-engineering_etl-project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api
Last synced: 21 Jan 2025
https://github.com/victorelexpe/bq-schema-sync
bigquery gcp google-cloud python schema sync
Last synced: 12 Oct 2024
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin to post BigQuery query results
Last synced: 12 Oct 2024
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 22 Jan 2025
https://github.com/khanovico/energy-data-analysis
A cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. A Docker image is provided for running the app in cross-platform environments.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 10 Oct 2024
https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform
Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.
Last synced: 05 Jan 2025
https://github.com/mikeghen/metadata
Pulls data from Socrata open data portals
Last synced: 27 Dec 2024
https://github.com/manesioz/airflow-without-code
Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"
airflow airflow-plugin bigquery python sql
Last synced: 05 Jan 2025
https://github.com/mateuszk098/sql-queries
SQL Queries Training.
bigquery hackerrank-solutions query sql
Last synced: 28 Dec 2024
https://github.com/allanreda/video-processing-and-categorization
Video processing and categorization using computer vision, machine learning and cloud computing
bigquery cloud-storage-bucket cnn computer-vision google-cloud kmeans-clustering machine-learning opencv2 tensorflow virtual-machine
Last synced: 28 Dec 2024
https://github.com/yiu31802/gcp-project
GCP AppEngine project of Twitter data and some sample code
appengine bigquery gcp google-bigquery google-cloud google-datastore resas twitter twitter-data twitter4j
Last synced: 02 Feb 2025
https://github.com/mysto-007/cyclistic-bike-share-analysis
Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)
bigquery data-analysis excel ms-sql-server sql tableau tableau-public
Last synced: 18 Jan 2025
https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai
An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI
bigquery gcp tensorflow tfx vertex-ai
Last synced: 13 Jan 2025
https://github.com/alexgenovese/machine-learning-bigquery-gcp
These SQL queries are based on an available e-commerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery.
bigquery google google-cloud-platform purchase sql visitors
Last synced: 28 Dec 2024
https://github.com/armahdavi/bigdata_pyspark_sales_analytics
A summary of my big data code in Python (PySpark) for analyzing retail and Walmart superstore sales data to draw sales insights
big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning
Last synced: 28 Dec 2024
https://github.com/amitkumarj441/mysql2bigquery
A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.
Last synced: 26 Jan 2025
https://github.com/jmfeck/bigquery-local-framework
This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.
bigquery data-engineering ingestion pandas python
Last synced: 18 Jan 2025
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 21 Nov 2024
https://github.com/siriospa/gcp-helpers-bigquery
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform sirio
Last synced: 12 Oct 2024
https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery
Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.
bigquery csv-import google-apps-script google-cloud-storage
Last synced: 13 Jan 2025
https://github.com/rrmcguinness/protoc-gen-bq-schema
A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.
bigquery bigquery-schema protobuf
Last synced: 13 Jan 2025
https://github.com/simoun-asmar/clinipet_project
BigQuery
bigquery looker-studio lookerstudio sql
Last synced: 13 Jan 2025
https://github.com/paulveillard/cybersecurity-analytics
An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.
analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization
Last synced: 02 Feb 2025
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 31 Jan 2025
https://github.com/ivdatahub/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 21 Nov 2024
https://github.com/newtonmunene99/sec-filings
Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery
bigquery cloudstorage gcp golang
Last synced: 21 Jan 2025
https://github.com/adindasarianti/rakamin_kf_analytics
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 02 Jan 2025
https://github.com/codingsancho/fastapi-bigquery
Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.
bigquery fastapi javascript python react
Last synced: 20 Dec 2024
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 21 Jan 2025
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 25 Dec 2024
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 02 Feb 2025
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 14 Dec 2024
https://github.com/francois-lenne/play-bq-gcp
Data pipeline that retrieves data from the PlayStation API and loads it into BigQuery
bigquery cicd data-engineering google-cloud python
Last synced: 13 Jan 2025
https://github.com/xxmadkillerx10/data-engineering-zoomcamp
The Data Engineering Zoomcamp covers essential skills in containerization, workflow orchestration, data warehousing, analytics engineering, batch, and streaming processing. It includes tools like Docker, Terraform, BigQuery, dbt, Spark, Kafka, Kestra, Postgres, Google Data Studio, and Metabase.
airflow bigquery data-visualization dbt dbt-clickhouse docker-compose etl gcs google-cloud kafka postgresql spark sql streaming
Last synced: 03 Feb 2025
https://github.com/chiamakaukwuoma/portfolio
This repository contains various projects I've been privileged to work on outside of work.
aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau
Last synced: 03 Feb 2025
https://github.com/valenthr/purchase_funnel
Google merch store sales analysis
Last synced: 27 Jan 2025