Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
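As a rough illustration of the federated queries mentioned above, the sketch below builds a statement around BigQuery's `EXTERNAL_QUERY` function, which forwards an inner query to an external database through a named connection. The helper name `build_federated_query`, the connection ID `my-project.us.orders-db`, and the `orders` table are all hypothetical; the code only assembles SQL text and does not contact BigQuery.

```python
def build_federated_query(connection_id: str, external_sql: str) -> str:
    """Wrap a query for an external database in BigQuery's EXTERNAL_QUERY.

    Hypothetical helper for illustration; it only builds SQL text.
    """
    # Keep the sketch simple: reject input that would need escaping inside
    # the quoted literals instead of trying to escape it.
    if "'" in connection_id or "\\" in connection_id:
        raise ValueError("connection_id must not contain quotes or backslashes")
    if '"""' in external_sql or "\\" in external_sql or external_sql.endswith('"'):
        raise ValueError("external_sql contains characters this sketch does not escape")
    # The inner query is passed as a triple-quoted string literal.
    return (
        f"SELECT * FROM EXTERNAL_QUERY('{connection_id}', "
        f'"""{external_sql}""")'
    )


# Assumed connection ID and table name, for illustration only.
sql = build_federated_query("my-project.us.orders-db", "SELECT id, total FROM orders")
print(sql)
```

The resulting string could then be submitted through any of the client libraries, for example the Python client's `Client.query` method, assuming credentials and an existing connection resource.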
📖 A highly rated canonical book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the article in which BigQuery’s founding product manager tells the inside story, published for its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-01-30 00:03:10 UTC
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 14 Dec 2024
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 11 Jan 2025
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 22 Jan 2025
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 31 Dec 2024
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 22 Dec 2024
https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin
A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.
big-data-analytics bigquery datawarehouse googlelooker sql
Last synced: 23 Jan 2025
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 08 Jan 2025
https://github.com/analyticace/data-engineering-projects
Collection of Open Source Data Engineering Projects
aws big-data bigquery data docker engineering etl oracle-database pipeline sql
Last synced: 22 Dec 2024
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 22 Jan 2025
https://github.com/salrashid123/gcp_cloud_status_dataset
BigQuery Dataset to query GCP Cloud Status Dashboard (https://status.cloud.google.com/)
bigquery gcp google-cloud google-cloud-platform
Last synced: 22 Jan 2025
https://github.com/tatamiya/new-books-notification
Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and post notifications about them to Slack
bigquery cloudrun-jobs gcs golang slack
Last synced: 12 Jan 2025
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 12 Jan 2025
https://github.com/yaph/queries
Collection of Data Queries in SPARQL and SQL
bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata
Last synced: 08 Jan 2025
https://github.com/i62navpm/vue-front-app
Front app + Google cloud tools
appengine bigquery firebase firestore-database google-cloud google-functions google-storage puppeteer
Last synced: 17 Jan 2025
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage on Google Cloud Platform (GCP) for trigger-based resources applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 18 Jan 2025
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 21 Jan 2025
https://github.com/tosh2230/pubsub-dataflow-bigquery
Google Cloud Dataflow for 'Exactly-Once' streaming insertion, from Google Cloud Pub/Sub to Google BigQuery.
bigquery dataflow gcp google-cloud google-cloud-platform pubsub
Last synced: 21 Jan 2025
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 23 Jan 2025
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 19 Dec 2024
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 22 Jan 2025
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 21 Jan 2025
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 21 Nov 2024
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline to move an uploaded flat file from GCS, mask PII, store it in BigQuery, and create a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 21 Jan 2025
https://github.com/fabioba/netflix-analytics
Analyze personal Netflix usage
airflow astronomer bigquery google-cloud-platform netflix tableau
Last synced: 19 Jan 2025
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
This ML project uses linear regression to predict bike rental trip times for a new prediction system in a new mobile application. The ML datasets were collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 21 Jan 2025
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 20 Dec 2024
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images and saves the resulting files into a GCS bucket and the data into BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 31 Dec 2024
https://github.com/miguelapp10/workinghoursbetweentwodate_bigquery
This project is a working-hours calculator that determines the number of hours worked between two dates, taking into account business days and specified working hours, using BigQuery
bigquery bigquery-dataset bigquery-table querying sql sql-query
Last synced: 15 Jan 2025
https://github.com/miguelapp10/api_simpliroute_urbano
Extract data from the SimpliRoute and Urbano API over a specific date range and process it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 21 Nov 2024
https://github.com/nghiant3110/google_analytic_4
This is a DA project based on the GA4 sample dataset on BigQuery
bigquery google-analytics looker-studio sql
Last synced: 24 Dec 2024
https://github.com/tomgorb/project-template-for-production
Project template to (help) put a machine/deep learning algorithm into production
Last synced: 09 Jan 2025
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 18 Dec 2024
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 21 Jan 2025
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 18 Dec 2024
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 21 Jan 2025
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 21 Nov 2024
https://github.com/ivanildobarauna/ivanildobarauna
Special Repository to Make README
ai airflow big-data bigquery data-engineering gcp python
Last synced: 22 Jan 2025
https://github.com/juldrixx/bigquery-avro-schema-converter
Website to convert a schema from one format to another between BigQuery and Avro
avro avro-schema bigquery bigquery-schema converter schema
Last synced: 22 Jan 2025
https://github.com/branb97/jobstreet-data-eng-project
Building a data pipeline to deliver job listing data from Jobstreet for analysis.
airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql
Last synced: 22 Jan 2025
https://github.com/francois-lenne/play-bq-gcp
Data pipeline to load data from the PlayStation API into BigQuery
bigquery cicd data-engineering google-cloud python
Last synced: 13 Jan 2025
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 14 Dec 2024
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 25 Dec 2024
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 21 Jan 2025
https://github.com/codingsancho/fastapi-bigquery
Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.
bigquery fastapi javascript python react
Last synced: 20 Dec 2024
https://github.com/moeabbas6/bq_data_loader
A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.
Last synced: 29 Jan 2025
https://github.com/adindasarianti/rakamin_kf_analytics
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 02 Jan 2025
https://github.com/newtonmunene99/sec-filings
Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery
bigquery cloudstorage gcp golang
Last synced: 21 Jan 2025
https://github.com/ivdatahub/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 21 Nov 2024
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 08 Dec 2024
https://github.com/hckhanh/pg2bigquery
A CLI tool to convert queries from PostgreSQL to BigQuery
big bigquery google pg pgsql postgres postgres-tool postgresql postgresql-database postgressql query query-parser querybuilder sql sql-toolkit sql-tools tool toolbox toolkit utility
Last synced: 29 Jan 2025
https://github.com/marceloneppel/gcs-to-bigquery
WIP: Moving data from GCS to BigQuery.
Last synced: 30 Jan 2025
https://github.com/rafal-kowalski-dev/selling-cars-analize
Hobby project for learning PySpark, Airflow and BigQuery
airflow bigquery gcp pyspark python sqlalchemy
Last synced: 30 Jan 2025
https://github.com/seahrh/nyc-taxi-trips
REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7
bigquery nyc-taxi-dataset play-framework rest-api scala
Last synced: 08 Dec 2024
https://github.com/quipper/send-ci-result-to-bigquery-action
Send test results to BigQuery in GitHub Actions
bigquery github-actions google-bigquery junit-xml
Last synced: 09 Jan 2025
https://github.com/simoun-asmar/clinipet_project
BigQuery
bigquery looker-studio lookerstudio sql
Last synced: 13 Jan 2025
https://github.com/sayed-ashfaq/target-sql
In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.
aggregation bigquery cte joins sql
Last synced: 23 Dec 2024
https://github.com/rrmcguinness/protoc-gen-bq-schema
A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.
bigquery bigquery-schema protobuf
Last synced: 13 Jan 2025
https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery
Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.
bigquery csv-import google-apps-script google-cloud-storage
Last synced: 13 Jan 2025
https://github.com/nszoni/dbtgen
dbt: write nothing, generate (almost) everything.
analytics bigquery dbt documentation generative-ai github tooling
Last synced: 04 Dec 2024
https://github.com/patriciavalentine/loan-data-queries
In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.
asset-finance bigquery loan-data looker-studio
Last synced: 09 Dec 2024
https://github.com/janmin123/cyclistic
Capstone project for Google/Coursera Data Analytics Course
analysis bigquery sql tableau visualization
Last synced: 09 Dec 2024
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 04 Dec 2024
https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for a retail brand in South America, covering sales optimization, trend identification, and customer satisfaction, and to provide insights and recommendations for growing its user base and improving its services
bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql
Last synced: 09 Dec 2024
https://github.com/minhajuddin2510/bigquery_alerts
In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this
alerting bigquery cloudfunctions monitoring-tool slack
Last synced: 04 Dec 2024
https://github.com/andrewm4894/gcp-telemetry-example
Simple HTTP endpoint for telemetry data type events in GCP.
bigquery gcp-cloud-functions gcp-storage python terraform
Last synced: 05 Dec 2024
https://github.com/jasontanx/ridership-headline-project
This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 05 Dec 2024
https://github.com/jasontanx/terraform-practice
Creating datasets and tables in Google BigQuery via Terraform
bigquery iac-terraform infrastructure-as-code terraform
Last synced: 05 Dec 2024
https://github.com/machinelearningzuu/data-engineering-projects
This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.
airflow bigquery data-engineering data-science data-visualization data-warehouse
Last synced: 10 Dec 2024
https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women
This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.
bigquery excel powerbi spreadsheet sql
Last synced: 10 Dec 2024
https://github.com/ivanildobarauna-dev/pypi-package-stats
Project for ingesting PyPI package data from BigQuery and sending it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 11 Dec 2024
https://github.com/pawel045/big-tech-stocks
ETL project
big-data bigquery dataengineering etl
Last synced: 11 Dec 2024
https://github.com/samanthalang/samanthalang_portfolio
A data analyst with a consumer's perspective and a marketer's strategy.
bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress
Last synced: 11 Dec 2024
https://github.com/mchmarny/stocker
Using Twitter sentiment and stock market price signal correlation to predict the next-day closing price
bigquery ml prediction regression-models
Last synced: 31 Dec 2024
https://github.com/siriospa/gcp-helpers-bigquery
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform sirio
Last synced: 12 Oct 2024
https://github.com/davidkhala/dwh-migration-tools
dwh-migration-tools: contribution fork
Last synced: 23 Jan 2025
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 21 Nov 2024
https://github.com/jmfeck/bigquery-local-framework
This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for working with BigQuery from a local machine.
bigquery data-engineering ingestion pandas python
Last synced: 18 Jan 2025
https://github.com/amitkumarj441/mysql2bigquery
A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.
Last synced: 26 Jan 2025
https://github.com/armahdavi/bigdata_pyspark_sales_analytics
Summary of my big data code in Python (PySpark) for analyzing retail and Walmart superstore sales data to draw sales insights
big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning
Last synced: 28 Dec 2024
https://github.com/alexgenovese/machine-learning-bigquery-gcp
These SQL queries are based on an available ecommerce dataset with millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery.
bigquery google google-cloud-platform purchase sql visitors
Last synced: 28 Dec 2024
https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai
An end-to-end Chicago taxi example on Google Cloud using TensorFlow, TFX, and Vertex AI
bigquery gcp tensorflow tfx vertex-ai
Last synced: 13 Jan 2025
https://github.com/giorgishengelia/bike-share-analysis-report
Developing a marketing strategy with data analytics to help convert casual riders into members
Last synced: 12 Dec 2024
https://github.com/kevin-rsj/real-estate-investments
Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.
bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization
Last synced: 17 Dec 2024
https://github.com/tupizz/fiap_pnad-covid-19
This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.
analysis bigquery pyspark python
Last synced: 17 Dec 2024
https://github.com/neo4j-field/dataflow-flex-pyarrow-to-gds
Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow
apache-arrow apache-beam bigquery dataflow neo4j python
Last synced: 23 Dec 2024
https://github.com/lucashomuniz/project-22
[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio
agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability
Last synced: 25 Jan 2025
https://github.com/siobhan-doherty/ag_challenge
airflow bigquery csv-files data-engineering etl google-cloud-platform python sql
Last synced: 17 Dec 2024
https://github.com/istinnew/cook-me-up
Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.
bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning
Last synced: 17 Dec 2024
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 17 Dec 2024
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 17 Dec 2024
https://github.com/mysto-007/cyclistic-bike-share-analysis
Analyzed the dataset of the Cyclistic bike-share rental service as the capstone project for the Google Data Analytics Specialization
bigquery data-analysis excel ms-sql-server sql tableau tableau-public
Last synced: 18 Jan 2025