Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
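As a rough illustration of the client libraries mentioned above, the following Python sketch runs a parameterized SQL query. It assumes the google-cloud-bigquery package is installed and Application Default Credentials are configured. The helper names build_top_words_sql and top_words are hypothetical, and the public sample table bigquery-public-data.samples.shakespeare is used only as an example dataset.

```python
def build_top_words_sql(limit: int = 5) -> str:
    """Build a standard-SQL query over a public sample table (illustrative only)."""
    if limit < 1:
        raise ValueError("limit must be positive")
    return (
        "SELECT word, SUM(word_count) AS total "
        "FROM `bigquery-public-data.samples.shakespeare` "
        "WHERE corpus = @corpus "  # named parameter, bound at query time
        "GROUP BY word ORDER BY total DESC "
        f"LIMIT {int(limit)}"
    )


def top_words(corpus: str, limit: int = 5):
    """Run the query and return (word, total) pairs.

    Assumes google-cloud-bigquery is installed and credentials are available.
    """
    # Imported lazily so the SQL builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up Application Default Credentials
    config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("corpus", "STRING", corpus)
        ]
    )
    rows = client.query(build_top_words_sql(limit), job_config=config).result()
    return [(row["word"], row["total"]) for row in rows]
```

Calling top_words("hamlet") would send the query to BigQuery and return the most frequent words in that corpus. Binding the corpus as a query parameter rather than formatting it into the SQL string avoids quoting problems and injection.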

📖 A highly rated book on the subject is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating BigQuery’s 10th anniversary.

https://github.com/ritu456286/smartstockai

SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.

bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api

Last synced: 30 Jan 2025

https://github.com/pmhalvor/whale-speech

A pipeline to map whale sightings to hydrophone audio

beam bigquery gcs mle model-as-a-service python tensorflow2

Last synced: 20 Dec 2024

https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger

Terraform usage for Google Cloud Platform (GCP) resource triggers applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.

bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform

Last synced: 18 Jan 2025

https://github.com/kyoshidajp/bqcop

Save on your BigQuery costs.

bigquery golang

Last synced: 21 Jan 2025

https://github.com/esanchezros/bigquery-maven-plugin

Maven plugin for managing BigQuery datasets, tables and views

bigquery java maven maven-plugin

Last synced: 22 Jan 2025

https://github.com/teraearlywine/sample_sql

The following repo contains samples of SQL code that can be referenced by future clients or employers.

bigquery database mysql sql

Last synced: 21 Jan 2025

https://github.com/metrics-pli/bigquery-export

Exports collected metrics to Google BigQuery

bigquery datastudio lighthouse metrics metrics-pli performance pupeteer

Last synced: 25 Jan 2025

https://github.com/elithrar/finding-bugs-with-bigquery

A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.

big-data bigquery bugs github golang open-source

Last synced: 24 Jan 2025

https://github.com/misicode/Kaggle-Intro_to_SQL

Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.

bigquery kaggle kaggle-intro-to-sql sql

Last synced: 23 Oct 2024

https://github.com/mehmoodulhaq570/bigquery_machine_learning_project

Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.

bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python

Last synced: 22 Dec 2024

https://github.com/tupizz/fiap_pnad-covid-19

This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.

analysis bigquery pyspark python

Last synced: 09 Feb 2025

https://github.com/essien1990/etl_pipeline_airflow

Creating pipelines using Python 3 and Apache Airflow to load tables into the Google BigQuery data warehouse

airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3

Last synced: 21 Jan 2025

https://github.com/rohitsanj/superset-dbt-demo

This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.

apache-superset bigquery dbt superset

Last synced: 23 Jan 2025

https://github.com/danlessa/meta_qa

A practical one-liner metalanguage for describing common sense in a machine-friendly way.

bigquery metalanguage

Last synced: 08 Feb 2025

https://github.com/mchirico/gmail

Inserts Gmail messages into BigQuery, then deletes them.

angular9 bigquery gcp gmail python3

Last synced: 23 Jan 2025

https://github.com/mlabarrere/pygquery

🐷 Multithread your data with Google BigQuery

bigquery dataframe google-bigquery multithreading pandas python

Last synced: 21 Jan 2025

https://github.com/fpopic/bigquery-schema-select

(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.

bigquery bigquery-schema scala sql

Last synced: 21 Jan 2025

https://github.com/dav009/bqt

Local unit tests for your BigQuery queries

bigquery bq data test unittest

Last synced: 21 Jan 2025

https://github.com/ostrokach/uniparc_xml_parser

UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).

bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences

Last synced: 21 Jan 2025

https://github.com/chandanpasunoori/event-sync

Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement.

bigquery golang-tools google-cloud-platform pubsub

Last synced: 19 Dec 2024

https://github.com/george-nyamao/gcp_etl_project

An ETL pipeline that moves an uploaded flat file from GCS, masks PII, stores the result in BigQuery, and creates a report in Looker.

airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler

Last synced: 21 Jan 2025

https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow

This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.

airflow bigquery dataengineering etl gcp googlecloudplatform

Last synced: 21 Jan 2025

https://github.com/tomgorb/project-template-for-production

Project template to (help) put a machine/deep learning algorithm into production

airflow bigquery gcp

Last synced: 09 Jan 2025

https://github.com/nghiant3110/google_analytic_4

This is a DA project based on the GA4 sample dataset on BigQuery

bigquery google-analytics looker-studio sql

Last synced: 24 Dec 2024

https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin

A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.

big-data-analytics bigquery datawarehouse googlelooker sql

Last synced: 23 Jan 2025

https://github.com/antoinegiraud/dataform_hypermarche

SQL repo orchestrated by Dataform for BigQuery

bigquery dataform

Last synced: 08 Jan 2025

https://github.com/alterra-greeve/de-capstone

Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer

bigquery cloud-function data-engineering docker googlefirebase looker-studio python

Last synced: 21 Nov 2024

https://github.com/cartodb/carto-auth

Python library to authenticate with CARTO

auth bigquery carto carto-dw oauth

Last synced: 12 Oct 2024

https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq

Dataflow job that subscribes to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.

bigquery dataflow pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/greenpeace/gpes-old-en-petitions-api-emulator

Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.

bigquery mysql petitions sqlite3

Last synced: 17 Nov 2024

https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator

Running BigQuery Query from Airflow using BigQueryExecuteOperator

airflow bigquery sql

Last synced: 19 Dec 2024

https://github.com/yaph/queries

Collection of Data Queries in SPARQL and SQL

bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata

Last synced: 08 Jan 2025

https://github.com/icarusso/bigqueryexporter

Export query data from Google BigQuery to a local machine

bigquery csv export python

Last synced: 21 Nov 2024

https://github.com/ajaxbarcelonacruyff/gcp_cost

Monitoring Google Cloud costs with Looker Studio.

bigquery googlecloud googlecloudplatform lookerstudio

Last synced: 25 Dec 2024

https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source

Creating GA4 session references in BigQuery.

bigquery ga4 googleanalytics

Last synced: 25 Dec 2024

https://github.com/yu-iskw/terraform-google-copy-bq-datasets

A terraform module to copy BigQuery datasets across regions

bigquery data-engineering google-cloud terraform

Last synced: 21 Dec 2024

https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart

Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.

bigquery dbt e-commerce quickstarts

Last synced: 17 Jan 2025

https://github.com/nais/bqrator

Operator for creating BigQuery datasets

bigquery bigquery-operator kubernetes kubernetes-operator nais-features

Last synced: 04 Feb 2025

https://github.com/gdbecker/dbtlabslearning

Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.

analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing

Last synced: 22 Jan 2025

https://github.com/samanthalang/samanthalang_portfolio

A data analyst with a consumer's perspective and a marketer's strategy.

bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress

Last synced: 05 Feb 2025

https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks

A collection of R notebooks to analyze data from the Digital Optimization Group Platform

ab-testing bigquery jupyter-notebook performance-analysis r web-analytics

Last synced: 21 Jan 2025

https://github.com/francois-lenne/play-bq-gcp

Data pipeline that retrieves data from the PlayStation API and loads it into BigQuery

bigquery cicd data-engineering google-cloud python

Last synced: 13 Jan 2025

https://github.com/ruru-lyy/nyc-taxi-service-pipeline

In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.

bigquery data-engineering data-modeling etl-pipeline looker mage-ai python

Last synced: 23 Jan 2025

https://github.com/edumoraes1/spam_count_sfmc

SQL query counting email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 31 Dec 2024

https://github.com/yoshiyukikato/nightharbor-bigquery-reporter

A nightharbor reporter for GCP BigQuery

bigquery lighthouse

Last synced: 23 Jan 2025

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 17 Nov 2024

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 21 Jan 2025

https://github.com/antbit96/dataform_poc

Template for basic data preparation

bigquery bigquery-dataform data-preparation

Last synced: 14 Dec 2024

https://github.com/lupusruber/music_analytics

This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.

bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming

Last synced: 02 Feb 2025

https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query

E-Commerce-Data-Analysis-SQL-Big-Query

bigquery sql

Last synced: 23 Jan 2025

https://gitlab.com/solidninja/albion

A Scala BigQuery client

bigquery scala

Last synced: 05 Feb 2025

https://github.com/govau/warcraider

Convert WARC files into Avro for big data processing

avro bigquery crawler rust warc

Last synced: 21 Jan 2025

https://github.com/xennis/particulate-matter-sensor-storage

Store the particulate matter data from a luftdaten.info sensor in BigQuery

bigquery cloud-function luftdaten particulate-matter sensor-data

Last synced: 18 Nov 2024

https://github.com/lixx21/airflow-dbt-gcp

A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.

airflow bigquery data-engineer dbt gcp

Last synced: 29 Jan 2025

https://github.com/oguzgn/fully-automated-performance-marketing-dashboard

This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.

bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql

Last synced: 29 Jan 2025

https://github.com/gabrieladados/people-analytics

People Analytics: Insights for Talent Retention

bigquery figma people-analytics sql tableau

Last synced: 29 Jan 2025

https://github.com/pratshrestha/cochin-traders---sql--sales-analysis

Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include

bigquery customer-engagement employee-performance inventory-management sales-trends sql

Last synced: 21 Jan 2025

https://github.com/scraly/bigquery

Google BigQuery AaaS tools, tips and fun

bigquery java

Last synced: 25 Dec 2024

https://github.com/brpy/nyc-trips

Data engineering | Zoomcamp journey on NYC trip data with the GCP stack

bigquery dbt gcp pyspark

Last synced: 22 Dec 2024

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 21 Jan 2025

https://github.com/tosh2230/cdc-rds-bq

Change data capture from Amazon RDS to Google BigQuery

bigquery changedatacapture rds

Last synced: 21 Jan 2025

https://github.com/codingsancho/fastapi-bigquery

Learning exercise: Python backend with FastAPI and BigQuery, React frontend.

bigquery fastapi javascript python react

Last synced: 20 Dec 2024

https://github.com/adindasarianti/rakamin_kf_analytics

This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023

bigquery dataanalytics lookerstudio

Last synced: 02 Jan 2025

https://github.com/phukon/package-insights

PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.

big-data bigquery dbt duckdb

Last synced: 27 Jan 2025

https://github.com/dobsontom/basket-abandonment

Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.

adobe-campaign bigquery ga4 gcp sql

Last synced: 21 Nov 2024

https://github.com/niteshchawla/nc-sql-business-case

A leading retail chain and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation, and an exceptional guest experience that no other retailer can deliver.

bigquery retail sql supermarket

Last synced: 21 Jan 2025

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 21 Jan 2025

https://github.com/hrialan/dataform-prune

An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.

automation bigquery data-analytics dataform

Last synced: 21 Nov 2024

https://github.com/sahilmb/employee-churn-da

A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret, and Colab

bigquery looker-studio pycaret

Last synced: 21 Nov 2024

https://github.com/akihokurino/dbt-sample

dbt sample

bigquery dbt python3

Last synced: 07 Feb 2025

https://github.com/ket0825/v1-gcp-preview

GCP repository for the Preview service / Manage GCP src for preview services

bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub

Last synced: 21 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Nov 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 21 Nov 2024

https://github.com/thanhloc81/sql-project-bicycles-practise

✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules

adventureworks bicycle bigquery google-cloud sql

Last synced: 21 Jan 2025

https://github.com/tharun2806/end-to-end-internship-data-analysis

Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard

bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public

Last synced: 07 Feb 2025

https://github.com/newtonmunene99/sec-filings

Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery

bigquery cloudstorage gcp golang

Last synced: 21 Jan 2025

https://github.com/juldrixx/bigquery-avro-schema-converter

Website to convert a schema from one format to another between BigQuery and Avro

avro avro-schema bigquery bigquery-schema converter schema

Last synced: 22 Jan 2025

https://github.com/branb97/jobstreet-data-eng-project

Building a data pipeline to deliver job listing data from Jobstreet for analysis.

airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql

Last synced: 22 Jan 2025

https://github.com/ivdatahub/pypi-package-stats

Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 21 Nov 2024

https://github.com/push-protocol/push-google-bigquery

The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis

bigquery data push push-notifications web3

Last synced: 31 Jan 2025

https://github.com/yu-iskw/homebrew-bigquery-to-datastore

A homebrew tap for bigquery-to-datastore

bigquery google-datastore homebrew

Last synced: 10 Feb 2025
