Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries allow the use of widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, making it flexible to read data from external sources.

📖 A highly rated canonical book on it is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read on the subject is the inside story told in the article by the founding product manager of BigQuery celebrating its 10th anniversary.

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 21 Jan 2025

https://github.com/hrialan/dataform-prune

An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.

automation bigquery data-analytics dataform

Last synced: 21 Nov 2024

https://github.com/sahilmb/employee-churn-da

A data analysis project on employee churn rate using Google Bigquery, Looker, Pycaret and Colab

bigquery looker-studio pycaret

Last synced: 21 Nov 2024

https://github.com/raqssoriano/hha504_assignment_nosql_dbs

This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.

bigquery mongodb-atlas mongodbcompass redis redisinsight

Last synced: 10 Feb 2025

https://github.com/ket0825/v1-gcp-preview

Preview 서비스를 위한 GCP 레포 / Manage GCP src for preview services

bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub

Last synced: 21 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Nov 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 21 Nov 2024

https://github.com/thanhloc81/sql-project-bicycles-practise

✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules

adventureworks bicycle bigquery google-cloud sql

Last synced: 21 Jan 2025

https://github.com/nghiant3110/google_fiber_bi_5

This is a BI Capstone project based on the Google Fiber dataset from Google BI Course

bigquery google-sheets looker-studio sql

Last synced: 15 Feb 2025

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 11 Jan 2025

https://github.com/goosethedev/de-zoomcamp-2025

Homeworks for the DataTalksClub's Data Engineering Zoomcamp 2025.

bigquery data-engineering kestra python terraform

Last synced: 12 Feb 2025

https://github.com/codingsancho/fastapi-bigquery

Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.

bigquery fastapi javascript python react

Last synced: 12 Feb 2025

https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform

This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.

bigquery data data-analysis data-modeling live-streaming sql

Last synced: 27 Jan 2025

https://github.com/loinguyen3108/sportify-music-analysis

Engineered the streaming crawler pipeline using Kafka to extract, transform, and load Spotify data into PostgreSQL and ClickHouse for real-time analytics. Additionally, developed an automated batching pipeline using Airflow and Spark to efficiently ETL crawled data into BigQuery.

airflow bigquery clickhouse kafka pyspark spotify

Last synced: 14 Feb 2025

https://github.com/aafaf-arharas/de-zoomcamp2025

This repository contains my work completed during the Data Engineering Zoomcamp 2025

bigquery docker gcp kestra python sql terraform

Last synced: 14 Feb 2025

https://github.com/flowerinthenight/bqstream

A simple library to help facilitate streaming to BigQuery.

bigquery go golang streaming

Last synced: 08 Jan 2025

https://github.com/janmin123/cyclistic

Capstone project for Google/Coursera Data Analytics Course

analysis bigquery sql tableau visualization

Last synced: 04 Feb 2025

https://github.com/juldrixx/bigquery-avro-schema-converter

Website to convert a schema from one format to another between BigQuery and Avro

avro avro-schema bigquery bigquery-schema converter schema

Last synced: 22 Jan 2025

https://github.com/branb97/jobstreet-data-eng-project

Building a data pipeline to deliver job listing data from Jobstreet for analysis.

airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql

Last synced: 22 Jan 2025

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged Big Query and MySQL to analyze 100K records for sales optimization, trend identification, and enhancing customer satisfaction for a retail brand in South America and to provide insights and recommendations to improve their userbase and improve their services

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 04 Feb 2025

https://github.com/tuancamtbtx/dataform-utils

Bigquery Dataform Javascript Utils Package - Support Ads, Query Common, ...

bigquery dataform datawarehouse

Last synced: 17 Feb 2025

https://github.com/patriciavalentine/loan-data-queries

In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as the credit scores, and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio, thus helping to inform strategic decisions.

asset-finance bigquery loan-data looker-studio

Last synced: 04 Feb 2025

https://github.com/rrmcguinness/protoc-gen-bq-schema

A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.

bigquery bigquery-schema protobuf

Last synced: 13 Jan 2025

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 29 Jan 2025

https://github.com/vigneshss-07/cloud-bigquery-and-sql---the-interview-guide

This deals with SQL commands, interview preparation and query questions and solutions in BigQuery

azuresql bigquery gcp sql sql-query sql-server sqlalchemy

Last synced: 15 Feb 2025

https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery

Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.

bigquery csv-import google-apps-script google-cloud-storage

Last synced: 13 Jan 2025

https://github.com/sejalmankar1012/product_data_analyst_assessement

Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub

assessment-project bigquery loop product-analyst sql-query

Last synced: 26 Jan 2025

https://github.com/kellyjadams/ap-exam-scores

Analyzing AP exam scores for a school.

bigquery sql

Last synced: 08 Jan 2025

https://github.com/hayashi-yudai/cloudfunc_login

Example of authentication function for login with Cloud Functions and BigQuery

bigquery gcp-cloud-functions golang server

Last synced: 15 Jan 2025

https://github.com/giorgishengelia/bike-share-analysis-report

Help developing marketing strategy using data analytics to help convert casual riders into members

bigquery sql tableau

Last synced: 05 Feb 2025

https://github.com/mehbas92/sql

SQL Using BigQuery

bigquery sql

Last synced: 13 Feb 2025

https://github.com/squidmin/bigquery-labs

GCP BigQuery CLI

bigquery gcp java

Last synced: 07 Feb 2025

https://github.com/angulartist/scio-demo

Playing w/ Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/riju18/airflow-data-engineering-with-bigquery-and-dbt

Fetch Data from a simple csv file, send the data in GCP BigQuery table, run dbt to automate the DWH and run SODA to check Data Quality.

apache-airflow bigquery csv dbt python3 soda

Last synced: 28 Jan 2025

https://github.com/marceloneppel/gcs-to-bigquery

WIP: Moving data from GCS to BigQuery.

bigquery gcs scala scio

Last synced: 30 Jan 2025

https://github.com/bsrikanth24/etl-pipeline-project-sales

This project implements a pipeline ETL to process fictitious sales data.

bigquery pandas-dataframe python

Last synced: 06 Feb 2025

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 19 Feb 2025

https://github.com/siriospa/gcp-helpers-bigquery

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform sirio

Last synced: 16 Feb 2025

https://github.com/ivanildobarauna/pypi-package-stats

Project for ingest pypi packages data from BigQuery and send to DataDog for analysis and insights with dashboards, monitors and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 19 Feb 2025

https://github.com/sangnandar/insert-unique-record

This is Cloud Functions script to insert only unique records into BigQuery.

bigquery digital-marketing-analytics google-cloud-functions

Last synced: 19 Feb 2025

https://github.com/lucashomuniz/project-22

[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio

agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability

Last synced: 25 Jan 2025

https://github.com/rafal-kowalski-dev/selling-cars-analize

Hobby project for learning PySpark, AirFlow and BigQuery

airflow bigquery gcp pyspark python sqlalchemy

Last synced: 30 Jan 2025

https://github.com/seahrh/nyc-taxi-trips

REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7

bigquery nyc-taxi-dataset play-framework rest-api scala

Last synced: 03 Feb 2025

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 09 Jan 2025

https://github.com/akihokurino/dbt-sample

dbt sample

bigquery dbt python3

Last synced: 07 Feb 2025

https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis

This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.

bigquery gcp google-cloud-platform looker-studio sql

Last synced: 21 Nov 2024

https://github.com/jmfeck/bigquery-local-framework

This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.

bigquery data-engineering ingestion pandas python

Last synced: 18 Jan 2025

https://github.com/owox/sgtm-owox-ga4-bigquery

OWOX BI Streaming is an advanced tracking to get the most from existing Google Analytics 4 installed on your website

analytics bigquery ga4

Last synced: 13 Feb 2025

https://github.com/tharun2806/end-to-end-internship-data-analysis

Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard

bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public

Last synced: 07 Feb 2025

https://github.com/ackeecz/terraform-gcp-cloud-run_pubsub_to_bq

Cloud Run subscribes itself to given topic and inserts each message to BigQuery table.

bigquery gcp pubsub terraform

Last synced: 07 Jan 2025

https://github.com/allanreda/share-of-search-retrieval-and-visualization

Share of search analysis including data retrieval from Google Ads API, storing data in BigQuery and visualizing it in Looker Studio

bigquery google-ads-api looker-studio python share-of-search

Last synced: 18 Feb 2025

https://github.com/allanreda/video-processing-and-categorization

Video processing and categorization using computer vision, machine learning and cloud computing

bigquery cloud-storage-bucket cnn computer-vision google-cloud kmeans-clustering machine-learning opencv2 tensorflow virtual-machine

Last synced: 18 Feb 2025

https://github.com/amitkumarj441/mysql2bigquery

A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.

bigquery docker mysql scala

Last synced: 26 Jan 2025

https://github.com/fsistemas/bigquery-td

ETL to extract data from mysql load and merge in BigQuery

bigquery etl mysql python sql

Last synced: 03 Jan 2025

https://github.com/isaacmg/mimic_iv_bq_queries

Queries needed to recreate time series features for model training

bigquery mimic-iv sql

Last synced: 21 Jan 2025

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 21 Jan 2025

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 21 Jan 2025

https://github.com/nlgtuankiet/bq-noti

BigQuery notification

bigquery bq notification notifier

Last synced: 21 Jan 2025

https://github.com/denny-b-justin/purdue

The internship was broadly to understand if the topics/events are being covered differently in the different countries and how they affect stock market returns. The provided dataset is a post-processed set of news articles, so already reflects topic modelling and sentiment analysis.

big-data bigquery finance gdelt-events python

Last synced: 21 Jan 2025

https://github.com/marcopellegrinoit/web-traffic-time-series-predictions

Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Addionatel modeling available with ARIMA, LSTM, and Facebook Prophet.

arima bigquery gcp lstm prophet-model time-series vertex-ai

Last synced: 21 Jan 2025

https://github.com/gabrieladados/analise-ecommerce

Análise SQL para E-commerce: Estratégias de Crescimento para Impulsionar Vendas

bigquery data-analysis ecommerce sql

Last synced: 06 Feb 2025

https://github.com/greatwoman23/car_insurance_analysis

The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.

bigquery data data-science dataanalytics insurance-claims looker-studio tableau

Last synced: 21 Jan 2025

https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai

An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI

bigquery gcp tensorflow tfx vertex-ai

Last synced: 13 Jan 2025

https://github.com/vaibhavs10/ml-on-gcp

The repository walks through a Data Scientist focused way of building and deploying Machine Learning models on Google Cloud

aiplatform bigquery googlecloudplatform ml

Last synced: 06 Feb 2025

https://github.com/rubnsbarbosa/nasa-asteroids-extractor

ETL asteroids data extractor using some Google Cloud services

bigquery bucket cloud-storage google-cloud nasa-api-neows

Last synced: 21 Jan 2025

https://github.com/kmohamedalie/bigquery-intro

Coursera BigQuery Introduction using Covid19 dataset

bigquery coursera covid-19 datavisualization looker-studio sql

Last synced: 21 Jan 2025

https://github.com/machinelearningzuu/data-engineering-projects

This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.

airflow bigquery data-engineering data-science data-visualization data-warehouse

Last synced: 04 Feb 2025

https://github.com/istinnew/cook-me-up

[In Progress] Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.

bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning

Last synced: 10 Feb 2025

https://github.com/minhajuddin2510/bigquery_alerts

In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this

alerting bigquery cloudfunctions monitoring-tool slack

Last synced: 31 Jan 2025

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic Rental Service as the Capstone project for Google Data Analytics SpecializationAnalyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 18 Jan 2025

https://github.com/ivanildobarauna-dev/pypi-package-stats

Project for ingest pypi packages data from BigQuery and send to DataDog for analysis and insights with dashboards, monitors and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 11 Dec 2024

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 15 Feb 2025

https://github.com/cyber-programmer/web-traffic-analytics-ml-model

This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, exploratory data analysis.

bigquery logistic-regression machine-learning

Last synced: 21 Jan 2025

https://github.com/mchmarny/stocker

Using tweeter sentiment and stock market price signal correlation to predict next day closing price

bigquery ml prediction regression-models

Last synced: 31 Dec 2024

https://github.com/nghiant3110/e_com_1

This is a DA project base on E-com Data set (Thelook_ecom) in Big Query from Google

bigquery looker-studio sql

Last synced: 15 Feb 2025

https://github.com/jasontanx/terraform-practice

Creating datasets and tables in Google BigQuery via Terraform

bigquery iac-terraform infrastructure-as-code terraform

Last synced: 01 Feb 2025

https://github.com/yiu31802/gcp-project

GCP AppEngine project of Twitter data and some sample code

appengine bigquery gcp google-bigquery google-cloud google-datastore resas twitter twitter-data twitter4j

Last synced: 02 Feb 2025

https://github.com/plishka/blockchain_analysis

Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)

bigquery blockchain data-cleaning scraping-websites sql tableau

Last synced: 21 Jan 2025

https://github.com/celiason/coffee-funnel

webpage for visualizing sales projections of a small coffee business

bigquery prophet sales-analysis streamlit-webapp

Last synced: 17 Feb 2025

https://github.com/lawal-hash/olistelt

An end-to-end ELT data pipeline of the Brazilian olist e-commerce dataset using the modern data stack

airflow bigquery dbt dbt-core docker postgresql sql

Last synced: 21 Jan 2025

https://github.com/davidkhala/dwh-migration-tools

dwh-migration-tools: contribution fork

bigquery bq gcp

Last synced: 23 Jan 2025

https://github.com/nghiant3110/firebase_6

This is a DA project based on the Firebase Sample dataset on Big Query

bigquery firebase looker-studio sql

Last synced: 21 Jan 2025

https://github.com/manesioz/airflow-without-code

Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"

airflow airflow-plugin bigquery python sql

Last synced: 05 Jan 2025

https://github.com/anpandu/ps2bq

Stream insert GCP PubSub messages into BigQuery table.

bigquery golang pubsub

Last synced: 21 Jan 2025

BigQuery Awesome Lists
BigQuery Categories