Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries allow the use of widely known languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, making it flexible to read data from external sources.

📖 A highly rated canonical book on it is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read on the subject is the inside story told in the article by the founding product manager of BigQuery celebrating its 10th anniversary.

https://github.com/seahrh/nyc-taxi-trips

REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7

bigquery nyc-taxi-dataset play-framework rest-api scala

Last synced: 08 Dec 2024

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 09 Jan 2025

https://github.com/rifa8/extract-load-demo

Learning Google Cloud Platform (GCP)

airbyte bigquery bucket gcp

Last synced: 27 Jan 2025

https://gitlab.com/solidninja/albion

A Scala BigQuery client

bigquery scala

Last synced: 11 Dec 2024

https://github.com/knands42/data-ingestion

Data Ingestion project to evaluate my Kotlin skill using concurrency

bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow

Last synced: 25 Jan 2025

https://github.com/sayed-ashfaq/target-sql

In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.

aggregation bigquery cte joins sql

Last synced: 23 Dec 2024

https://github.com/squidmin/bigquery-labs

GCP BigQuery CLI

bigquery gcp java

Last synced: 14 Dec 2024

https://github.com/owox/sgtm-owox-ga4-bigquery

OWOX BI Streaming is an advanced tracking to get the most from existing Google Analytics 4 installed on your website

analytics bigquery ga4

Last synced: 20 Dec 2024

https://github.com/patriciavalentine/loan-data-queries

In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as the credit scores, and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio, thus helping to inform strategic decisions.

asset-finance bigquery loan-data looker-studio

Last synced: 09 Dec 2024

https://github.com/janmin123/cyclistic

Capstone project for Google/Coursera Data Analytics Course

analysis bigquery sql tableau visualization

Last synced: 09 Dec 2024

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged Big Query and MySQL to analyze 100K records for sales optimization, trend identification, and enhancing customer satisfaction for a retail brand in South America and to provide insights and recommendations to improve their userbase and improve their services

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 09 Dec 2024

https://github.com/minhajuddin2510/bigquery_alerts

In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this

alerting bigquery cloudfunctions monitoring-tool slack

Last synced: 04 Dec 2024

https://github.com/andrewm4894/gcp-telemetry-example

Simple HTTP endpoint for telemetry data type events in GCP.

bigquery gcp-cloud-functions gcp-storage python terraform

Last synced: 05 Dec 2024

https://github.com/jasontanx/ridership-headline-project

This end to end data engineering / data analytics project will be about the Malaysian public transport ridership data.

bigquery data-engineering minio-server public-transport-ridership terraform

Last synced: 05 Dec 2024

https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game

This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.

ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql

Last synced: 26 Jan 2025

https://github.com/jasontanx/terraform-practice

Creating datasets and tables in Google BigQuery via Terraform

bigquery iac-terraform infrastructure-as-code terraform

Last synced: 05 Dec 2024

https://github.com/alessio-siciliano/google-cloud-python-class-wrapper

An example of several classes written in Python to interact with GCP

bigquery datatransfer gcp google-cloud

Last synced: 26 Jan 2025

https://github.com/valenthr/purchase_funnel

Google merch store sales analysis

bigquery product-analysis sql

Last synced: 27 Jan 2025

https://github.com/machinelearningzuu/data-engineering-projects

This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.

airflow bigquery data-engineering data-science data-visualization data-warehouse

Last synced: 10 Dec 2024

https://github.com/sejalmankar1012/product_data_analyst_assessement

Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub

assessment-project bigquery loop product-analyst sql-query

Last synced: 26 Jan 2025

https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women

This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.

bigquery excel powerbi spreadsheet sql

Last synced: 10 Dec 2024

https://github.com/ivanildobarauna-dev/pypi-package-stats

Project for ingest pypi packages data from BigQuery and send to DataDog for analysis and insights with dashboards, monitors and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 11 Dec 2024

https://github.com/samanthalang/samanthalang_portfolio

Une data analyste avec la vision d'une consommatrice et la stratégie d'une marketeuse.

bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress

Last synced: 11 Dec 2024

https://github.com/francois-lenne/play-bq-gcp

Data pipeline in order to retrieve data from the playstation API to BigQuery

bigquery cicd data-engineering google-cloud python

Last synced: 13 Jan 2025

https://github.com/mchmarny/stocker

Using tweeter sentiment and stock market price signal correlation to predict next day closing price

bigquery ml prediction regression-models

Last synced: 31 Dec 2024

https://github.com/antbit96/dataform_poc

Template for basic data preparation

bigquery bigquery-dataform data-preparation

Last synced: 14 Dec 2024

https://github.com/scraly/bigquery

Google BigQuery AaaS tools, tips and fun

bigquery java

Last synced: 25 Dec 2024

https://github.com/davidkhala/dwh-migration-tools

dwh-migration-tools: contribution fork

bigquery bq gcp

Last synced: 23 Jan 2025

https://github.com/tosh2230/cdc-rds-bq

Change data capture from Amazon RDS to Google BigQuery

bigquery changedatacapture rds

Last synced: 21 Jan 2025

https://github.com/codingsancho/fastapi-bigquery

Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.

bigquery fastapi javascript python react

Last synced: 20 Dec 2024

https://github.com/adindasarianti/rakamin_kf_analytics

This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023

bigquery dataanalytics lookerstudio

Last synced: 02 Jan 2025

https://github.com/newtonmunene99/sec-filings

Simple golang app that crawls sec EDGAR filings and loads indices into Google BigQuery

bigquery cloudstorage gcp golang

Last synced: 21 Jan 2025

https://github.com/ivdatahub/pypi-package-stats

Project for ingest pypi packages data from BigQuery and send to DataDog for analysis and insights with dashboards, monitors and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 21 Nov 2024

https://github.com/push-protocol/push-google-bigquery

The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis

bigquery data push push-notifications web3

Last synced: 31 Jan 2025

https://github.com/rrmcguinness/protoc-gen-bq-schema

A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.

bigquery bigquery-schema protobuf

Last synced: 13 Jan 2025

https://github.com/giorgishengelia/bike-share-analysis-report

Help developing marketing strategy using data analytics to help convert casual riders into members

bigquery sql tableau

Last synced: 12 Dec 2024

https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery

Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.

bigquery csv-import google-apps-script google-cloud-storage

Last synced: 13 Jan 2025

https://github.com/kevin-rsj/real-estate-investments

Sistema de scoring que clasifica ciudades francesas para inversión en segundas viviendas según perfil de riesgo(alto, moderado y bajo). Evalúa ratios clave en áreas como demanda, disponibilidad, infraestructura, demografía y precios.

bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization

Last synced: 17 Dec 2024

https://github.com/tupizz/fiap_pnad-covid-19

Este projeto realiza a análise e transformação de dados da PNAD COVID-19 de maio a julho de 2020, utilizando PySpark para processamento de dados em larga escala e BigQuery como destino para armazenamento e análise posterior. O objetivo é consolidar os dados mensais em um único conjunto de dados transformado.

analysis bigquery pyspark python

Last synced: 17 Dec 2024

https://github.com/neo4j-field/dataflow-flex-pyarrow-to-gds

Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow

apache-arrow apache-beam bigquery dataflow neo4j python

Last synced: 23 Dec 2024

https://github.com/siriospa/gcp-helpers-bigquery

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform sirio

Last synced: 12 Oct 2024

https://github.com/istinnew/cook-me-up

Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.

bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning

Last synced: 17 Dec 2024

https://github.com/vaibhavs10/ml-on-gcp

The repository walks through a Data Scientist focused way of building and deploying Machine Learning models on Google Cloud

aiplatform bigquery googlecloudplatform ml

Last synced: 19 Dec 2024

https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis

This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.

bigquery gcp google-cloud-platform looker-studio sql

Last synced: 21 Nov 2024

https://github.com/jmfeck/bigquery-local-framework

This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.

bigquery data-engineering ingestion pandas python

Last synced: 18 Jan 2025

https://github.com/yu-iskw/bigquery-lineage

Visualize BigQuery data lineage graph

bigquery data-governance data-management visualization

Last synced: 17 Dec 2024

https://github.com/yu-iskw/homebrew-bigquery-to-datastore

A homebrew tap for bigquery-to-datastore

bigquery google-datastore homebrew

Last synced: 17 Dec 2024

https://github.com/amitkumarj441/mysql2bigquery

A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.

bigquery docker mysql scala

Last synced: 26 Jan 2025

https://github.com/nghiant3110/e_com_1

This is a DA project base on E-com Data set (Thelook_ecom) in Big Query from Google

bigquery looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/google_fiber_bi_5

This is a BI Capstone project based on the Google Fiber dataset from Google BI Course

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/raqssoriano/hha504_assignment_nosql_dbs

This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.

bigquery mongodb-atlas mongodbcompass redis redisinsight

Last synced: 18 Dec 2024

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Dec 2024

https://github.com/armahdavi/bigdata_pyspark_sales_analytics

Summarizing my big data code in python pyspark to analyze sales data with retail and walmart superstore to draw sales insights

big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning

Last synced: 28 Dec 2024

https://github.com/fsistemas/bigquery-td

ETL to extract data from mysql load and merge in BigQuery

bigquery etl mysql python sql

Last synced: 03 Jan 2025

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 25 Dec 2024

https://github.com/alexgenovese/machine-learning-bigquery-gcp

These SQL are based on available ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery.

bigquery google google-cloud-platform purchase sql visitors

Last synced: 28 Dec 2024

https://github.com/francois-lenne/elt-mp4-quiberon

the goal of this project is to retrieve the video of the municipality of quiberon and see if a person is in or no

bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data

Last synced: 25 Dec 2024

https://github.com/fakhri098/project-sql-bigquery

This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.

bigquery sql

Last synced: 17 Jan 2025

https://github.com/celiason/coffee-funnel

webpage for visualizing sales projections of a small coffee business

bigquery prophet sales-analysis streamlit-webapp

Last synced: 26 Dec 2024

https://github.com/ansh-info/stockpulse

Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.

api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit

Last synced: 27 Dec 2024

https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai

An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI

bigquery gcp tensorflow tfx vertex-ai

Last synced: 13 Jan 2025

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged Big Query and MySQL to analyze 100K records for sales optimization, trend identification, and enhancing customer satisfaction for a retail brand in South America and to provide insights and recommendations to improve their userbase and improve their services

bigquery database mysql-server sql

Last synced: 26 Dec 2024

https://github.com/mehbas92/sql

SQL Using BigQuery

bigquery sql

Last synced: 20 Dec 2024

https://github.com/epomatti/gcp-bigquery

Data sync via CDC from GCP Cloud SQL to Big Query using Datastream

bigquery cloud-sql datastream gcp

Last synced: 17 Jan 2025

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic Rental Service as the Capstone project for Google Data Analytics SpecializationAnalyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 18 Jan 2025

https://github.com/allanreda/video-processing-and-categorization

Video processing and categorization using computer vision, machine learning and cloud computing

bigquery cloud-storage-bucket cnn computer-vision google-cloud kmeans-clustering machine-learning opencv2 tensorflow virtual-machine

Last synced: 28 Dec 2024

https://github.com/itsubaki/hermes-lambda

Transfers AWS cost data to BigQuery

aws bigquery

Last synced: 14 Dec 2024

https://github.com/anyesh/gbq-helpers

GBQ related helper functions and snippets.

bigquery google

Last synced: 10 Jan 2025

https://github.com/manesioz/airflow-without-code

Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"

airflow airflow-plugin bigquery python sql

Last synced: 05 Jan 2025

https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink

Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster

aiven bigquery kafka-connect sink-connector terraform terraform-modules

Last synced: 17 Jan 2025

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 03 Jan 2025

https://github.com/mikeghen/metadata

Pulls data from Socrata open data portals

bigquery python socrata

Last synced: 27 Dec 2024

https://github.com/shikanime/seeker

Data platform based on BigQuery

bigquery dataform google-cloud

Last synced: 04 Jan 2025

https://github.com/marceloneppel/map-to-bigquery-structs

Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.

bigquery golang map struct

Last synced: 30 Jan 2025

https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform

Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.

analytics bigquery gcp sql

Last synced: 05 Jan 2025

https://github.com/khanovico/energy-data-analysis

This is the cloud model analyzing real world dataset with BigQuery and other big-data analyzing tools. I implemented docker image for running this app on cross-platform environments.

big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost

Last synced: 10 Oct 2024

https://github.com/simhayn/genomics-cannabis-bigquery

BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment

big-data bigquery bioinformatics exploratory-data-analysis genomics python sql

Last synced: 22 Jan 2025

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 11 Jan 2025

https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer

Mackerel plugin to post bigquery's query result

bigquery mackerel-plugin

Last synced: 12 Oct 2024

https://github.com/santiago-giordano/aws-gcp-pipeline

Simple pipeline, downloads csv from aws bucket, does some transformations, creates tables in gcp bq, loads data, and runs queries

aws bigquery etl gcp jupyter pipeline python

Last synced: 12 Jan 2025

https://github.com/denisogr/kaggle-notebook-to-production

This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.

bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark

Last synced: 12 Jan 2025

https://github.com/justinjsd/analytics-engineering

📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset

analytics bigquery dbt engineering sql

Last synced: 12 Jan 2025

https://github.com/shvetsihorr/sql-projects

SQL and Google BigQuery-Portfolio Projects

azuredatastudio bigquery mssql postgresql sql

Last synced: 18 Jan 2025

https://github.com/smohanta23/uber_data-engineering_etl-project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api

Last synced: 21 Jan 2025

https://github.com/lucashomuniz/project-22

[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio

agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability

Last synced: 25 Jan 2025

BigQuery Awesome Lists
BigQuery Categories