Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google's documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization's biggest questions with zero infrastructure management. BigQuery's scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. It also supports federated queries, which read data from external sources.

📖 A highly rated book on the subject is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article marking the product's 10th anniversary.

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 22 Jan 2025

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets users rate a given company product. After the rating, the bot runs sentiment analysis on the user's review and stores the result (along with the review text, user name, and product) in a BigQuery dataset.

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 22 Jan 2025

https://github.com/mutaharshaik/airflow_retail_project

A retail data pipeline project using Airflow, BigQuery, dbt, and Soda.

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 22 Jan 2025

https://github.com/victorcezeh/end-to-end-elt-pipeline

An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.

airflow bigquery dbt-core docker docker-compose postgresql python

Last synced: 22 Jan 2025

https://github.com/mattwelke/charter-challenge-for-fair-voting-bot

Bot that scrapes the donations received so far by the Charter Challenge for Fair Voting and logs them in BigQuery.

bigquery bot go openwhisk public-data

Last synced: 22 Jan 2025

https://github.com/nszoni/dbtgen

dbt: write nothing, generate (almost) everything.

analytics bigquery dbt documentation generative-ai github tooling

Last synced: 31 Jan 2025

https://github.com/lorinczakos/sql-projects

A collection of SQL scripts that I wrote and had approved during the GoIT Romania Data Analyst course.

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 28 Jan 2025

https://github.com/wooyakob/music-recommendation-engine

Using Gemini API to generate personalized music recommendations.

ai bigquery gemini-api google-cloud-platform

Last synced: 28 Jan 2025

https://github.com/tosh2230/bigquery-table-history

Diff daily changes by BigQuery INFORMATION_SCHEMA.PARTITIONS records.

bigquery

Last synced: 21 Jan 2025

https://github.com/phstudy/zetasketch-bigquery-example

An example demonstrating how to use ZetaSketch with BigQuery

bigquery hll java zetasketch

Last synced: 21 Jan 2025

https://github.com/ymyzk/bq-globalip

Record the current global IPv4 address to a BigQuery table.

bigquery golang

Last synced: 28 Jan 2025

https://github.com/garbetjie/phpunit-bigquery-schema

A BigQuery schema validator constraint for PHPUnit

bigquery phpunit

Last synced: 28 Jan 2025

https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women

This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.

bigquery excel powerbi spreadsheet sql

Last synced: 05 Feb 2025

https://github.com/vigneshss-07/complete-atoz-sql

Covers SQL commands, interview preparation, and query questions with solutions.

azuresql bigquery gcp sql sql-query sql-server sqlalchemy

Last synced: 15 Nov 2024

https://github.com/night-fury-me/real-time-vehicle-data-processing

A repository containing an implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.

bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing

Last synced: 22 Jan 2025

https://github.com/jancervenka/bqcli

REPL for BigQuery

bigquery data-science gcp google python

Last synced: 31 Dec 2024

https://github.com/samanthalang/samanthalang_portfolio

A data analyst with a consumer's perspective and a marketer's strategy.

bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress

Last synced: 05 Feb 2025

https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks

A collection of R notebooks to analyze data from the Digital Optimization Group Platform

ab-testing bigquery jupyter-notebook performance-analysis r web-analytics

Last synced: 21 Jan 2025

https://github.com/ruru-lyy/nyc-taxi-service-pipeline

In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.

bigquery data-engineering data-modeling etl-pipeline looker mage-ai python

Last synced: 23 Jan 2025

https://github.com/edumoraes1/spam_count_sfmc

SQL query that counts email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 31 Dec 2024

https://github.com/yoshiyukikato/nightharbor-bigquery-reporter

A nightharbor reporter for GCP BigQuery

bigquery lighthouse

Last synced: 23 Jan 2025

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 17 Nov 2024

https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query

E-Commerce-Data-Analysis-SQL-Big-Query

bigquery sql

Last synced: 23 Jan 2025

https://gitlab.com/solidninja/albion

A Scala BigQuery client

bigquery scala

Last synced: 05 Feb 2025

https://github.com/govau/warcraider

Convert WARC files into Avro for big data processing

avro bigquery crawler rust warc

Last synced: 21 Jan 2025

https://github.com/xennis/particulate-matter-sensor-storage

Store the particulate matter data from a luftdaten.info sensor in BigQuery

bigquery cloud-function luftdaten particulate-matter sensor-data

Last synced: 18 Nov 2024

https://github.com/lixx21/airflow-dbt-gcp

A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.

airflow bigquery data-engineer dbt gcp

Last synced: 29 Jan 2025

https://github.com/oguzgn/fully-automated-performance-marketing-dashboard

This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.

bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql

Last synced: 29 Jan 2025

https://github.com/gabrieladados/people-analytics

People Analytics: Insights for Talent Retention

bigquery figma people-analytics sql tableau

Last synced: 29 Jan 2025

https://github.com/brpy/nyc-trips

Data engineering | Zoomcamp journey on NYC trip data with the GCP stack

bigquery dbt gcp pyspark

Last synced: 22 Dec 2024

https://github.com/dobsontom/basket-abandonment

Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.

adobe-campaign bigquery ga4 gcp sql

Last synced: 21 Nov 2024

https://github.com/niteshchawla/nc-sql-business-case

A Leading Retail chain brand and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver.

bigquery retail sql supermarket

Last synced: 21 Jan 2025

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 21 Jan 2025

https://github.com/hrialan/dataform-prune

An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.

automation bigquery data-analytics dataform

Last synced: 21 Nov 2024

https://github.com/sahilmb/employee-churn-da

A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret, and Colab

bigquery looker-studio pycaret

Last synced: 21 Nov 2024

https://github.com/ket0825/v1-gcp-preview

GCP repository for the Preview service; manages GCP source code for preview services

bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub

Last synced: 21 Nov 2024

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Nov 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 21 Nov 2024

https://github.com/thanhloc81/sql-project-bicycles-practise

✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules

adventureworks bicycle bigquery google-cloud sql

Last synced: 21 Jan 2025

https://github.com/juldrixx/bigquery-avro-schema-converter

Website to convert a schema from one format to another between BigQuery and Avro

avro avro-schema bigquery bigquery-schema converter schema

Last synced: 22 Jan 2025

https://github.com/branb97/jobstreet-data-eng-project

Building a data pipeline to deliver job listing data from Jobstreet for analysis.

airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql

Last synced: 22 Jan 2025

https://github.com/moeabbas6/bq_data_loader

A Python script for executing and logging batch SQL commands in Google BigQuery. Includes tracking of execution times, unique job and statement IDs, and automated logging to a specified BigQuery table.

bigquery data python

Last synced: 29 Jan 2025

https://github.com/marceloneppel/gcs-to-bigquery

WIP: Moving data from GCS to BigQuery.

bigquery gcs scala scio

Last synced: 30 Jan 2025

https://github.com/rafal-kowalski-dev/selling-cars-analize

Hobby project for learning PySpark, Airflow, and BigQuery

airflow bigquery gcp pyspark python sqlalchemy

Last synced: 30 Jan 2025

https://github.com/seahrh/nyc-taxi-trips

REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7

bigquery nyc-taxi-dataset play-framework rest-api scala

Last synced: 03 Feb 2025

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 09 Jan 2025

https://github.com/sayed-ashfaq/target-sql

In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.

aggregation bigquery cte joins sql

Last synced: 23 Dec 2024

https://github.com/machinelearningzuu/data-engineering-projects

This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.

airflow bigquery data-engineering data-science data-visualization data-warehouse

Last synced: 04 Feb 2025

https://github.com/ivanildobarauna-dev/pypi-package-stats

Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 11 Dec 2024

https://github.com/mchmarny/stocker

Using Twitter sentiment and stock market price signal correlation to predict the next day's closing price

bigquery ml prediction regression-models

Last synced: 31 Dec 2024

https://github.com/davidkhala/dwh-migration-tools

dwh-migration-tools: contribution fork

bigquery bq gcp

Last synced: 23 Jan 2025

https://github.com/giorgishengelia/bike-share-analysis-report

Using data analytics to help develop a marketing strategy for converting casual riders into members

bigquery sql tableau

Last synced: 12 Dec 2024

https://github.com/kevin-rsj/real-estate-investments

Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.

bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization

Last synced: 17 Dec 2024

https://github.com/tupizz/fiap_pnad-covid-19

This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.

analysis bigquery pyspark python

Last synced: 17 Dec 2024

https://github.com/neo4j-field/dataflow-flex-pyarrow-to-gds

Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow

apache-arrow apache-beam bigquery dataflow neo4j python

Last synced: 23 Dec 2024

https://github.com/istinnew/cook-me-up

Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.

bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning

Last synced: 17 Dec 2024

https://github.com/yu-iskw/bigquery-lineage

Visualize BigQuery data lineage graph

bigquery data-governance data-management visualization

Last synced: 17 Dec 2024

https://github.com/yu-iskw/homebrew-bigquery-to-datastore

A homebrew tap for bigquery-to-datastore

bigquery google-datastore homebrew

Last synced: 17 Dec 2024

https://github.com/nghiant3110/e_com_1

This is a DA project based on the e-commerce dataset (Thelook_ecom) in Google BigQuery

bigquery looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/google_fiber_bi_5

This is a BI Capstone project based on the Google Fiber dataset from Google BI Course

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/andrewm4894/gcp-telemetry-example

Simple HTTP endpoint for telemetry data type events in GCP.

bigquery gcp-cloud-functions gcp-storage python terraform

Last synced: 01 Feb 2025

https://github.com/raqssoriano/hha504_assignment_nosql_dbs

This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.

bigquery mongodb-atlas mongodbcompass redis redisinsight

Last synced: 18 Dec 2024

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Dec 2024

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 25 Dec 2024

https://github.com/francois-lenne/elt-mp4-quiberon

The goal of this project is to retrieve video from the municipality of Quiberon and detect whether a person appears in it

bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data

Last synced: 25 Dec 2024

https://github.com/fakhri098/project-sql-bigquery

This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.

bigquery sql

Last synced: 17 Jan 2025

https://github.com/celiason/coffee-funnel

Webpage for visualizing sales projections of a small coffee business

bigquery prophet sales-analysis streamlit-webapp

Last synced: 26 Dec 2024

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services

bigquery database mysql-server sql

Last synced: 26 Dec 2024

https://github.com/epomatti/gcp-bigquery

Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream

bigquery cloud-sql datastream gcp

Last synced: 17 Jan 2025

https://github.com/anyesh/gbq-helpers

GBQ related helper functions and snippets.

bigquery google

Last synced: 10 Jan 2025

https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink

Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster

aiven bigquery kafka-connect sink-connector terraform terraform-modules

Last synced: 17 Jan 2025

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 03 Jan 2025

https://github.com/shikanime/seeker

Data platform based on BigQuery

bigquery dataform google-cloud

Last synced: 04 Jan 2025

https://github.com/marceloneppel/map-to-bigquery-structs

Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.

bigquery golang map struct

Last synced: 30 Jan 2025

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 11 Jan 2025

https://github.com/santiago-giordano/aws-gcp-pipeline

Simple pipeline that downloads a CSV from an AWS bucket, applies some transformations, creates tables in GCP BigQuery, loads the data, and runs queries

aws bigquery etl gcp jupyter pipeline python

Last synced: 12 Jan 2025

https://github.com/denisogr/kaggle-notebook-to-production

This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.

bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark

Last synced: 12 Jan 2025

https://github.com/justinjsd/analytics-engineering

📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset

analytics bigquery dbt engineering sql

Last synced: 12 Jan 2025

https://github.com/shvetsihorr/sql-projects

SQL and Google BigQuery portfolio projects

azuredatastudio bigquery mssql postgresql sql

Last synced: 18 Jan 2025

https://github.com/rolandbende/python-bigquery-migrations

The Python bigquery-migrations package makes it easy to create and manipulate BigQuery databases.

bigquery google migration-automation migration-scripts migration-tool migrations python

Last synced: 24 Jan 2025

https://github.com/ddzikri/analisis-data-kimia-farma

Project Based Internship Kimia Farma Rakamin Academy

bigquery dataset sql

Last synced: 24 Jan 2025
