Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google's documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization's biggest questions with zero infrastructure management. BigQuery's scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so data can be read directly from external sources.
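As a rough illustration of the client-library workflow described above, here is a minimal Python sketch. It assumes the `google-cloud-bigquery` package is installed and that default credentials are configured. The helper names `make_client` and `run_query` are hypothetical, and the public table in the query is only an example to swap for your own.

```python
from typing import Any, Dict, List

# Example query against a BigQuery public dataset (an assumption for
# illustration; replace the table with one from your own project).
SQL = """
SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE state = 'TX'
GROUP BY name
ORDER BY total DESC
LIMIT 5
"""


def make_client():
    """Create a BigQuery client (needs google-cloud-bigquery and credentials)."""
    # Imported lazily so run_query below has no hard dependency on the package.
    from google.cloud import bigquery
    return bigquery.Client()


def run_query(client: Any, sql: str) -> List[Dict[str, Any]]:
    """Run `sql` and return the result rows as plain dicts.

    `client` is any object exposing query(sql).result(), as bigquery.Client does.
    """
    rows = client.query(sql).result()  # blocks until the query job finishes
    return [dict(row.items()) for row in rows]
```

With credentials in place, `for row in run_query(make_client(), SQL): print(row["name"], row["total"])` would print the five most common names. Passing the client in as a parameter keeps the helper easy to exercise with a stub object.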

📖 A highly rated, canonical book on the subject is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another worthwhile read is the inside story told by BigQuery's founding product manager in an article celebrating its 10th anniversary.

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 21 Jan 2025

https://github.com/pratshrestha/cochin-traders---sql--sales-analysis

Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include

bigquery customer-engagement employee-performance inventory-management sales-trends sql

Last synced: 21 Jan 2025

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery.

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 21 Jan 2025

https://github.com/francois-lenne/elt-mp4-quiberon

The goal of this project is to retrieve video from the municipality of Quiberon and detect whether or not a person appears in it.

bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data

Last synced: 17 Feb 2025

https://github.com/humairarizwan/uber-ride-dataengineering-analysis

This project creates a pipeline to process data and performs data analytics on Uber data.

bigquery dataanalysis dataengineering gcp-project googlestorage looker-studio

Last synced: 21 Jan 2025

https://github.com/kevin-rsj/real-estate-investments

Scoring system that ranks French cities for second-home investment according to risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.

bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization

Last synced: 09 Feb 2025

https://github.com/alessio-siciliano/bigquery-advanced-utils

BigQuery-advanced-utils is a lightweight utility library that extends the official Google BigQuery Python client. It simplifies tasks like query management, data processing, and automation. Aimed at developers and data scientists, the project is open to contributions to improve and enhance its functionality.

bigquery datatransfer google-cloud python

Last synced: 01 Feb 2025

https://github.com/sameer6690/data_analytics_bootcamp_hdnb

This is an analytics project on the "Titanic - Machine Learning From Disaster" dataset's train.csv file. I performed data cleaning with MS Excel before using SQL to query results based on the questions provided for the completion of the project. Finally I visualized the data on Google Looker Studio.

bigquery excel looker-studio sql

Last synced: 20 Feb 2025

https://github.com/kathisnehith/nyc311-requests-etl-pipeline

An end-to-end ETL pipeline that processes NYC 311 service requests retrieved through an API for analysis.

airflow-dags api bigquery data-engineering data-pipeline elt erdiagram gcp normalization-data spark-sql tableau-desktop

Last synced: 20 Feb 2025

https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq

A Cloud Function that subscribes to a given Pub/Sub topic and inserts each message into a BigQuery table.

bigquery cloud-functions pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was completed as part of the Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 21 Jan 2025

https://github.com/tomgorb/some-data-monitoring

A fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing.

airflow2 bigquery gcp minikube

Last synced: 21 Jan 2025

https://github.com/kartikeya443/automated-data-pipeline-gcp

This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.

bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql

Last synced: 21 Jan 2025

https://github.com/vidyadnina/other-sql-projects-and-queries

Other SQL projects and queries.

bigquery mysql sql

Last synced: 21 Jan 2025

https://github.com/karencofre/marketing-segmentacion-en-powerbi

Hypothesis-testing project in Power BI and Python.

bigquery google-colab powerbi python sql statsmodels

Last synced: 21 Jan 2025

https://github.com/lisabensoussan/bigdataminig_finalassignment

This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.

bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization

Last synced: 21 Jan 2025

https://github.com/edumoraes1/spam_count_sfmc

SQL query that counts email sends and spam over the last 365 days.

bigquery marketing-cloud salesforce sql

Last synced: 20 Feb 2025

https://github.com/yu-iskw/bigquery-lineage

Visualize BigQuery data lineage graph

bigquery data-governance data-management visualization

Last synced: 10 Feb 2025

https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform

Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.

analytics bigquery gcp sql

Last synced: 05 Jan 2025

https://github.com/andrewm4894/gcp-telemetry-example

Simple HTTP endpoint for telemetry data type events in GCP.

bigquery gcp-cloud-functions gcp-storage python terraform

Last synced: 01 Feb 2025

https://github.com/edwinrlambert/cyclistic-bike-share-analysis

This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.

act analyze ask bigquery prepare process share sql

Last synced: 21 Jan 2025

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 21 Jan 2025

https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer

Mackerel plugin that posts BigQuery query results.

bigquery mackerel-plugin

Last synced: 16 Feb 2025

https://github.com/khanovico/energy-data-analysis

A cloud-based model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I built a Docker image so the app can run in cross-platform environments.

big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost

Last synced: 09 Feb 2025

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 16 Feb 2025

https://github.com/moeabbas6/dbt_analytics_engine

An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.

analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing

Last synced: 21 Jan 2025

https://github.com/mikeghen/metadata

Pulls data from Socrata open data portals

bigquery python socrata

Last synced: 18 Feb 2025

https://github.com/fakhri098/project-sql-bigquery

This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.

bigquery sql

Last synced: 17 Jan 2025

https://github.com/simhayn/genomics-cannabis-bigquery

BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment

big-data bigquery bioinformatics exploratory-data-analysis genomics python sql

Last synced: 22 Jan 2025

https://github.com/aazuspan/landsat-bigquery

Summarizing 51 years of Landsat data using Earth Engine and BigQuery

bigquery google-earth-engine landsat

Last synced: 21 Jan 2025

https://github.com/smohanta23/uber_data-engineering_etl-project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api

Last synced: 21 Jan 2025

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Used BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.

bigquery database mysql-server sql

Last synced: 17 Feb 2025

https://github.com/vedantwalia/google-data-analytics-capstone-case-study

This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone

bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public

Last synced: 21 Jan 2025

https://github.com/epomatti/gcp-bigquery

Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream

bigquery cloud-sql datastream gcp

Last synced: 17 Jan 2025

https://github.com/davelester/gharchive-bigquery-examples

Examples Using BigQuery to Analyze GH Archive Data

bigquery gharchive

Last synced: 01 Feb 2025

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 11 Feb 2025

https://github.com/anyesh/gbq-helpers

GBQ related helper functions and snippets.

bigquery google

Last synced: 10 Jan 2025

https://github.com/jasontanx/ridership-headline-project

An end-to-end data engineering and data analytics project on Malaysian public transport ridership data.

bigquery data-engineering minio-server public-transport-ridership terraform

Last synced: 01 Feb 2025

https://github.com/spacepatcher/google-workspace-gmail-collector

👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka

bigquery gmail google-workspace security soc

Last synced: 23 Oct 2024

https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink

Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster

aiven bigquery kafka-connect sink-connector terraform terraform-modules

Last synced: 17 Jan 2025

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 03 Jan 2025

https://github.com/shikanime/seeker

Data platform based on BigQuery

bigquery dataform google-cloud

Last synced: 04 Jan 2025

https://github.com/marceloneppel/map-to-bigquery-structs

Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.

bigquery golang map struct

Last synced: 30 Jan 2025

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 21 Jan 2025

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 11 Jan 2025

https://github.com/sayed-ashfaq/target-sql

In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.

aggregation bigquery cte joins sql

Last synced: 15 Feb 2025

https://github.com/phukon/package-insights

PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.

big-data bigquery dbt duckdb

Last synced: 27 Jan 2025

https://github.com/santiago-giordano/aws-gcp-pipeline

A simple pipeline that downloads a CSV from an AWS bucket, applies some transformations, creates tables in GCP BigQuery, loads the data, and runs queries.

aws bigquery etl gcp jupyter pipeline python

Last synced: 12 Jan 2025

https://github.com/denisogr/kaggle-notebook-to-production

This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.

bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark

Last synced: 12 Jan 2025

https://github.com/justinjsd/analytics-engineering

📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset

analytics bigquery dbt engineering sql

Last synced: 12 Jan 2025

https://github.com/richardbnk/data_tools

Python Library to Accelerate Creation of Data ETL Processes on multiple database systems.

bigquery etl gcp sql

Last synced: 02 Feb 2025

https://github.com/itsubaki/hermes-lambda

Transfers AWS cost data to BigQuery

aws bigquery

Last synced: 07 Feb 2025

https://github.com/hitthecodelabs/bigquery_ml

Jupyter notebooks that utilize Google BigQuery's machine learning capabilities.

bigquery notebooks python sql

Last synced: 04 Feb 2025

https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game

This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.

ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql

Last synced: 26 Jan 2025

https://github.com/xxmadkillerx10/data-engineering-zoomcamp

The Data Engineering Zoomcamp covers essential skills in containerization, workflow orchestration, data warehousing, analytics engineering, batch, and streaming processing. It includes tools like Docker, Terraform, BigQuery, dbt, Spark, Kafka, Kestra, Postgres, Google Data Studio, and Metabase.

airflow bigquery data-visualization dbt dbt-clickhouse docker-compose etl gcs google-cloud kafka postgresql spark sql streaming

Last synced: 03 Feb 2025

https://github.com/oleksiilatypov/google_cloud

AI & Data, Google Cloud Skills Boost

bigquery document-ai ml vertexai

Last synced: 18 Jan 2025

https://github.com/lambdamusic/dimschema

CLI to retrieve SQL schema information about the Dimensions on Google BigQuery dataset.

bigquery dimensions python scholarly-metadata

Last synced: 12 Jan 2025

https://github.com/karencofre/riesgorelativo-lookerstudio

Data analysis and predictive analysis project in Looker Studio and Google Colab.

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 22 Jan 2025

https://github.com/zborovskaanna/e-commerce-web-events-analysis

SQL project based on the BigQuery public database 'The Look e-Commerce' and a dashboard in Looker Studio

analysis bigquery dashboard data-visualization looker-studio sql

Last synced: 22 Jan 2025

https://github.com/azapeti/bigquery-python-bash-automation

Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.

analytics automation bash bigquery cronjob crontab ga4 python python3

Last synced: 22 Jan 2025

https://github.com/robinnoiret/importcsv_zendeskbigquery

This project involves developing a Python script to import a CSV export from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.

bigquery connector export-csvfile json zendesk

Last synced: 22 Jan 2025

https://github.com/acardosolima/crypto-ethereum-tokens

This project aims to create a data pipeline using Airflow to ingest a dataset from Google BigQuery into a PostgreSQL database. The stack runs in a local environment using Kubernetes.

airflow bigquery postgresql python

Last synced: 22 Jan 2025

https://github.com/fahmiaziz98/sql_agent

Build a SQL agent using different patterns: RAG, self-correction, and optimization.

agent bigquery langchain sql sql-agent sqlite toolkit

Last synced: 22 Jan 2025

https://github.com/lisabensoussan/bigdata_midterm

This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.

bigquery data-cleaning sql unix-command unix-shell

Last synced: 22 Jan 2025

https://github.com/thecodersstudio/node-native-test-runner

Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.

bigquery mock nodejs nodejs-test nodenativetestrunner test

Last synced: 22 Jan 2025

https://github.com/noospheracr/twilio-segment-configs

Integration of Twilio Segment with Google BigQuery, Looker/PowerBI, and Google VertexAI to create a data-driven marketing platform

bigquery google-cloud-platform looker-studio marketing noosphera power-bi twilio-segment vertex-ai

Last synced: 22 Jan 2025

https://github.com/thanhloc81/customer-segmentation

✨ Analyze customer segments of Adventure World dataset

bigquery google-cloud powerbi sql

Last synced: 22 Jan 2025

https://github.com/jasontanx/mas-international-arrivals

Code repository about international arrivals into Malaysia

bigquery data-analytics data-engineering etl-pipeline international-arrivals

Last synced: 22 Jan 2025

https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp

This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.

big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql

Last synced: 22 Jan 2025

https://github.com/zeinhasan/etl-using-airflow

Extract Transform Load Using Airflow

airflow bigquery etl

Last synced: 22 Jan 2025

https://github.com/shvetsihorr/sql-projects

SQL and Google BigQuery-Portfolio Projects

azuredatastudio bigquery mssql postgresql sql

Last synced: 18 Jan 2025

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 22 Jan 2025

https://github.com/chiamakaukwuoma/portfolio

This repository contains various projects I've been privileged to work on outside of work.

aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau

Last synced: 03 Feb 2025

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets users review a given company product. After the review, the bot runs sentiment analysis on the user's review and stores the result (along with the review text, user name, and product) in a BigQuery dataset.

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 22 Jan 2025

https://github.com/mutaharshaik/airflow_retail_project

Airflow retail project using pipeline with BigQuery, dbt, Soda

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 22 Jan 2025

https://github.com/victorcezeh/end-to-end-elt-pipeline

An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.

airflow bigquery dbt-core docker docker-compose postgresql python

Last synced: 22 Jan 2025

https://github.com/pittica/google-bigquery-helpers

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform pittica

Last synced: 13 Nov 2024

https://github.com/valenthr/purchase_funnel

Google merch store sales analysis

bigquery product-analysis sql

Last synced: 27 Jan 2025

https://github.com/mattwelke/charter-challenge-for-fair-voting-bot

Bot that scrapes the web for the donations made so far to the Charter Challenge for Fair Voting and logs them in BigQuery.

bigquery bot go openwhisk public-data

Last synced: 22 Jan 2025

https://github.com/topefolorunso/musicaly-project

An end-to-end data pipeline that ingests simulated music stream data, structures, cleans and models the raw data, and visualizes clean data.

airflow bigquery data-pipeline dbt google-cloud-platform kafka python spark-streaming

Last synced: 17 Feb 2025

https://github.com/ngangawairimu/sales-analysis-and-customer-insights

This project features SQL queries for detailed customer and sales analysis: Customer Analysis and Sales Reporting

bigquery bigquery-dataset excel sql

Last synced: 28 Jan 2025

https://github.com/makism/bq-ethereum

bq-ethereum

bigquery ethereum sql

Last synced: 20 Jan 2025

https://github.com/nszoni/dbtgen

dbt: write nothing, generate (almost) everything.

analytics bigquery dbt documentation generative-ai github tooling

Last synced: 31 Jan 2025

https://github.com/garbetjie/monolog-bigquery-handler

A simple Monolog handler for writing to BigQuery.

bigquery logging monolog monolog-handler

Last synced: 16 Jan 2025
