Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data directly from external sources.

📖 A highly rated canonical book on the subject is “Google BigQuery: The Definitive Guide”, a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating its 10th anniversary.

https://github.com/nghiant3110/e_com_1

This is a DA project based on the E-com dataset (Thelook_ecom) in Google BigQuery

bigquery looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/google_fiber_bi_5

This is a BI Capstone project based on the Google Fiber dataset from Google BI Course

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 24 Dec 2024

https://github.com/raqssoriano/hha504_assignment_nosql_dbs

This task is part of my assignment focused on creating and configuring databases on different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.

bigquery mongodb-atlas mongodbcompass redis redisinsight

Last synced: 18 Dec 2024

https://github.com/allanreda/video-processing-and-categorization

Video processing and categorization using computer vision, machine learning and cloud computing

bigquery cloud-storage-bucket cnn computer-vision google-cloud kmeans-clustering machine-learning opencv2 tensorflow virtual-machine

Last synced: 28 Dec 2024

https://github.com/adadalshabab/data-engineering-gcp-project

An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.

bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi

Last synced: 19 Dec 2024

https://github.com/itsubaki/hermes-lambda

Transfers AWS cost data to BigQuery

aws bigquery

Last synced: 14 Dec 2024

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 25 Dec 2024

https://github.com/francois-lenne/elt-mp4-quiberon

The goal of this project is to retrieve video from the municipality of Quiberon and detect whether or not a person appears in it

bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data

Last synced: 25 Dec 2024

https://github.com/fakhri098/project-sql-bigquery

This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.

bigquery sql

Last synced: 17 Jan 2025

https://github.com/celiason/coffee-funnel

webpage for visualizing sales projections of a small coffee business

bigquery prophet sales-analysis streamlit-webapp

Last synced: 26 Dec 2024

https://github.com/manesioz/airflow-without-code

Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"

airflow airflow-plugin bigquery python sql

Last synced: 05 Jan 2025

https://github.com/mikeghen/metadata

Pulls data from Socrata open data portals

bigquery python socrata

Last synced: 27 Dec 2024

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and improved customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services

bigquery database mysql-server sql

Last synced: 26 Dec 2024

https://github.com/epomatti/gcp-bigquery

Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream

bigquery cloud-sql datastream gcp

Last synced: 17 Jan 2025

https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform

Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.

analytics bigquery gcp sql

Last synced: 05 Jan 2025

https://github.com/khanovico/energy-data-analysis

A cloud-based model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I built a Docker image so the app can run in cross-platform environments.

big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost

Last synced: 10 Oct 2024

https://github.com/simhayn/genomics-cannabis-bigquery

BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment

big-data bigquery bioinformatics exploratory-data-analysis genomics python sql

Last synced: 22 Jan 2025

https://github.com/anyesh/gbq-helpers

GBQ related helper functions and snippets.

bigquery google

Last synced: 10 Jan 2025

https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer

Mackerel plugin that posts BigQuery query results

bigquery mackerel-plugin

Last synced: 12 Oct 2024

https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink

Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster

aiven bigquery kafka-connect sink-connector terraform terraform-modules

Last synced: 17 Jan 2025

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 03 Jan 2025

https://github.com/shikanime/seeker

Data platform based on BigQuery

bigquery dataform google-cloud

Last synced: 04 Jan 2025

https://github.com/marceloneppel/map-to-bigquery-structs

Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.

bigquery golang map struct

Last synced: 30 Jan 2025

https://github.com/smohanta23/uber_data-engineering_etl-project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api

Last synced: 21 Jan 2025

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 11 Jan 2025

https://github.com/kavyachippada/hva

Mini-Hackathon 1.0

bigquery excel pandas powerbi sql

Last synced: 13 Oct 2024

https://github.com/santiago-giordano/aws-gcp-pipeline

A simple pipeline that downloads a CSV from an AWS bucket, applies some transformations, creates tables in GCP BigQuery, loads the data, and runs queries

aws bigquery etl gcp jupyter pipeline python

Last synced: 12 Jan 2025

https://github.com/denisogr/kaggle-notebook-to-production

This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.

bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark

Last synced: 12 Jan 2025

https://github.com/spacepatcher/google-workspace-gmail-collector

👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka

bigquery gmail google-workspace security soc

Last synced: 23 Oct 2024

https://github.com/justinjsd/analytics-engineering

📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset

analytics bigquery dbt engineering sql

Last synced: 12 Jan 2025

https://github.com/shvetsihorr/sql-projects

SQL and Google BigQuery-Portfolio Projects

azuredatastudio bigquery mssql postgresql sql

Last synced: 18 Jan 2025

https://github.com/alessio-siciliano/bigquery-advanced-utils

A utility library that enhances the official BigQuery Python client with additional tools for query management, data processing, and automation, making it easier to work efficiently with Google BigQuery.

bigquery datatransfer google-cloud python

Last synced: 05 Dec 2024

https://github.com/rolandbende/python-bigquery-migrations

The Python bigquery-migrations package makes it easy to create and manipulate BigQuery databases.

bigquery google migration-automation migration-scripts migration-tool migrations python

Last synced: 24 Jan 2025

https://github.com/ddzikri/analisis-data-kimia-farma

Project Based Internship Kimia Farma Rakamin Academy

bigquery dataset sql

Last synced: 24 Jan 2025

https://github.com/stoqey/rasputia

Rasputia Latimore - The Big Data Bitch 💋

bigquery

Last synced: 19 Jan 2025

https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml

Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.

airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform

Last synced: 26 Jan 2025

https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game

This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.

ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql

Last synced: 26 Jan 2025

https://github.com/alessio-siciliano/google-cloud-python-class-wrapper

An example of several classes written in Python to interact with GCP

bigquery datatransfer gcp google-cloud

Last synced: 26 Jan 2025

https://github.com/sejalmankar1012/product_data_analyst_assessement

Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub

assessment-project bigquery loop product-analyst sql-query

Last synced: 26 Jan 2025

https://github.com/knands42/data-ingestion

Data Ingestion project to evaluate my Kotlin skill using concurrency

bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow

Last synced: 25 Jan 2025

https://github.com/vaibhavs10/ml-on-gcp

The repository walks through a data-scientist-focused way of building and deploying machine learning models on Google Cloud

aiplatform bigquery googlecloudplatform ml

Last synced: 19 Dec 2024

https://github.com/kartikeya443/automated-data-pipeline-gcp

This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.

bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql

Last synced: 21 Jan 2025

https://github.com/tomgorb/some-data-monitoring

A fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing

airflow2 bigquery gcp minikube

Last synced: 21 Jan 2025

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was done as part of the Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 21 Jan 2025

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 21 Jan 2025

https://github.com/pratshrestha/cochin-traders---sql--sales-analysis

Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include

bigquery customer-engagement employee-performance inventory-management sales-trends sql

Last synced: 21 Jan 2025

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 21 Jan 2025

https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate

This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.

bigquery data-science data-visualization spreadsheet sql tableau

Last synced: 21 Jan 2025

https://github.com/anpandu/ps2bq

Stream insert GCP PubSub messages into BigQuery table.

bigquery golang pubsub

Last synced: 21 Jan 2025

https://github.com/nghiant3110/firebase_6

This is a DA project based on the Firebase Sample dataset on BigQuery

bigquery firebase looker-studio sql

Last synced: 21 Jan 2025

https://github.com/lawal-hash/olistelt

An end-to-end ELT data pipeline of the Brazilian olist e-commerce dataset using the modern data stack

airflow bigquery dbt dbt-core docker postgresql sql

Last synced: 21 Jan 2025

https://github.com/plishka/blockchain_analysis

Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)

bigquery blockchain data-cleaning scraping-websites sql tableau

Last synced: 21 Jan 2025

https://github.com/cyber-programmer/web-traffic-analytics-ml-model

This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, and exploratory data analysis.

bigquery logistic-regression machine-learning

Last synced: 21 Jan 2025

https://github.com/kmohamedalie/bigquery-intro

Coursera BigQuery Introduction using Covid19 dataset

bigquery coursera covid-19 datavisualization looker-studio sql

Last synced: 21 Jan 2025

https://github.com/rubnsbarbosa/nasa-asteroids-extractor

ETL asteroids data extractor using some Google Cloud services

bigquery bucket cloud-storage google-cloud nasa-api-neows

Last synced: 21 Jan 2025

https://github.com/greatwoman23/car_insurance_analysis

The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.

bigquery data data-science dataanalytics insurance-claims looker-studio tableau

Last synced: 21 Jan 2025

https://github.com/marcopellegrinoit/web-traffic-time-series-predictions

Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Additional modeling available with ARIMA, LSTM, and Facebook Prophet.

arima bigquery gcp lstm prophet-model time-series vertex-ai

Last synced: 21 Jan 2025

https://github.com/denny-b-justin/purdue

The internship broadly aimed to understand whether topics/events are covered differently in different countries and how that affects stock market returns. The provided dataset is a post-processed set of news articles, so it already reflects topic modelling and sentiment analysis.

big-data bigquery finance gdelt-events python

Last synced: 21 Jan 2025

https://github.com/nlgtuankiet/bq-noti

BigQuery notification

bigquery bq notification notifier

Last synced: 21 Jan 2025

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 21 Jan 2025

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 21 Jan 2025

https://github.com/vidyadnina/other-sql-projects-and-queries

Other SQL projects and queries.

bigquery mysql sql

Last synced: 21 Jan 2025

https://github.com/karencofre/marketing-segmentacion-en-powerbi

Hypothesis testing project in Power BI and Python

bigquery google-colab powerbi python sql statsmodels

Last synced: 21 Jan 2025

https://github.com/lisabensoussan/bigdataminig_finalassignment

This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.

bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization

Last synced: 21 Jan 2025

https://github.com/edwinrlambert/cyclistic-bike-share-analysis

This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.

act analyze ask bigquery prepare process share sql

Last synced: 21 Jan 2025

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 21 Jan 2025

https://github.com/moeabbas6/dbt_analytics_engine

An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.

analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing

Last synced: 21 Jan 2025

https://github.com/aazuspan/landsat-bigquery

Summarizing 51 years of Landsat data using Earth Engine and BigQuery

bigquery google-earth-engine landsat

Last synced: 21 Jan 2025

https://github.com/vedantwalia/google-data-analytics-capstone-case-study

This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone

bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public

Last synced: 21 Jan 2025

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 21 Jan 2025

https://github.com/angulartist/scio-demo

Playing w/ Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/hayashi-yudai/cloudfunc_login

Example of authentication function for login with Cloud Functions and BigQuery

bigquery gcp-cloud-functions golang server

Last synced: 15 Jan 2025

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 11 Jan 2025

https://github.com/phukon/package-insights

PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.

big-data bigquery dbt duckdb

Last synced: 27 Jan 2025

https://github.com/sintef/bigquery-postgresql-wire-proxy

A PostgreSQL wire protocol proxy server for BigQuery.

bigquery postgresql proxy

Last synced: 12 Jan 2025

https://github.com/alessio-siciliano/bigquery-utils

A utility library that enhances the official BigQuery Python client with additional tools for query management, data processing, and automation, making it easier to work efficiently with Google BigQuery.

bigquery datatransfer google-cloud python

Last synced: 28 Jan 2025

https://github.com/airbytehq/terraform-airbyte-bigquery-destination

Terraform Module for Setting Up BigQuery Destination with Airbyte

airbyte bigquery elt etl terraform

Last synced: 07 Jan 2025

https://github.com/crudek-data/bigquery-kaggle-apis

Kaggle API to download free datasets, along with the Google BigQuery API to read from and write to a cloud data warehouse

bigquery data-engineering kaggle

Last synced: 22 Jan 2025

https://github.com/nikhilsree5/targetcasestudy

An exploratory and in-depth study of the e-commerce market in Brazil.

bigquery eda sql visualization

Last synced: 22 Jan 2025

https://github.com/coatless/bigquery-reddit-ask-your-advisor

Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")

ask-your-advisor bigquery r reddit

Last synced: 16 Jan 2025

https://github.com/flowerinthenight/bqstream

A simple library to help facilitate streaming to BigQuery.

bigquery go golang streaming

Last synced: 08 Jan 2025

https://github.com/kellyjadams/ap-exam-scores

Analyzing AP exam scores for a school.

bigquery sql

Last synced: 08 Jan 2025
