Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, which makes it possible to read data from external sources.

📖 A highly rated canonical book on it is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read on the subject is the inside story told in the article by the founding product manager of BigQuery celebrating its 10th anniversary.

https://github.com/siriospa/gcp-helpers-bigquery

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform sirio

Last synced: 12 Oct 2024

https://github.com/anpandu/ps2bq

Stream insert GCP PubSub messages into BigQuery table.

bigquery golang pubsub

Last synced: 12 Oct 2024

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 12 Oct 2024

https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp

This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.

big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql

Last synced: 13 Oct 2024

https://github.com/nealwp/blobview

Generate BigQuery SQL views from JSON

bigquery cli json sql

Last synced: 12 Oct 2024

https://github.com/edwinrlambert/cyclistic-bike-share-analysis

This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.

act analyze ask bigquery prepare process share sql

Last synced: 12 Oct 2024

https://github.com/nghiant3110/google_analytic_4

This is a DA project based on the GA4 Sample dataset on Big Query

bigquery google-analytics looker-studio sql

Last synced: 06 Nov 2024

https://github.com/nghiant3110/e_com_1

This is a DA project based on the E-com dataset (Thelook_ecom) in Big Query from Google

bigquery looker-studio sql

Last synced: 06 Nov 2024

https://github.com/nghiant3110/b2b_crm_3

This is a DA project based on the B2B Sales CRM dataset from Maven Analytics

bigquery google-sheets looker-studio sql

Last synced: 06 Nov 2024

https://github.com/nghiant3110/google_fiber_bi_5

This is a BI Capstone project based on the Google Fiber dataset from Google BI Course

bigquery google-sheets looker-studio sql

Last synced: 06 Nov 2024

https://github.com/niteshchawla/nc-sql-business-case

A Leading Retail chain brand and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver.

bigquery retail sql supermarket

Last synced: 12 Oct 2024

https://github.com/oliveroneill/wilt-cloud-functions

Wilt Google Cloud Functions

bigquery google-cloud-functions

Last synced: 10 Nov 2024

https://github.com/smohanta23/uber_data-engineering_etl-project

This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.

big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api

Last synced: 12 Oct 2024

https://github.com/zborovskaanna/e-commerce-web-events-analysis

SQL project based on the Big Query public database 'The Look e-Commerce' and a dashboard in Looker Studio

analysis bigquery dashboard data-visualization looker-studio sql

Last synced: 13 Oct 2024

https://github.com/noospheracr/twilio-segment-configs

Integration of Twilio Segment with Google BigQuery, Looker/PowerBI, and Google VertexAI to create a data-driven marketing platform

bigquery google-cloud-platform looker-studio marketing noosphera power-bi twilio-segment vertex-ai

Last synced: 13 Oct 2024

https://github.com/zeinhasan/etl-using-airflow

Extract Transform Load Using Airflow

airflow bigquery etl

Last synced: 13 Oct 2024

https://github.com/lisabensoussan/bigdata_midterm

This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.

bigquery data-cleaning sql unix-command unix-shell

Last synced: 13 Oct 2024

https://github.com/thanhloc81/customer-segmentation

✨ Analyze customer segments of Adventure World dataset

bigquery google-cloud powerbi sql

Last synced: 13 Oct 2024

https://github.com/thecodersstudio/node-native-test-runner

Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.

bigquery mock nodejs nodejs-test nodenativetestrunner test

Last synced: 13 Oct 2024

https://github.com/ivdatahub/pypi-package-stats

Project for ingesting PyPI package data from BigQuery and sending it to DataDog for analysis and insights with dashboards, monitors, and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 13 Oct 2024

https://github.com/jasontanx/mas-international-arrivals

Code repository about international arrivals into Malaysia

bigquery data-analytics data-engineering etl-pipeline international-arrivals

Last synced: 13 Oct 2024

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets the user rate a given company product. After the rating, the bot runs sentiment analysis on the user's review and stores the result (together with the review text, user name, and product) in a BigQuery dataset

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 13 Oct 2024

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 13 Oct 2024

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 12 Oct 2024

https://github.com/mattwelke/charter-challenge-for-fair-voting-bot

Bot that web scrapes and logs in BigQuery the donations so far of the Charter Challenge for Fair Voting.

bigquery bot go openwhisk public-data

Last synced: 13 Oct 2024

https://github.com/codingsancho/fastapi-bigquery

Learning exercise, Python backend, FastAPI, bigquery, React-JS frontend.

bigquery fastapi javascript python react

Last synced: 01 Nov 2024

https://github.com/shvetsihorr/sql-projects

SQL and Google BigQuery-Portfolio Projects

azuredatastudio bigquery mssql postgresql sql

Last synced: 12 Oct 2024

https://github.com/kavyachippada/hva

Mini-Hackathon 1.0

bigquery excel pandas powerbi sql

Last synced: 13 Oct 2024

https://github.com/abdullahasghar/sql

The repo includes all projects and assessments I have completed with SQL. IDE/s used: MS SQL Server, Google Big Query.

bigquery mssqlserver sql

Last synced: 12 Oct 2024

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 12 Oct 2024

https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai

An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI

bigquery gcp tensorflow tfx vertex-ai

Last synced: 13 Nov 2024

https://github.com/yoshiyukikato/nightharbor-bigquery-reporter

A nightharbor reporter for GCP BigQuery

bigquery lighthouse

Last synced: 13 Oct 2024

https://github.com/rohitsanj/superset-dbt-demo

This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.

apache-superset bigquery dbt superset

Last synced: 13 Oct 2024

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 12 Oct 2024

https://github.com/garbetjie/phpunit-bigquery-schema

A BigQuery schema validator constraint for BigQuery

bigquery phpunit

Last synced: 14 Oct 2024

https://github.com/ymyzk/bq-globalip

Record the current global IPv4 address to a BigQuery table.

bigquery golang

Last synced: 14 Oct 2024

https://github.com/davidkhala/dwh-migration-tools

dwh-migration-tools: contribution fork

bigquery bq gcp

Last synced: 29 Sep 2024

https://github.com/davelester/gharchive-bigquery-examples

Examples Using BigQuery to Analyze GH Archive Data

bigquery gharchive

Last synced: 15 Oct 2024

https://github.com/vaibhavs10/ml-on-gcp

The repository walks through a Data Scientist focused way of building and deploying Machine Learning models on Google Cloud

aiplatform bigquery googlecloudplatform ml

Last synced: 25 Oct 2024

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 12 Oct 2024

https://github.com/patriciavalentine/loan-data-queries

In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.

asset-finance bigquery loan-data looker-studio

Last synced: 21 Oct 2024

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged Big Query and MySQL to analyze 100K records for sales optimization, trend identification, and enhancing customer satisfaction for a retail brand in South America and to provide insights and recommendations to improve their userbase and improve their services

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 21 Oct 2024

https://github.com/janmin123/cyclistic

Capstone project for Google/Coursera Data Analytics Course

analysis bigquery sql tableau visualization

Last synced: 21 Oct 2024

https://github.com/mikeghen/metadata

Pulls data from Socrata open data portals

bigquery python socrata

Last synced: 07 Nov 2024

https://github.com/greatwoman23/car_insurance_analysis

The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.

bigquery data data-science dataanalytics insurance-claims looker-studio tableau

Last synced: 12 Oct 2024

https://github.com/antbit96/dataform_poc

Template for basic data preparation

bigquery bigquery-dataform data-preparation

Last synced: 26 Oct 2024

https://github.com/squidmin/bigquery-labs

GCP BigQuery CLI

bigquery gcp java

Last synced: 27 Oct 2024

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 12 Oct 2024

https://github.com/kartikeya443/automated-data-pipeline-gcp

This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.

bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql

Last synced: 12 Oct 2024

https://github.com/moeabbas6/dbt_analytics_engine

An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery, and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.

analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing

Last synced: 12 Oct 2024

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 12 Oct 2024

https://github.com/kevin-rsj/real-estate-investments

Scoring system that ranks French cities for second-home investment according to risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.

bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization

Last synced: 29 Oct 2024

https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow

This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.

airflow bigquery dataengineering etl gcp googlecloudplatform

Last synced: 12 Oct 2024

https://github.com/nghiant3110/firebase_6

This is a DA project based on the Firebase Sample dataset on Big Query

bigquery firebase looker-studio sql

Last synced: 12 Oct 2024

https://github.com/cyber-programmer/web-traffic-analytics-ml-model

This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, exploratory data analysis.

bigquery logistic-regression machine-learning

Last synced: 12 Oct 2024

https://github.com/mutaharshaik/airflow_retail_project

Airflow retail project using pipeline with BigQuery, dbt, Soda

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 13 Oct 2024

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 12 Oct 2024

https://github.com/yu-iskw/bigquery-lineage

Visualize BigQuery data lineage graph

bigquery data-governance data-management visualization

Last synced: 30 Oct 2024

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was done as a part of Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 12 Oct 2024

https://github.com/aazuspan/landsat-bigquery

Summarizing 51 years of Landsat data using Earth Engine and BigQuery

bigquery google-earth-engine landsat

Last synced: 12 Oct 2024

https://github.com/mehmoodulhaq570/bigquery_machine_learning_project

This project develops a machine learning model to predict incident groups based on data from the London Fire Brigade service calls. Using Python and the Google Colab environment, the model utilizes a Gradient Boosting Classifier to categorize incidents, improving resource allocation and incident response within the London Fire Brigade.

bigquery bigquery-dataset cloud colabs database database-project google-colab ipnyb jupyter-notebook machine-learning prediction-algorithm prediction-model python

Last synced: 05 Nov 2024

https://github.com/anyesh/gbq-helpers

GBQ related helper functions and snippets.

bigquery google

Last synced: 12 Nov 2024

https://github.com/brpy/nyc-trips

Data engineering | Zoomcamp journey on nyc trip data with gcp stack

bigquery dbt gcp pyspark

Last synced: 05 Nov 2024

https://github.com/alexgenovese/machine-learning-bigquery-gcp

These SQL are based on available ecommerce dataset that has millions of Google Analytics records for the Google Merchandise Store loaded into BigQuery.

bigquery google google-cloud-platform purchase sql visitors

Last synced: 07 Nov 2024

https://github.com/scraly/flume-bigquery-sink

An Apache Flume Sink implementation to publish data to Google BigQuery

bigquery flume sink

Last synced: 06 Nov 2024

https://github.com/scraly/bigquery

Google BigQuery AaaS tools, tips and fun

bigquery java

Last synced: 06 Nov 2024

https://github.com/acardosolima/crypto-ethereum-tokens

This project aims to create a data pipeline using Airflow to ingest dataset from Google Bigquery to a PostgreSQL database. This stack will run in a local environment using Kubernetes.

airflow bigquery postgresql python

Last synced: 13 Oct 2024

https://github.com/spacepatcher/google-workspace-gmail-collector

👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka

bigquery gmail google-workspace security soc

Last synced: 23 Oct 2024

https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged Big Query and MySQL to analyze 100K records for sales optimization, trend identification, and enhancing customer satisfaction for a retail brand in South America and to provide insights and recommendations to improve their userbase and improve their services

bigquery database mysql-server sql

Last synced: 07 Nov 2024

https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage

Exporting data from Google Big Query via Google Storage to Yandex Cloud Object Storage.

bigquery object-storage python3 yandex-cloud yandexcloud

Last synced: 07 Nov 2024

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 07 Nov 2024

https://github.com/coatless/bigquery-reddit-ask-your-advisor

Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")

ask-your-advisor bigquery r reddit

Last synced: 15 Nov 2024

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 12 Oct 2024

https://github.com/airbytehq/terraform-airbyte-bigquery-destination

Terraform Module for Setting Up BigQuery Destination with Airbyte

airbyte bigquery elt etl terraform

Last synced: 10 Nov 2024

https://github.com/edumoraes1/spam_count_sfmc

SQL query counting email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 08 Nov 2024

https://github.com/fsistemas/bigquery-td

ETL to extract data from mysql load and merge in BigQuery

bigquery etl mysql python sql

Last synced: 09 Nov 2024

https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis

This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.

bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis

Last synced: 09 Nov 2024

https://github.com/francois-lenne/play-bq-gcp

Data pipeline in order to retrieve data from the playstation API to BigQuery

bigquery cicd data-engineering google-cloud python

Last synced: 14 Nov 2024

https://github.com/ackeecz/terraform-gcp-cloud-run_pubsub_to_bq

Cloud Run subscribes to a given topic and inserts each message into a BigQuery table.

bigquery gcp pubsub terraform

Last synced: 10 Nov 2024

https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq

Cloud Function subscribes to a given topic and inserts each message into a BigQuery table.

bigquery cloud-functions pubsub terraform-module

Last synced: 10 Nov 2024

https://github.com/angulartist/scio-demo

Playing w/ Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women

This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.

bigquery excel powerbi spreadsheet sql

Last synced: 23 Oct 2024

https://github.com/kellyjadams/ap-exam-scores

Analyzing AP exam scores for a school.

bigquery sql

Last synced: 10 Nov 2024

https://github.com/flowerinthenight/bqstream

A simple library to help facilitate streaming to BigQuery.

bigquery go golang streaming

Last synced: 10 Nov 2024

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 12 Nov 2024

https://github.com/yaph/queries

Collection of Data Queries in SPARQL and SQL

bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata

Last synced: 10 Nov 2024

https://github.com/thunchanokbow/inventory-amazon

Inventory value is also important for determining a company's liquidity, or its ability to meet its short-term financial obligations. A high inventory value can indicate that a company has too much money tied up in inventory, which could make it difficult for the company to pay its bills.

azure bigquery cloudcomposer clouddatabase cloudstorage compute-engine dataproc postgresql powerbi pyspark-sql python3

Last synced: 11 Nov 2024

https://github.com/quipper/send-ci-result-to-bigquery-action

Send test results to BigQuery in GitHub Actions

bigquery github-actions google-bigquery junit-xml

Last synced: 11 Nov 2024

https://github.com/toskpl/googlecloud

Challenge 30 days - GoogleCloud

bigquery google-cloud google-cloud-platform ml

Last synced: 14 Nov 2024

https://github.com/yasarsultan/taxi-trip-analysis

The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.

airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark

Last synced: 12 Nov 2024
