Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.

📖 A highly rated reference book on the subject is "Google BigQuery: The Definitive Guide". Another worthwhile read is the article by BigQuery's founding product manager, which tells the inside story of the product on its 10th anniversary.

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was done as part of the Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 21 Jan 2025

https://github.com/tomgorb/some-data-monitoring

fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing

airflow2 bigquery gcp minikube

Last synced: 21 Jan 2025

https://github.com/kartikeya443/automated-data-pipeline-gcp

This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.

bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql

Last synced: 21 Jan 2025

https://github.com/vidyadnina/other-sql-projects-and-queries

Other SQL projects and queries.

bigquery mysql sql

Last synced: 21 Jan 2025

https://github.com/karencofre/marketing-segmentacion-en-powerbi

Hypothesis-testing project in Power BI and Python

bigquery google-colab powerbi python sql statsmodels

Last synced: 21 Jan 2025

https://github.com/lisabensoussan/bigdataminig_finalassignment

This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.

bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization

Last synced: 21 Jan 2025

https://github.com/edwinrlambert/cyclistic-bike-share-analysis

This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.

act analyze ask bigquery prepare process share sql

Last synced: 21 Jan 2025

https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance

Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.

bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse

Last synced: 21 Jan 2025

https://github.com/moeabbas6/dbt_analytics_engine

An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.

analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing

Last synced: 21 Jan 2025

https://github.com/aazuspan/landsat-bigquery

Summarizing 51 years of Landsat data using Earth Engine and BigQuery

bigquery google-earth-engine landsat

Last synced: 21 Jan 2025

https://github.com/angulartist/scio-demo

Playing w/ Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/vedantwalia/google-data-analytics-capstone-case-study

This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone

bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public

Last synced: 21 Jan 2025

https://github.com/hayashi-yudai/cloudfunc_login

Example of authentication function for login with Cloud Functions and BigQuery

bigquery gcp-cloud-functions golang server

Last synced: 15 Jan 2025

https://github.com/aisurjyasamantaray/-optimizing-target-s-brazilian-operations-insights-from-order-processing-pricing-and-payment-trends-

This project offers an in-depth analysis of consumer behavior, logistical performance, and payment preferences within the e-commerce sector. By examining order costs, delivery times, and payment methods, businesses can uncover valuable insights into operational efficiency and customer preferences.

bigquery consumer-insights data-analysis database sql target

Last synced: 21 Jan 2025

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 11 Jan 2025

https://github.com/phukon/package-insights

PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.

big-data bigquery dbt duckdb

Last synced: 27 Jan 2025

https://github.com/hitthecodelabs/bigquery_ml

Jupyter notebooks that utilize Google BigQuery's machine learning capabilities.

bigquery notebooks python sql

Last synced: 04 Feb 2025

https://github.com/alessio-siciliano/bigquery-utils

A utility library that enhances the official BigQuery Python client with additional tools for query management, data processing, and automation, making it easier to work efficiently with Google BigQuery.

bigquery datatransfer google-cloud python

Last synced: 28 Jan 2025

https://github.com/airbytehq/terraform-airbyte-bigquery-destination

Terraform Module for Setting Up BigQuery Destination with Airbyte

airbyte bigquery elt etl terraform

Last synced: 07 Jan 2025

https://github.com/crudek-data/bigquery-kaggle-apis

Uses the Kaggle API to download free datasets and the Google BigQuery API to read from and write to the cloud data warehouse

bigquery data-engineering kaggle

Last synced: 22 Jan 2025

https://github.com/nikhilsree5/targetcasestudy

An exploratory and in-depth study of the e-commerce market in Brazil.

bigquery eda sql visualization

Last synced: 22 Jan 2025

https://github.com/coatless/bigquery-reddit-ask-your-advisor

Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")

ask-your-advisor bigquery r reddit

Last synced: 16 Jan 2025

https://github.com/flowerinthenight/bqstream

A simple library to help facilitate streaming to BigQuery.

bigquery go golang streaming

Last synced: 08 Jan 2025

https://github.com/kellyjadams/ap-exam-scores

Analyzing AP exam scores for a school.

bigquery sql

Last synced: 08 Jan 2025

https://github.com/humairarizwan/uber-ride-dataengineering-analysis

This project creates a pipeline to process data and performs data analytics on Uber data.

bigquery dataanalysis dataengineering gcp-project googlestorage looker-studio

Last synced: 21 Jan 2025

https://github.com/toskpl/googlecloud

30-day challenge - Google Cloud

bigquery google-cloud google-cloud-platform ml

Last synced: 14 Nov 2024

https://github.com/chukwuemekaaham/ny_taxi_rides

Analytics engineering using Dbt and Google Cloud BigQuery

analytics-engineering bigquery dbt github

Last synced: 10 Jan 2025

https://github.com/abdullahasghar/sql

The repo includes all projects and assessments I have completed with SQL. Tools used: MS SQL Server, Google BigQuery.

bigquery mssqlserver sql

Last synced: 18 Jan 2025

https://github.com/eddieatgoogle/sql-based-genai-data-pipeline

GenAI data pipeline that performs data preparation, management and performance evaluation tasks for RAG systems using SQL as the primary development language. Please feel free to use this as a starting point for your own projects.

bigquery bqml dataform embeddings gemini google-cloud-platform sql vector-search vertex-ai

Last synced: 08 Jan 2025

https://github.com/markjamesbutler/dbt-fundamentals-bigquery

Implementation of dbt fundamentals training course material using BigQuery.

bigquery dbt dbt-fundamentals fundamentals jinja2 practice-tasks sql

Last synced: 16 Jan 2025

https://github.com/garbetjie/monolog-bigquery-handler

A simple Monolog handler for writing to BigQuery.

bigquery logging monolog monolog-handler

Last synced: 16 Jan 2025

https://github.com/riju18/airflow-data-engineering-with-bigquery-and-dbt

Fetch data from a simple CSV file, load it into a GCP BigQuery table, run dbt to automate the data warehouse, and run Soda to check data quality.

apache-airflow bigquery csv dbt python3 soda

Last synced: 28 Jan 2025

https://github.com/makism/bq-ethereum

bq-ethereum

bigquery ethereum sql

Last synced: 20 Jan 2025

https://github.com/ngangawairimu/sales-analysis-and-customer-insights

This project features SQL queries for detailed customer and sales analysis: customer analysis and sales reporting.

bigquery bigquery-dataset excel sql

Last synced: 28 Jan 2025

https://github.com/ackeecz/terraform-gcp-cloud-function_pubsub_to_bq

A Cloud Function that subscribes to a given topic and inserts each message into a BigQuery table.

bigquery cloud-functions pubsub terraform-module

Last synced: 07 Jan 2025

https://github.com/sintef/bigquery-postgresql-wire-proxy

A PostgreSQL wire protocol proxy server for BigQuery.

bigquery postgresql proxy

Last synced: 12 Jan 2025

https://github.com/pittica/google-bigquery-helpers

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform pittica

Last synced: 13 Nov 2024

https://github.com/ackeecz/terraform-gcp-cloud-run_pubsub_to_bq

A Cloud Run service that subscribes to a given topic and inserts each message into a BigQuery table.

bigquery gcp pubsub terraform

Last synced: 07 Jan 2025

https://github.com/allanreda/share-of-search-retrieval-and-visualization

Share of search analysis including data retrieval from Google Ads API, storing data in BigQuery and visualizing it in Looker Studio

bigquery google-ads-api looker-studio python share-of-search

Last synced: 28 Dec 2024

https://github.com/lambdamusic/dimschema

CLI to retrieve SQL schema information about the Dimensions on Google BigQuery dataset.

bigquery dimensions python scholarly-metadata

Last synced: 12 Jan 2025

https://github.com/oleksiilatypov/google_cloud

AI & Data, Google Cloud Skills Boost

bigquery document-ai ml vertexai

Last synced: 18 Jan 2025

https://github.com/karencofre/riesgorelativo-lookerstudio

Data analysis and predictive analysis project in Looker Studio and Google Colab

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 22 Jan 2025

https://github.com/zborovskaanna/e-commerce-web-events-analysis

SQL project based on the BigQuery public dataset 'The Look e-Commerce' and a dashboard in Looker Studio

analysis bigquery dashboard data-visualization looker-studio sql

Last synced: 22 Jan 2025

https://github.com/azapeti/bigquery-python-bash-automation

Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.

analytics automation bash bigquery cronjob crontab ga4 python python3

Last synced: 22 Jan 2025

https://github.com/robinnoiret/importcsv_zendeskbigquery

This project involves developing a Python script to import a CSV export from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.

bigquery connector export-csvfile json zendesk

Last synced: 22 Jan 2025

https://github.com/acardosolima/crypto-ethereum-tokens

This project aims to create a data pipeline using Airflow to ingest a dataset from Google BigQuery into a PostgreSQL database. This stack will run in a local environment using Kubernetes.

airflow bigquery postgresql python

Last synced: 22 Jan 2025

https://github.com/fahmiaziz98/sql_agent

Build a SQL agent using different patterns: RAG, self-correction, and optimization

agent bigquery langchain sql sql-agent sqlite toolkit

Last synced: 22 Jan 2025

https://github.com/lisabensoussan/bigdata_midterm

This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.

bigquery data-cleaning sql unix-command unix-shell

Last synced: 22 Jan 2025

https://github.com/thecodersstudio/node-native-test-runner

Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.

bigquery mock nodejs nodejs-test nodenativetestrunner test

Last synced: 22 Jan 2025

https://github.com/noospheracr/twilio-segment-configs

Integration of Twilio Segment with Google BigQuery, Looker/PowerBI, and Google VertexAI to create a data-driven marketing platform

bigquery google-cloud-platform looker-studio marketing noosphera power-bi twilio-segment vertex-ai

Last synced: 22 Jan 2025

https://github.com/thanhloc81/customer-segmentation

✨ Analyze customer segments of Adventure World dataset

bigquery google-cloud powerbi sql

Last synced: 22 Jan 2025

https://github.com/jasontanx/mas-international-arrivals

Code repository about international arrivals into Malaysia

bigquery data-analytics data-engineering etl-pipeline international-arrivals

Last synced: 22 Jan 2025

https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp

This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.

big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql

Last synced: 22 Jan 2025

https://github.com/zeinhasan/etl-using-airflow

Extract Transform Load Using Airflow

airflow bigquery etl

Last synced: 22 Jan 2025

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 22 Jan 2025

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets the user rate a given company product. After the rating, the bot runs sentiment analysis on the user's review and stores the result (together with the review text, user name, and product) in a BigQuery dataset.

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 22 Jan 2025

https://github.com/mutaharshaik/airflow_retail_project

Airflow retail project using pipeline with BigQuery, dbt, Soda

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 22 Jan 2025

https://github.com/victorcezeh/end-to-end-elt-pipeline

An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.

airflow bigquery dbt-core docker docker-compose postgresql python

Last synced: 22 Jan 2025

https://github.com/mattwelke/charter-challenge-for-fair-voting-bot

Bot that web scrapes and logs in BigQuery the donations so far of the Charter Challenge for Fair Voting.

bigquery bot go openwhisk public-data

Last synced: 22 Jan 2025

https://github.com/alessio-siciliano/bigquery-advanced-utils

BigQuery-advanced-utils is a lightweight utility library that extends the official Google BigQuery Python client. It simplifies tasks like query management, data processing, and automation. Aimed at developers and data scientists, the project is open to contributions to improve and enhance its functionality.

bigquery datatransfer google-cloud python

Last synced: 01 Feb 2025

https://github.com/nszoni/dbtgen

dbt: write nothing, generate (almost) everything.

analytics bigquery dbt documentation generative-ai github tooling

Last synced: 31 Jan 2025

https://github.com/lorinczakos/sql-projects

This is a collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 28 Jan 2025

https://github.com/wooyakob/music-recommendation-engine

Using Gemini API to generate personalized music recommendations.

ai bigquery gemini-api google-cloud-platform

Last synced: 28 Jan 2025

https://github.com/oguzgn/a-case-study-for-a-livestreaming-platform

This project aims to analyze livestream watch times of users across different regions. The goal is to identify the top 5 users with the highest watch time for each region. The analysis involves multiple SQL transformations to extract meaningful insights from the data.

bigquery data data-analysis data-modeling live-streaming sql

Last synced: 27 Jan 2025

https://github.com/oliveroneill/wilt-cloud-functions

Wilt Google Cloud Functions

bigquery google-cloud-functions

Last synced: 07 Jan 2025

https://github.com/sangnandar/insert-unique-record

This is a Cloud Functions script to insert only unique records into BigQuery.

bigquery digital-marketing-analytics google-cloud-functions

Last synced: 29 Dec 2024

https://github.com/tosh2230/bigquery-table-history

Diff daily changes by BigQuery INFORMATION_SCHEMA.PARTITIONS records.

bigquery

Last synced: 21 Jan 2025

https://github.com/phstudy/zetasketch-bigquery-example

An example demonstrating how to use ZetaSketch with BigQuery

bigquery hll java zetasketch

Last synced: 21 Jan 2025

https://github.com/ivanildobarauna/pypi-package-stats

Project to ingest PyPI package data from BigQuery and send it to Datadog for analysis and insights with dashboards, monitors, and more

bigquery cloud data-engineering data-warehouse gcp software-engineering

Last synced: 29 Dec 2024

https://github.com/iht/bigquery-dataflow-cdc-example

A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates

apache-beam bigquery cdc dataflow google-cloud pubsub

Last synced: 29 Dec 2024

https://github.com/ymyzk/bq-globalip

Record the current global IPv4 address to a BigQuery table.

bigquery golang

Last synced: 28 Jan 2025

https://github.com/garbetjie/phpunit-bigquery-schema

A BigQuery schema validator constraint for PHPUnit

bigquery phpunit

Last synced: 28 Jan 2025

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 29 Dec 2024

https://github.com/vigneshss-07/complete-atoz-sql

This deals with SQL commands, interview preparation and query questions and solutions

azuresql bigquery gcp sql sql-query sql-server sqlalchemy

Last synced: 15 Nov 2024

https://github.com/night-fury-me/real-time-vehicle-data-processing

A repository that contains an implementation of a real-time vehicle data processing pipeline that efficiently manages and analyzes vehicle data through a cohesive system.

bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing

Last synced: 22 Jan 2025

https://github.com/jancervenka/bqcli

REPL for BigQuery

bigquery data-science gcp google python

Last synced: 31 Dec 2024

https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage

Export data from Google BigQuery via Google Storage to Yandex Cloud Object Storage.

bigquery object-storage python3 yandex-cloud yandexcloud

Last synced: 29 Dec 2024

https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks

A collection of R notebooks to analyze data from the Digital Optimization Group Platform

ab-testing bigquery jupyter-notebook performance-analysis r web-analytics

Last synced: 21 Jan 2025

https://github.com/ruru-lyy/nyc-taxi-service-pipeline

In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.

bigquery data-engineering data-modeling etl-pipeline looker mage-ai python

Last synced: 23 Jan 2025

https://github.com/edumoraes1/spam_count_sfmc

SQL query counting email sends and spam over the last 365 days

bigquery marketing-cloud salesforce sql

Last synced: 31 Dec 2024

https://github.com/yoshiyukikato/nightharbor-bigquery-reporter

A nightharbor reporter for GCP BigQuery

bigquery lighthouse

Last synced: 23 Jan 2025

https://github.com/victorcezeh/data-engineering-final-semester-portfolio

This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.

bigquery docker gcs-bucket postgresql python

Last synced: 17 Nov 2024

https://github.com/isaacmg/mimic_iv_bq_queries

Queries needed to recreate time series features for model training

bigquery mimic-iv sql

Last synced: 21 Jan 2025

https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query

E-Commerce-Data-Analysis-SQL-Big-Query

bigquery sql

Last synced: 23 Jan 2025

https://github.com/rifa8/extract-load-demo

Learning Google Cloud Platform (GCP)

airbyte bigquery bucket gcp

Last synced: 27 Jan 2025
