Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

BigQuery

Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.

📖 A highly rated canonical book on BigQuery is "Google BigQuery: The Definitive Guide", a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article celebrating the service’s 10th anniversary.

https://github.com/stoqey/rasputia

Rasputia Latimore - The Big Data Bitch 💋

bigquery

Last synced: 19 Jan 2025

https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml

Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.

airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform

Last synced: 26 Jan 2025

https://github.com/knands42/data-ingestion

Data Ingestion project to evaluate my Kotlin skill using concurrency

bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow

Last synced: 25 Jan 2025

https://github.com/jasontanx/ridership-headline-project

This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.

bigquery data-engineering minio-server public-transport-ridership terraform

Last synced: 01 Feb 2025

https://github.com/mdornseif/datastore-to-bigquery

The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.

backup bigquery bigquery-backup cloud datastore google

Last synced: 21 Jan 2025

https://github.com/arhea/go-mock-bigquery

Creates a mock BigQuery client based on the bigquery-emulator for testing in Golang projects.

bigquery golang golang-module google-bigquery google-cloud-platform testcontainers-go testing

Last synced: 21 Jan 2025

https://github.com/paulveillard/cybersecurity-analytics

An ongoing collection of awesome software, libraries, learning tutorials, documents and books, technical resources and cool stuff about Analytics Engineering in Cybersecurity.

analytics bigdata bigquery cybernetics cybersecurity data data-engineering data-science encryption encryption-decryption seo seo-friendly seo-optimization

Last synced: 02 Feb 2025

https://github.com/nlgtuankiet/bq-noti

BigQuery notification

bigquery bq notification notifier

Last synced: 21 Jan 2025

https://github.com/denny-b-justin/purdue

The internship was broadly to understand if the topics/events are being covered differently in the different countries and how they affect stock market returns. The provided dataset is a post-processed set of news articles, so already reflects topic modelling and sentiment analysis.

big-data bigquery finance gdelt-events python

Last synced: 21 Jan 2025

https://github.com/angulartist/scio-demo

Playing w/ Scio

apache-beam bigquery scio

Last synced: 10 Nov 2024

https://github.com/marcopellegrinoit/web-traffic-time-series-predictions

Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Additional modeling available with ARIMA, LSTM, and Facebook Prophet.

arima bigquery gcp lstm prophet-model time-series vertex-ai

Last synced: 21 Jan 2025

https://github.com/greatwoman23/car_insurance_analysis

The Car Insurance Analysis project aims to provide a comprehensive examination of a car insurance portfolio using advanced data analytics tools. The analysis offers valuable insights into policy demographics, claims patterns, and financial metrics, helping stakeholders make informed decisions.

bigquery data data-science dataanalytics insurance-claims looker-studio tableau

Last synced: 21 Jan 2025

https://github.com/hayashi-yudai/cloudfunc_login

Example of authentication function for login with Cloud Functions and BigQuery

bigquery gcp-cloud-functions golang server

Last synced: 15 Jan 2025

https://github.com/patriciavalentine/loan-data-queries

In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.

asset-finance bigquery loan-data looker-studio

Last synced: 04 Feb 2025

https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql

Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services.

bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql

Last synced: 04 Feb 2025

https://github.com/janmin123/cyclistic

Capstone project for Google/Coursera Data Analytics Course

analysis bigquery sql tableau visualization

Last synced: 04 Feb 2025

https://github.com/martinkalema/bigquery-pubsub

Loading data into BigQuery Table

bigquery data-engineering flat-file kafka

Last synced: 11 Jan 2025

https://github.com/rubnsbarbosa/nasa-asteroids-extractor

ETL asteroids data extractor using some Google Cloud services

bigquery bucket cloud-storage google-cloud nasa-api-neows

Last synced: 21 Jan 2025

https://github.com/alessio-siciliano/bigquery-utils

A utility library that enhances the official BigQuery Python client with additional tools for query management, data processing, and automation, making it easier to work efficiently with Google BigQuery.

bigquery datatransfer google-cloud python

Last synced: 28 Jan 2025

https://github.com/rrmcguinness/protoc-gen-bq-schema

A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.

bigquery bigquery-schema protobuf

Last synced: 13 Jan 2025

https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery

Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.

bigquery csv-import google-apps-script google-cloud-storage

Last synced: 13 Jan 2025

https://github.com/crudek-data/bigquery-kaggle-apis

Kaggle API to download free datasets, along with the Google BigQuery API to read from and write to a cloud data warehouse

bigquery data-engineering kaggle

Last synced: 22 Jan 2025

https://github.com/coatless/bigquery-reddit-ask-your-advisor

Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")

ask-your-advisor bigquery r reddit

Last synced: 16 Jan 2025

https://github.com/nikhilsree5/targetcasestudy

An exploratory and in-depth study of the e-commerce market in Brazil.

bigquery eda sql visualization

Last synced: 22 Jan 2025

https://github.com/kmohamedalie/bigquery-intro

Coursera BigQuery Introduction using Covid19 dataset

bigquery coursera covid-19 datavisualization looker-studio sql

Last synced: 21 Jan 2025

https://github.com/cyber-programmer/web-traffic-analytics-ml-model

This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, exploratory data analysis.

bigquery logistic-regression machine-learning

Last synced: 21 Jan 2025

https://github.com/plishka/blockchain_analysis

Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)

bigquery blockchain data-cleaning scraping-websites sql tableau

Last synced: 21 Jan 2025

https://github.com/lawal-hash/olistelt

An end-to-end ELT data pipeline of the Brazilian olist e-commerce dataset using the modern data stack

airflow bigquery dbt dbt-core docker postgresql sql

Last synced: 21 Jan 2025

https://github.com/toskpl/googlecloud

Challenge 30 days - GoogleCloud

bigquery google-cloud google-cloud-platform ml

Last synced: 14 Nov 2024

https://github.com/chukwuemekaaham/ny_taxi_rides

Analytics engineering using Dbt and Google Cloud BigQuery

analytics-engineering bigquery dbt github

Last synced: 10 Jan 2025

https://github.com/abdullahasghar/sql

The repo includes all projects and assessments I have completed with SQL. IDE/s used: MS SQL Server, Google Big Query.

bigquery mssqlserver sql

Last synced: 18 Jan 2025

https://github.com/markjamesbutler/dbt-fundamentals-bigquery

Implementation of dbt fundamentals training course material using BigQuery.

bigquery dbt dbt-fundamentals fundamentals jinja2 practice-tasks sql

Last synced: 16 Jan 2025

https://github.com/garbetjie/monolog-bigquery-handler

A simple Monolog handler for writing to BigQuery.

bigquery logging monolog monolog-handler

Last synced: 16 Jan 2025

https://github.com/nghiant3110/firebase_6

This is a DA project based on the Firebase Sample dataset on Big Query

bigquery firebase looker-studio sql

Last synced: 21 Jan 2025

https://github.com/makism/bq-ethereum

bq-ethereum

bigquery ethereum sql

Last synced: 20 Jan 2025

https://github.com/ngangawairimu/sales-analysis-and-customer-insights

This project features SQL queries for detailed customer and sales analysis: Customer Analysis and Sales Reporting

bigquery bigquery-dataset excel sql

Last synced: 28 Jan 2025

https://github.com/anpandu/ps2bq

Stream insert GCP PubSub messages into BigQuery table.

bigquery golang pubsub

Last synced: 21 Jan 2025

https://github.com/pittica/google-bigquery-helpers

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform pittica

Last synced: 13 Nov 2024

https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate

This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.

bigquery data-science data-visualization spreadsheet sql tableau

Last synced: 21 Jan 2025

https://github.com/siriospa/gcp-helpers-bigquery

Helpers for Google Cloud BigQuery.

bigquery gcp google-cloud-platform sirio

Last synced: 12 Oct 2024

https://github.com/lambdamusic/dimschema

CLI to retrieve SQL schema information about the Dimensions on Google BigQuery dataset.

bigquery dimensions python scholarly-metadata

Last synced: 12 Jan 2025

https://github.com/oleksiilatypov/google_cloud

AI & Data, Google Cloud Skills Boost

bigquery document-ai ml vertexai

Last synced: 18 Jan 2025

https://github.com/squidmin/bigquery-labs

GCP BigQuery CLI

bigquery gcp java

Last synced: 07 Feb 2025

https://github.com/karencofre/riesgorelativo-lookerstudio

Data analysis and predictive analysis project in Looker Studio and Google Colab

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 22 Jan 2025

https://github.com/zborovskaanna/e-commerce-web-events-analysis

SQL project based on the Big Query public database 'The Look e-Commerce' and a dashboard in Looker Studio

analysis bigquery dashboard data-visualization looker-studio sql

Last synced: 22 Jan 2025

https://github.com/azapeti/bigquery-python-bash-automation

Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.

analytics automation bash bigquery cronjob crontab ga4 python python3

Last synced: 22 Jan 2025

https://github.com/robinnoiret/importcsv_zendeskbigquery

This project involves developing a Python script to import csv export from Zendesk to BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.

bigquery connector export-csvfile json zendesk

Last synced: 22 Jan 2025

https://github.com/acardosolima/crypto-ethereum-tokens

This project aims to create a data pipeline using Airflow to ingest dataset from Google Bigquery to a PostgreSQL database. This stack will run in a local environment using Kubernetes.

airflow bigquery postgresql python

Last synced: 22 Jan 2025

https://github.com/fahmiaziz98/sql_agent

build sql agent using different pattern rag/self-correction/optimization

agent bigquery langchain sql sql-agent sqlite toolkit

Last synced: 22 Jan 2025

https://github.com/lisabensoussan/bigdata_midterm

This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.

bigquery data-cleaning sql unix-command unix-shell

Last synced: 22 Jan 2025

https://github.com/thecodersstudio/node-native-test-runner

Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.

bigquery mock nodejs nodejs-test nodenativetestrunner test

Last synced: 22 Jan 2025

https://github.com/noospheracr/twilio-segment-configs

Integration of Twilio Segment with Google BigQuery, Looker/PowerBI, and Google VertexAI to create a data-driven marketing platform

bigquery google-cloud-platform looker-studio marketing noosphera power-bi twilio-segment vertex-ai

Last synced: 22 Jan 2025

https://github.com/thanhloc81/customer-segmentation

✨ Analyze customer segments of Adventure World dataset

bigquery google-cloud powerbi sql

Last synced: 22 Jan 2025

https://github.com/jasontanx/mas-international-arrivals

Code repository about international arrivals into Malaysia

bigquery data-analytics data-engineering etl-pipeline international-arrivals

Last synced: 22 Jan 2025

https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp

This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.

big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql

Last synced: 22 Jan 2025

https://github.com/zeinhasan/etl-using-airflow

Extract Transform Load Using Airflow

airflow bigquery etl

Last synced: 22 Jan 2025

https://github.com/yasarsultan/olist_datawarehouse

An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.

airflow bigquery data-warehouse docker

Last synced: 22 Jan 2025

https://github.com/ahbiels/chatbot_analize_avaliation

A bot built in Dialogflow CX that lets the user rate a given product of the company. After the rating, the bot runs sentiment analysis on the user's review and stores the result (together with the review text, user name, and product) in a BigQuery dataset.

bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql

Last synced: 22 Jan 2025

https://github.com/mutaharshaik/airflow_retail_project

Airflow retail project using pipeline with BigQuery, dbt, Soda

airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda

Last synced: 22 Jan 2025

https://github.com/victorcezeh/end-to-end-elt-pipeline

An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.

airflow bigquery dbt-core docker docker-compose postgresql python

Last synced: 22 Jan 2025

https://github.com/lucashomuniz/project-22

[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio

agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability

Last synced: 25 Jan 2025

https://github.com/richardbnk/data_tools

Python Library to Accelerate Creation of Data ETL Processes on multiple database systems.

bigquery etl gcp sql

Last synced: 02 Feb 2025

https://github.com/vidyadnina/cyclistic-sql-tableau-project

Trip data analysis for a bike-sharing service company using SQL and Tableau.

bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql

Last synced: 21 Jan 2025

https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis

This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.

bigquery gcp google-cloud-platform looker-studio sql

Last synced: 21 Nov 2024

https://github.com/mattwelke/charter-challenge-for-fair-voting-bot

Bot that web scrapes and logs in BigQuery the donations so far of the Charter Challenge for Fair Voting.

bigquery bot go openwhisk public-data

Last synced: 22 Jan 2025

https://github.com/pratshrestha/cochin-traders---sql--sales-analysis

Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include

bigquery customer-engagement employee-performance inventory-management sales-trends sql

Last synced: 21 Jan 2025

https://github.com/jmfeck/bigquery-local-framework

This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for local BigQuery interactions, enabling more efficient interaction with it.

bigquery data-engineering ingestion pandas python

Last synced: 18 Jan 2025

https://github.com/nszoni/dbtgen

dbt: write nothing, generate (almost) everything.

analytics bigquery dbt documentation generative-ai github tooling

Last synced: 31 Jan 2025

https://github.com/lorinczakos/sql-projects

This is a collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 28 Jan 2025

https://github.com/wooyakob/music-recommendation-engine

Using Gemini API to generate personalized music recommendations.

ai bigquery gemini-api google-cloud-platform

Last synced: 28 Jan 2025

https://github.com/ayresgneto/use-case-gcp-etl

ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery

airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark

Last synced: 21 Jan 2025

https://github.com/amitkumarj441/mysql2bigquery

A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.

bigquery docker mysql scala

Last synced: 26 Jan 2025

https://github.com/tosh2230/bigquery-table-history

Diff daily changes by BigQuery INFORMATION_SCHEMA.PARTITIONS records.

bigquery

Last synced: 21 Jan 2025

https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study

This capstone project was done as a part of Google Data Analytics Professional Certificate course.

bigquery data-analysis sql tableau

Last synced: 21 Jan 2025

https://github.com/tomgorb/some-data-monitoring

fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing

airflow2 bigquery gcp minikube

Last synced: 21 Jan 2025

https://github.com/kartikeya443/automated-data-pipeline-gcp

This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.

bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql

Last synced: 21 Jan 2025

https://github.com/phstudy/zetasketch-bigquery-example

An example demonstrating how to use ZetaSketch with BigQuery

bigquery hll java zetasketch

Last synced: 21 Jan 2025

https://github.com/vidyadnina/other-sql-projects-and-queries

Other SQL projects and queries.

bigquery mysql sql

Last synced: 21 Jan 2025

https://github.com/akihokurino/dbt-sample

dbt sample

bigquery dbt python3

Last synced: 07 Feb 2025

https://github.com/armahdavi/bigdata_pyspark_sales_analytics

Summarizing my big data code in python pyspark to analyze sales data with retail and walmart superstore to draw sales insights

big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning

Last synced: 28 Dec 2024

https://github.com/karencofre/marketing-segmentacion-en-powerbi

Hypothesis testing project in Power BI and Python

bigquery google-colab powerbi python sql statsmodels

Last synced: 21 Jan 2025

https://github.com/ymyzk/bq-globalip

Record the current global IPv4 address to a BigQuery table.

bigquery golang

Last synced: 28 Jan 2025

https://github.com/garbetjie/phpunit-bigquery-schema

A BigQuery schema validator constraint for PHPUnit

bigquery phpunit

Last synced: 28 Jan 2025

https://github.com/fsistemas/bigquery-td

ETL to extract data from MySQL, then load and merge it into BigQuery

bigquery etl mysql python sql

Last synced: 03 Jan 2025
