Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated book on the subject is “Google BigQuery: The Definitive Guide”, a comprehensive reference. Another worthwhile read is the inside story of BigQuery, told in an article by its founding product manager to mark the product’s 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-01-31 00:03:17 UTC
https://github.com/seahrh/nyc-taxi-trips
REST API for the New York City Taxi Trips public dataset, implemented in Scala and Play Framework 2.7
bigquery nyc-taxi-dataset play-framework rest-api scala
Last synced: 08 Dec 2024
https://github.com/quipper/send-ci-result-to-bigquery-action
Send test results to BigQuery in GitHub Actions
bigquery github-actions google-bigquery junit-xml
Last synced: 09 Jan 2025
https://github.com/rifa8/extract-load-demo
Learning Google Cloud Platform (GCP)
Last synced: 27 Jan 2025
https://github.com/rifa8/data-warehouse-submission
Learning about Data Warehouse
bigquery citus columnar data-warehouse datalake gcs-bucket
Last synced: 27 Jan 2025
https://github.com/knands42/data-ingestion
Data ingestion project to evaluate my Kotlin skills with concurrency
bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow
Last synced: 25 Jan 2025
https://github.com/hariprasath-v/mh_google_cloud_bigquery_ltv_prediction_challenge
Build a model that can predict customers' Long Term Value (LTV).
bigquery colab-notebook klib machine-learning matplotlib numpy pandas python python3 seaborn
Last synced: 13 Jan 2025
https://github.com/sayed-ashfaq/target-sql
In this project, I analyzed Target company's data using SQL in BigQuery, focusing on data extraction, manipulation, and performing various analytical queries to derive insights.
aggregation bigquery cte joins sql
Last synced: 23 Dec 2024
https://github.com/owox/sgtm-owox-ga4-bigquery
OWOX BI Streaming is an advanced tracking solution for getting the most from an existing Google Analytics 4 installation on your website
Last synced: 20 Dec 2024
https://github.com/patriciavalentine/loan-data-queries
In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.
asset-finance bigquery loan-data looker-studio
Last synced: 09 Dec 2024
https://github.com/janmin123/cyclistic
Capstone project for Google/Coursera Data Analytics Course
analysis bigquery sql tableau visualization
Last synced: 09 Dec 2024
https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction improvements for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services
bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql
Last synced: 09 Dec 2024
https://github.com/minhajuddin2510/bigquery_alerts
In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this
alerting bigquery cloudfunctions monitoring-tool slack
Last synced: 04 Dec 2024
https://github.com/andrewm4894/gcp-telemetry-example
Simple HTTP endpoint for telemetry data type events in GCP.
bigquery gcp-cloud-functions gcp-storage python terraform
Last synced: 05 Dec 2024
https://github.com/jasontanx/ridership-headline-project
This end-to-end data engineering and data analytics project covers Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 05 Dec 2024
https://github.com/oguzgn/firebase-ab-test-analysis-for-a-mobile-race-game
This repository showcases an infrastructure designed for analyzing A/B tests in mobile games. It leverages BigQuery to process Firebase and GA4-based event data and uses Looker Studio for dynamic visualization. The project simplifies A/B test comparisons, enabling stakeholders to view results directly through interactive dashboards.
ab-testing ab-testing-analysis bigquery event-based-tracking firebase looker-studio mobile-game-analytics race-game sql
Last synced: 26 Jan 2025
https://github.com/jasontanx/terraform-practice
Creating datasets and tables in Google BigQuery via Terraform
bigquery iac-terraform infrastructure-as-code terraform
Last synced: 05 Dec 2024
https://github.com/alessio-siciliano/google-cloud-python-class-wrapper
An example of several classes written in Python to interact with GCP
bigquery datatransfer gcp google-cloud
Last synced: 26 Jan 2025
https://github.com/valenthr/purchase_funnel
Google merch store sales analysis
Last synced: 27 Jan 2025
https://github.com/machinelearningzuu/data-engineering-projects
This repository is a curated collection of projects and tools that exemplify best practices in data engineering. It serves as a resource for data professionals seeking to enhance their data infrastructure, optimize data pipelines, and implement cutting-edge data processing techniques.
airflow bigquery data-engineering data-science data-visualization data-warehouse
Last synced: 10 Dec 2024
https://github.com/sejalmankar1012/product_data_analyst_assessement
Analyzing the Impact of Business Hour Mismatch on Order Volume in the Food Delivery Industry: A Case Study of UEats and Ghub
assessment-project bigquery loop product-analyst sql-query
Last synced: 26 Jan 2025
https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women
This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.
bigquery excel powerbi spreadsheet sql
Last synced: 10 Dec 2024
https://github.com/ivanildobarauna-dev/pypi-package-stats
Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 11 Dec 2024
https://github.com/pawel045/big-tech-stocks
ETL project
big-data bigquery dataengineering etl
Last synced: 11 Dec 2024
https://github.com/samanthalang/samanthalang_portfolio
A data analyst with a consumer's perspective and a marketer's strategy.
bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress
Last synced: 11 Dec 2024
https://github.com/francois-lenne/play-bq-gcp
Data pipeline that loads data from the PlayStation API into BigQuery
bigquery cicd data-engineering google-cloud python
Last synced: 13 Jan 2025
https://github.com/mchmarny/stocker
Using the correlation between Twitter sentiment and stock market price signals to predict the next day's closing price
bigquery ml prediction regression-models
Last synced: 31 Dec 2024
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 14 Dec 2024
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 25 Dec 2024
https://github.com/davidkhala/dwh-migration-tools
dwh-migration-tools: contribution fork
Last synced: 23 Jan 2025
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 21 Jan 2025
https://github.com/codingsancho/fastapi-bigquery
Learning exercise: Python backend with FastAPI and BigQuery, React frontend.
bigquery fastapi javascript python react
Last synced: 20 Dec 2024
https://github.com/adindasarianti/rakamin_kf_analytics
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 02 Jan 2025
https://github.com/newtonmunene99/sec-filings
Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery
bigquery cloudstorage gcp golang
Last synced: 21 Jan 2025
https://github.com/ivdatahub/pypi-package-stats
Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 21 Nov 2024
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 31 Jan 2025
https://github.com/simoun-asmar/clinipet_project
BigQuery
bigquery looker-studio lookerstudio sql
Last synced: 13 Jan 2025
https://github.com/rrmcguinness/protoc-gen-bq-schema
A protocol buffer compiler (protoc) plugin for generating Google BigQuery JSON table definitions.
bigquery bigquery-schema protobuf
Last synced: 13 Jan 2025
https://github.com/giorgishengelia/bike-share-analysis-report
Developing a marketing strategy with data analytics to help convert casual riders into members
Last synced: 12 Dec 2024
https://github.com/sangnandar/load-csvs-from-gcs-to-bigquery
Google Apps Script to streamline loading CSV data from Google Cloud Storage (GCS) into BigQuery.
bigquery csv-import google-apps-script google-cloud-storage
Last synced: 13 Jan 2025
https://github.com/progdrummer1/cyclistic-data-analysis-in-sql-and-r
Case Study: Cyclistic
Last synced: 13 Dec 2024
https://github.com/kevin-rsj/real-estate-investments
Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.
bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization
Last synced: 17 Dec 2024
https://github.com/tupizz/fiap_pnad-covid-19
This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.
analysis bigquery pyspark python
Last synced: 17 Dec 2024
https://github.com/neo4j-field/dataflow-flex-pyarrow-to-gds
Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow
apache-arrow apache-beam bigquery dataflow neo4j python
Last synced: 23 Dec 2024
https://github.com/siriospa/gcp-helpers-bigquery
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform sirio
Last synced: 12 Oct 2024
https://github.com/siobhan-doherty/ag_challenge
airflow bigquery csv-files data-engineering etl google-cloud-platform python sql
Last synced: 17 Dec 2024
https://github.com/istinnew/cook-me-up
Welcome to Cook-Me-Up! This project aims to analyze and organize cooking recipes using data analysis (Python, BigQuery SQL, Looker Studio etc.) and machine learning techniques. The goal is to simplify meal preparation and offer users a comprehensive database of culinary delights.
bigquery clustering cookme culinary data data-science dataanalysis datavisualization looker-studio machine-learning python recipe-search recipes unsupervised-learning
Last synced: 17 Dec 2024
https://github.com/vaibhavs10/ml-on-gcp
The repository walks through a data-scientist-focused way of building and deploying machine learning models on Google Cloud
aiplatform bigquery googlecloudplatform ml
Last synced: 19 Dec 2024
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 21 Nov 2024
https://github.com/jmfeck/bigquery-local-framework
This repo provides tools to manage BigQuery operations locally, simplifying tasks like uploading flat files, running SQL queries, and downloading tables. It offers a unified interface for working with BigQuery from a local machine.
bigquery data-engineering ingestion pandas python
Last synced: 18 Jan 2025
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 17 Dec 2024
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 17 Dec 2024
https://github.com/amitkumarj441/mysql2bigquery
A script to load a MySQL table in BigQuery. Extracts schema and data as JSON.
Last synced: 26 Jan 2025
https://github.com/nghiant3110/e_com_1
This is a DA project based on the e-commerce dataset (Thelook_ecom) in Google BigQuery
Last synced: 24 Dec 2024
https://github.com/nghiant3110/google_fiber_bi_5
This is a BI Capstone project based on the Google Fiber dataset from Google BI Course
bigquery google-sheets looker-studio sql
Last synced: 24 Dec 2024
https://github.com/nghiant3110/b2b_crm_3
This is a DA project based on the B2B Sales CRM dataset from Maven Analytics
bigquery google-sheets looker-studio sql
Last synced: 24 Dec 2024
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 18 Dec 2024
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 19 Dec 2024
https://github.com/hcrlau/cyclistic-bike-share-analysis
Google Data Analytics Capstone Project
bigquery cyclistic-bike-share-analysis-case-study data-analysis data-visualization sql tableau
Last synced: 19 Dec 2024
https://github.com/armahdavi/bigdata_pyspark_sales_analytics
Summary of my big data code in Python (PySpark) for analyzing retail and Walmart superstore sales data to draw sales insights
big-data bigquery clustering dataframe hadoop k-means machine-learning pyspark pyspark-ml python spark unsupervised-learning
Last synced: 28 Dec 2024
https://github.com/zkan/data-engineering-on-gcp
Data Engineering on Google Cloud Platform (GCP)
bigquery data-engineering data-lake data-pipeline data-warehouse gcs google-cloud-platform machine-learning
Last synced: 19 Dec 2024
https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
Last synced: 25 Dec 2024
https://github.com/alexgenovese/machine-learning-bigquery-gcp
These SQL queries are based on an available e-commerce dataset with millions of Google Analytics records for the Google Merchandise Store, loaded into BigQuery.
bigquery google google-cloud-platform purchase sql visitors
Last synced: 28 Dec 2024
https://github.com/francois-lenne/elt-mp4-quiberon
The goal of this project is to retrieve video from the municipality of Quiberon and detect whether a person appears in it
bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data
Last synced: 25 Dec 2024
https://github.com/fakhri098/project-sql-bigquery
This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.
Last synced: 17 Jan 2025
https://github.com/celiason/coffee-funnel
Webpage for visualizing sales projections for a small coffee business
bigquery prophet sales-analysis streamlit-webapp
Last synced: 26 Dec 2024
https://github.com/ansh-info/stockpulse
Real-time stock market analytics pipeline with live visualization dashboard. Built with Python and GCP, featuring automated data processing and interactive Streamlit analytics.
api big-data bigquery cloud cloud-computing cloud-native data-engineering data-pipeline docker docker-compose gcp gcp-automation-gitops gcp-cloud-run gcp-pubsub google-cloud-platform real-time realtime stock-market stocks streamlit
Last synced: 27 Dec 2024
https://github.com/syedsajjadaskari/end-to-end-chicago-taxi-tip-prediction-with-bigquery-and-vertex-ai
An end-to-end example of Chicago taxi on Google Cloud using TensorFlow, TFX, and Vertex AI
bigquery gcp tensorflow tfx vertex-ai
Last synced: 13 Jan 2025
https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction improvements for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services
bigquery database mysql-server sql
Last synced: 26 Dec 2024
https://github.com/epomatti/gcp-bigquery
Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream
bigquery cloud-sql datastream gcp
Last synced: 17 Jan 2025
https://github.com/mysto-007/cyclistic-bike-share-analysis
Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)
bigquery data-analysis excel ms-sql-server sql tableau tableau-public
Last synced: 18 Jan 2025
https://github.com/allanreda/video-processing-and-categorization
Video processing and categorization using computer vision, machine learning and cloud computing
bigquery cloud-storage-bucket cnn computer-vision google-cloud kmeans-clustering machine-learning opencv2 tensorflow virtual-machine
Last synced: 28 Dec 2024
https://github.com/itsubaki/hermes-lambda
Transfers AWS cost data to BigQuery
Last synced: 14 Dec 2024
https://github.com/anyesh/gbq-helpers
GBQ related helper functions and snippets.
Last synced: 10 Jan 2025
https://github.com/mateuszk098/sql-queries
SQL Queries Training.
bigquery hackerrank-solutions query sql
Last synced: 28 Dec 2024
https://github.com/manesioz/airflow-without-code
Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"
airflow airflow-plugin bigquery python sql
Last synced: 05 Jan 2025
https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink
Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster
aiven bigquery kafka-connect sink-connector terraform terraform-modules
Last synced: 17 Jan 2025
https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis
This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.
bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis
Last synced: 03 Jan 2025
https://github.com/mikeghen/metadata
Pulls data from Socrata open data portals
Last synced: 27 Dec 2024
https://github.com/shikanime/seeker
Data platform based on BigQuery
bigquery dataform google-cloud
Last synced: 04 Jan 2025
https://github.com/marceloneppel/map-to-bigquery-structs
Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.
Last synced: 30 Jan 2025
https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform
Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.
Last synced: 05 Jan 2025
https://github.com/khanovico/energy-data-analysis
A cloud-based model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I built a Docker image so the app can run across platforms.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 10 Oct 2024
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 22 Jan 2025
https://github.com/lu-sketch/google-big-query-sql---credit-risk-analysis
BigQuery SQL Credit Risk Analysis
big-data bigquery credit-risk sql
Last synced: 18 Jan 2025
https://github.com/yasarsultan/taxi-trip-analysis
The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.
airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark
Last synced: 11 Jan 2025
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 18 Dec 2024
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin that posts BigQuery query results
Last synced: 12 Oct 2024
https://github.com/denisogr/kaggle-notebook-to-production
This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.
bigquery data-engineering gcp kaggle-competition kaggle-dataset python spark
Last synced: 12 Jan 2025
https://github.com/victorelexpe/bq-schema-sync
bigquery gcp google-cloud python schema sync
Last synced: 12 Oct 2024
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
analytics bigquery dbt engineering sql
Last synced: 12 Jan 2025
https://github.com/shvetsihorr/sql-projects
SQL and Google BigQuery portfolio projects
azuredatastudio bigquery mssql postgresql sql
Last synced: 18 Jan 2025
https://github.com/bedirk/sql-projects-studies
My projects and studies using SQL
azuredatastudio bigquery jupyter-notebook kaggle mssqlserver sql
Last synced: 18 Jan 2025
https://github.com/smohanta23/uber_data-engineering_etl-project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api
Last synced: 21 Jan 2025
https://github.com/lucashomuniz/project-22
[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio
agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability
Last synced: 25 Jan 2025