Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
![](https://explore-feed.github.com/topics/bigquery/bigquery.png)
Google BigQuery lets companies analyze large amounts of data without managing infrastructure. Google's documentation describes it as a "serverless architecture [that] lets you use SQL queries to answer your organization's biggest questions with zero infrastructure management. BigQuery's scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes." Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated, comprehensive reference is the book "Google BigQuery: The Definitive Guide". Another worthwhile read is the inside story written by BigQuery's founding product manager for its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-15 00:03:26 UTC
https://github.com/anpandu/ps2bq
Stream insert GCP PubSub messages into BigQuery table.
Last synced: 21 Jan 2025
https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate
This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.
bigquery data-science data-visualization spreadsheet sql tableau
Last synced: 21 Jan 2025
https://github.com/vidyadnina/cyclistic-sql-tableau-project
Trip data analysis for a bike-sharing service company using SQL and Tableau.
bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql
Last synced: 21 Jan 2025
https://github.com/pratshrestha/cochin-traders---sql--sales-analysis
Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include
bigquery customer-engagement employee-performance inventory-management sales-trends sql
Last synced: 21 Jan 2025
https://github.com/sintef/bigquery-postgresql-wire-proxy
A PostgreSQL wire protocol proxy server for BigQuery.
Last synced: 12 Jan 2025
https://github.com/kevin-rsj/real-estate-investments
Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.
bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization
Last synced: 09 Feb 2025
https://github.com/mateuszk098/sql-queries
SQL Queries Training.
bigquery hackerrank-solutions query sql
Last synced: 28 Dec 2024
https://github.com/flowerinthenight/bqstream
A simple library to help facilitate streaming to BigQuery.
Last synced: 08 Jan 2025
https://github.com/phukon/package-insights
PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.
Last synced: 27 Jan 2025
https://github.com/manesioz/airflow-without-code
Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"
airflow airflow-plugin bigquery python sql
Last synced: 05 Jan 2025
https://github.com/ayresgneto/use-case-gcp-etl
ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery
airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark
Last synced: 21 Jan 2025
https://github.com/neo4j-field/dataflow-flex-pyarrow-to-gds
Google Dataflow Flex Templates (in Python) for large scale Graph Loading with GDS and Apache Arrow
apache-arrow apache-beam bigquery dataflow neo4j python
Last synced: 23 Dec 2024
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study
This capstone project was done as a part of Google Data Analytics Professional Certificate course.
bigquery data-analysis sql tableau
Last synced: 21 Jan 2025
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 10 Feb 2025
https://github.com/tomgorb/some-data-monitoring
Fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing
airflow2 bigquery gcp minikube
Last synced: 21 Jan 2025
https://github.com/nghiant3110/e_com_1
This is a DA project based on the e-commerce dataset (Thelook_ecom) in Google BigQuery
Last synced: 24 Dec 2024
https://github.com/lucashomuniz/project-22
[Dashboard] Data and Sustainability: Optimizing Green Flow's Fertilizer Portfolio
agrotech bigquery data-analytics data-structures data-visualization google-cloud-platform powerbi powerbi-visuals powerquery sql sustainability
Last synced: 25 Jan 2025
https://github.com/nghiant3110/google_fiber_bi_5
This is a BI Capstone project based on the Google Fiber dataset from Google BI Course
bigquery google-sheets looker-studio sql
Last synced: 24 Dec 2024
https://github.com/nghiant3110/b2b_crm_3
This is a DA project based on the B2B Sales CRM dataset from Maven Analytics
bigquery google-sheets looker-studio sql
Last synced: 24 Dec 2024
https://github.com/andrewm4894/gcp-telemetry-example
Simple HTTP endpoint for telemetry data type events in GCP.
bigquery gcp-cloud-functions gcp-storage python terraform
Last synced: 01 Feb 2025
https://github.com/mikeghen/metadata
Pulls data from Socrata open data portals
Last synced: 27 Dec 2024
https://github.com/kartikeya443/automated-data-pipeline-gcp
This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.
bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql
Last synced: 21 Jan 2025
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 26 Jan 2025
https://github.com/vigneshss-07/mastering-sql-and-bigquery-on-google-cloud-platform
Take your Data Analytics skills to the next level with this comprehensive playlist. Learn SQL from the basics to advanced techniques while mastering BigQuery on Google Cloud.
Last synced: 05 Jan 2025
https://github.com/zkan/data-engineering-on-gcp
Data Engineering on Google Cloud Platform (GCP)
bigquery data-engineering data-lake data-pipeline data-warehouse gcs google-cloud-platform machine-learning
Last synced: 19 Dec 2024
https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
Last synced: 25 Dec 2024
https://github.com/vidyadnina/other-sql-projects-and-queries
Other SQL projects and queries.
Last synced: 21 Jan 2025
https://github.com/tharun2806/end-to-end-internship-data-analysis
Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard
bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public
Last synced: 07 Feb 2025
https://github.com/francois-lenne/elt-mp4-quiberon
The goal of this project is to retrieve video from the municipality of Quiberon and detect whether a person appears in it
bigquery cicd data-engineering docker elt google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage pipeline python sql unstructured-data
Last synced: 25 Dec 2024
https://github.com/jasontanx/ridership-headline-project
An end-to-end data engineering and data analytics project on Malaysian public transport ridership data.
bigquery data-engineering minio-server public-transport-ridership terraform
Last synced: 01 Feb 2025
https://github.com/fakhri098/project-sql-bigquery
This project aims to analyze taxi trip data with a focus on trip duration patterns, popular routes, and trip costs. The study was conducted to gain in-depth insights into taxi travel behavior based on historical data.
Last synced: 17 Jan 2025
https://github.com/celiason/coffee-funnel
Webpage for visualizing sales projections of a small coffee business
bigquery prophet sales-analysis streamlit-webapp
Last synced: 26 Dec 2024
https://github.com/karencofre/marketing-segmentacion-en-powerbi
Hypothesis testing project in Power BI and Python
bigquery google-colab powerbi python sql statsmodels
Last synced: 21 Jan 2025
https://github.com/khanovico/energy-data-analysis
A cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I implemented a Docker image so the app can run across platforms.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 09 Feb 2025
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 22 Jan 2025
https://github.com/minhajuddin2510/bigquery_alerts
In today’s data-driven world, organisations heavily rely on timely alerts to monitor critical systems and make informed decisions. However, when working with BigQuery, a popular cloud-based data warehouse, there is no built-in functionality to generate alerts. In this article, we will explore how I recently built a cloud function to address this
alerting bigquery cloudfunctions monitoring-tool slack
Last synced: 31 Jan 2025
https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for sales optimization, trend identification, and customer satisfaction for a retail brand in South America, and to provide insights and recommendations for growing its user base and improving its services
bigquery database mysql-server sql
Last synced: 26 Dec 2024
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin to post BigQuery query results
Last synced: 12 Oct 2024
https://github.com/epomatti/gcp-bigquery
Data sync via CDC from GCP Cloud SQL to BigQuery using Datastream
bigquery cloud-sql datastream gcp
Last synced: 17 Jan 2025
https://github.com/lisabensoussan/bigdataminig_finalassignment
This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.
bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization
Last synced: 21 Jan 2025
https://github.com/jasontanx/terraform-practice
Creating datasets and tables in Google BigQuery via Terraform
bigquery iac-terraform infrastructure-as-code terraform
Last synced: 01 Feb 2025
https://github.com/goatcheesesaladwithpeanutoildressing/scio-demo
Playing w/ Scio
Last synced: 08 Jan 2025
https://github.com/victorelexpe/bq-schema-sync
bigquery gcp google-cloud python schema sync
Last synced: 12 Oct 2024
https://github.com/anyesh/gbq-helpers
GBQ related helper functions and snippets.
Last synced: 10 Jan 2025
https://github.com/edwinrlambert/cyclistic-bike-share-analysis
This repository is part of the Google Data Analytics Capstone Project, focusing on analyzing Cyclistic's bike-sharing data to identify trends and strategies for converting casual riders to annual members. It aims to provide actionable insights for enhancing marketing efforts.
act analyze ask bigquery prepare process share sql
Last synced: 21 Jan 2025
https://github.com/smohanta23/uber_data-engineering_etl-project
This project demonstrates a comprehensive data engineering workflow using the Uber information dataset. It covers the full spectrum of data engineering pipelines, from data transformation to deployment on Google Cloud, with a focus on creating a scalable and insightful data model.
big-data-analytics bigquery cloudcomputing computeengine dashboard-application dataengineering datainsights datamodelling datapipeline datascience datavisualization etl-pipeline gcp-project googlecloudplatform mage opensource python uber uber-api
Last synced: 21 Jan 2025
https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance
Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.
bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse
Last synced: 21 Jan 2025
https://github.com/entur/terraform-aiven-kafka-connect-bigquery-sink
Terraform module for BigQuery sink connector on Aiven KafkaConnect cluster
aiven bigquery kafka-connect sink-connector terraform terraform-modules
Last synced: 17 Jan 2025
https://github.com/ngangawairimu/clv-rfm-and-customer-segmentation-analysis
This project performs cohort analysis to estimate Customer Lifetime Value (CLV) by analyzing weekly revenue and user registrations over 12 weeks, forecasting future revenue, and providing actionable insights for marketing and business strategy.
bigquery clv-analysis cohort-analysis customer-segmentation excel rfm-analysis
Last synced: 03 Jan 2025
https://github.com/shikanime/seeker
Data platform based on BigQuery
bigquery dataform google-cloud
Last synced: 04 Jan 2025
https://github.com/marceloneppel/map-to-bigquery-structs
Tool to convert a Golang map to a struct containing fields with types like bigquery.Null*.
Last synced: 30 Jan 2025
https://github.com/markjamesbutler/dbt-fundamentals-bigquery
Implementation of dbt fundamentals training course material using BigQuery.
bigquery dbt dbt-fundamentals fundamentals jinja2 practice-tasks sql
Last synced: 16 Jan 2025
https://github.com/garbetjie/monolog-bigquery-handler
A simple Monolog handler for writing to BigQuery.
bigquery logging monolog monolog-handler
Last synced: 16 Jan 2025
https://github.com/alessio-siciliano/bigquery-advanced-utils
BigQuery-advanced-utils is a lightweight utility library that extends the official Google BigQuery Python client. It simplifies tasks like query management, data processing, and automation. Aimed at developers and data scientists, the project is open to contributions to improve and enhance its functionality.
bigquery datatransfer google-cloud python
Last synced: 01 Feb 2025
https://github.com/isaacmg/mimic_iv_bq_queries
Queries needed to recreate time series features for model training
Last synced: 21 Jan 2025
https://github.com/ngangawairimu/sales-analysis-and-customer-insights
This project features SQL queries for detailed customer and sales analysis: Customer Analysis and Sales Reporting
bigquery bigquery-dataset excel sql
Last synced: 28 Jan 2025
https://github.com/djdhairya/uber-data-analytics
Mage Vm
aiml api bigdata bigquery deep-learning docker google-maps-api ml python3 sql ssh vmware
Last synced: 07 Jan 2025
https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage
Export data from Google BigQuery via Google Storage to Yandex Cloud Object Storage.
bigquery object-storage python3 yandex-cloud yandexcloud
Last synced: 29 Dec 2024
https://github.com/abdullahasghar/sql
The repo includes all projects and assessments I have completed with SQL. IDE/s used: MS SQL Server, Google Big Query.
Last synced: 18 Jan 2025
https://github.com/pittica/google-bigquery-helpers
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform pittica
Last synced: 13 Nov 2024
https://github.com/chukwuemekaaham/ny_taxi_rides
Analytics engineering using Dbt and Google Cloud BigQuery
analytics-engineering bigquery dbt github
Last synced: 10 Jan 2025
https://github.com/toskpl/googlecloud
Challenge 30 days - GoogleCloud
bigquery google-cloud google-cloud-platform ml
Last synced: 14 Nov 2024
https://github.com/lambdamusic/dimschema
CLI to retrieve SQL schema information about the Dimensions on Google BigQuery dataset.
bigquery dimensions python scholarly-metadata
Last synced: 12 Jan 2025
https://github.com/oleksiilatypov/google_cloud
AI & Data, Google Cloud Skills Boost
bigquery document-ai ml vertexai
Last synced: 18 Jan 2025
https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020
Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).
bigquery data data-analysis data-visualization python sql tableau
Last synced: 29 Dec 2024
https://github.com/karencofre/riesgorelativo-lookerstudio
Data analysis and predictive analysis project in Looker Studio and Google Colab
bigquery data-analysis data-science machine-learning matplotlib python sklearn sql
Last synced: 22 Jan 2025
https://github.com/zenklinov/correlation-nybikers-with-weather-using-bigquery
Last synced: 22 Jan 2025
https://github.com/zborovskaanna/e-commerce-web-events-analysis
SQL project based on the Big Query public database 'The Look e-Commerce' and a dashboard in Looker Studio
analysis bigquery dashboard data-visualization looker-studio sql
Last synced: 22 Jan 2025
https://github.com/azapeti/bigquery-python-bash-automation
Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.
analytics automation bash bigquery cronjob crontab ga4 python python3
Last synced: 22 Jan 2025
https://github.com/robinnoiret/importcsv_zendeskbigquery
This project involves developing a Python script to import a CSV export from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.
bigquery connector export-csvfile json zendesk
Last synced: 22 Jan 2025
https://github.com/acardosolima/crypto-ethereum-tokens
This project aims to create a data pipeline using Airflow to ingest a dataset from Google BigQuery into a PostgreSQL database. This stack will run in a local environment using Kubernetes.
airflow bigquery postgresql python
Last synced: 22 Jan 2025
https://github.com/andre-gitdev/stocks-functions
This project is for EDA related to stock trading.
alpaca alpaca-trading-api bigquery google-cloud portfolio-optimization robinhood-api robinhood-portfolio stock-analysis stock-data stock-price-prediction stocks-api stocks-trading
Last synced: 22 Jan 2025
https://github.com/lisabensoussan/bigdata_midterm
This project focuses on analyzing Stack Overflow data related to JavaScript and Python questions using a combination of SQL queries (Google BigQuery) and Unix shell commands. The aim is to explore trends, activity patterns, and user behavior around these popular programming languages through data wrangling and querying techniques.
bigquery data-cleaning sql unix-command unix-shell
Last synced: 22 Jan 2025
https://github.com/panagiotischaviaropoulos/google-data-analytics-case-study
bigquery data-visualization sql
Last synced: 22 Jan 2025
https://github.com/thecodersstudio/node-native-test-runner
Code samples and test cases showcasing the power of Node.js's native test runner for streamlined and efficient testing.
bigquery mock nodejs nodejs-test nodenativetestrunner test
Last synced: 22 Jan 2025
https://github.com/noospheracr/twilio-segment-configs
Integration of Twilio Segment with Google BigQuery, Looker/PowerBI, and Google VertexAI to create a data-driven marketing platform
bigquery google-cloud-platform looker-studio marketing noosphera power-bi twilio-segment vertex-ai
Last synced: 22 Jan 2025
https://github.com/thanhloc81/customer-segmentation
✨ Analyze customer segments of Adventure World dataset
bigquery google-cloud powerbi sql
Last synced: 22 Jan 2025
https://github.com/jasontanx/mas-international-arrivals
Code repository about international arrivals into Malaysia
bigquery data-analytics data-engineering etl-pipeline international-arrivals
Last synced: 22 Jan 2025
https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp
This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.
big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql
Last synced: 22 Jan 2025
https://github.com/zeinhasan/etl-using-airflow
Extract Transform Load Using Airflow
Last synced: 22 Jan 2025
https://github.com/yasarsultan/olist_datawarehouse
An end-to-end data pipeline that extracts data, processes it, and then loads it into the BigQuery data warehouse.
airflow bigquery data-warehouse docker
Last synced: 22 Jan 2025
https://github.com/iht/bigquery-dataflow-cdc-example
A Dataflow streaming pipeline written in Java, reading data from Pubsub and recovering the sessions from potentially unordered data, and upserting the session data into BigQuery with no duplicates
apache-beam bigquery cdc dataflow google-cloud pubsub
Last synced: 29 Dec 2024
https://github.com/ahbiels/chatbot_analize_avaliation
A bot built in Dialogflow CX that lets the user rate a given company product. After the rating, the bot runs sentiment analysis on the user's review and stores the result (along with the review text, user name, and product) in a BigQuery dataset
bigquery chatbot dataset dialogflow dialogflow-cx documentation flask gcp google-cloud iterator language-model nlu nlu-chatbot python sql
Last synced: 22 Jan 2025
https://github.com/mutaharshaik/airflow_retail_project
Airflow retail project using pipeline with BigQuery, dbt, Soda
airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda
Last synced: 22 Jan 2025
https://github.com/victorcezeh/end-to-end-elt-pipeline
An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.
airflow bigquery dbt-core docker docker-compose postgresql python
Last synced: 22 Jan 2025
https://github.com/coatless/bigquery-reddit-ask-your-advisor
Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")
ask-your-advisor bigquery r reddit
Last synced: 16 Jan 2025
https://github.com/nikhilsree5/targetcasestudy
An exploratory and in-depth study of the e-commerce market in Brazil.
bigquery eda sql visualization
Last synced: 22 Jan 2025
https://github.com/mattwelke/charter-challenge-for-fair-voting-bot
Bot that web scrapes and logs in BigQuery the donations so far of the Charter Challenge for Fair Voting.
bigquery bot go openwhisk public-data
Last synced: 22 Jan 2025
https://github.com/crudek-data/bigquery-kaggle-apis
Kaggle API to download free datasets, along with the Google BigQuery API to read from and write to a cloud data warehouse
bigquery data-engineering kaggle
Last synced: 22 Jan 2025
https://github.com/lu-sketch/google-big-query-sql---credit-risk-analysis
Big Query SQL Credit Risk Analysis
big-data bigquery credit-risk sql
Last synced: 18 Jan 2025
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 10 Feb 2025
https://github.com/ivanildobarauna/pypi-package-stats
Project to ingest PyPI package data from BigQuery and send it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 29 Dec 2024