Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
![](https://explore-feed.github.com/topics/bigquery/bigquery.png)
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a “serverless architecture [that] lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes.” Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated book on the subject is “Google BigQuery: The Definitive Guide”, a comprehensive reference. Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article marking its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2025-02-10 00:03:25 UTC
https://github.com/ritu456286/smartstockai
SmartStockAI uses AI to predict inventory trends, minimize deadstock risks, and provide actionable insights through advanced models and interactive visualizations.
bigquery bigquery-ml cloud-storage cloudrun cloudsql gemini google-maps-api
Last synced: 30 Jan 2025
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 20 Dec 2024
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage in the context of Google Cloud Platform (GCP) based triggering of resources, applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 18 Jan 2025
https://github.com/fabioba/netflix-analytics
Analyze personal Netflix usage
airflow astronomer bigquery google-cloud-platform netflix tableau
Last synced: 19 Jan 2025
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 22 Jan 2025
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 17 Jan 2025
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 21 Jan 2025
https://github.com/metrics-pli/bigquery-export
Exports collected metrics to Google BigQuery
bigquery datastudio lighthouse metrics metrics-pli performance pupeteer
Last synced: 25 Jan 2025
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 24 Jan 2025
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/mehmoodulhaq570/bigquery_machine_learning_project
Developed a machine learning model to predict incident groups based on data from the London Fire Brigade service calls.
bigquery bigquery-dataset cloud database jupyter-notebook machine-learning machine-learning-algorithms ml models prediction-algorithm prediction-model python
Last synced: 22 Dec 2024
https://github.com/tupizz/fiap_pnad-covid-19
This project analyzes and transforms PNAD COVID-19 data from May to July 2020, using PySpark for large-scale data processing and BigQuery as the destination for storage and later analysis. The goal is to consolidate the monthly data into a single transformed dataset.
analysis bigquery pyspark python
Last synced: 09 Feb 2025
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 21 Jan 2025
https://github.com/rohitsanj/superset-dbt-demo
This repository contains an example project (Jaffle Shop) demonstrating integration between Superset and dbt, with BigQuery as the data warehouse.
apache-superset bigquery dbt superset
Last synced: 23 Jan 2025
https://github.com/danlessa/meta_qa
A practical one-liner metalanguage for describing common sense in a machine-friendly way.
Last synced: 08 Feb 2025
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 21 Jan 2025
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 21 Jan 2025
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 21 Jan 2025
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 31 Dec 2024
https://github.com/chandanpasunoori/event-sync
Event Sync is for syncing events from multiple sources to multiple destinations, targeted at ad hoc events, where sources support acknowledgement functionality.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 19 Dec 2024
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline to move an uploaded flat file from GCS, mask PII, store it in BigQuery, and create a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 21 Jan 2025
https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
airflow bigquery dataengineering etl gcp googlecloudplatform
Last synced: 21 Jan 2025
https://github.com/tomgorb/project-template-for-production
Project template to (help) put a machine/deep learning algorithm into production
Last synced: 09 Jan 2025
https://github.com/nghiant3110/google_analytic_4
This is a DA project based on the GA4 sample dataset on BigQuery
bigquery google-analytics looker-studio sql
Last synced: 24 Dec 2024
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 22 Dec 2024
https://github.com/windi-wulandari/pbi_kimia-farma-x-rakamin
A data-driven analytics project for Kimia Farma to evaluate business performance from 2020-2023 using BigQuery. Focused on transaction data, inventory, branch operations, and product insights. Results were visualized through an interactive dashboard to support strategic decisions and optimizations.
big-data-analytics bigquery datawarehouse googlelooker sql
Last synced: 23 Jan 2025
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 21 Jan 2025
https://github.com/antoinegiraud/dataform_hypermarche
SQL repo orchestrated by Dataform for BigQuery
Last synced: 08 Jan 2025
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 21 Nov 2024
https://github.com/shrawans007/google_cyclistic_2023
Google Data Analytics Capstone Case Study (SQL and Tableau)
big-query bigquery coursera-assignment cyclistic cyclistic-bike-share-analysis-case-study cyclistic-bikshare data-analysis data-analysis-project data-analytics data-cleaning data-combination data-exploration data-science google-data-analytics sql tableau tableau-dashboard tableau-public
Last synced: 08 Jan 2025
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 18 Dec 2024
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job that subscribes to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 07 Jan 2025
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 17 Nov 2024
https://github.com/zkan/running-bigquery-query-from-airflow-using-bigqueryexecuteoperator
Running BigQuery Query from Airflow using BigQueryExecuteOperator
Last synced: 19 Dec 2024
https://github.com/yaph/queries
Collection of Data Queries in SPARQL and SQL
bigquery data-mining dbpedia openstreetmap osm queries sparql sql stackoverflow wikidata
Last synced: 08 Jan 2025
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 07 Feb 2025
https://github.com/i62navpm/vue-front-app
Front app + Google cloud tools
appengine bigquery firebase firestore-database google-cloud google-functions google-storage puppeteer
Last synced: 17 Jan 2025
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 21 Nov 2024
https://github.com/ajaxbarcelonacruyff/gcp_cost
Monitoring Google Cloud costs with Looker Studio.
bigquery googlecloud googlecloudplatform lookerstudio
Last synced: 25 Dec 2024
https://github.com/ajaxbarcelonacruyff/ga4_bigquery_session_source
Creating GA4 session references in BigQuery.
Last synced: 25 Dec 2024
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 21 Dec 2024
https://github.com/matt-strautmann/dbt-bigquery-ecommerce-quickstart
Welcome to the dbt-BigQuery Quickstart Project! 🎉 This repository is designed as a hands-on guide to help you build a modern data stack leveraging powerful tools like Airbyte for ingestion, dbt for transformation, and BigQuery for storage and analytics.
bigquery dbt e-commerce quickstarts
Last synced: 17 Jan 2025
https://github.com/nais/bqrator
Operator for creating BigQuery datasets
bigquery bigquery-operator kubernetes kubernetes-operator nais-features
Last synced: 04 Feb 2025
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 22 Jan 2025
https://github.com/pawel045/big-tech-stocks
ETL project
big-data bigquery dataengineering etl
Last synced: 05 Feb 2025
https://github.com/samanthalang/samanthalang_portfolio
A data analyst with a consumer's perspective and a marketer's strategy.
bigquery excel figma mysql notebook numpy pandas postgresql powerbi powerquery python sql sqlite wordpress
Last synced: 05 Feb 2025
https://github.com/digitaloptimizationgroup/digitaloptgroup-r-notebooks
A collection of R notebooks to analyze data from the Digital Optimization Group Platform
ab-testing bigquery jupyter-notebook performance-analysis r web-analytics
Last synced: 21 Jan 2025
https://github.com/francois-lenne/play-bq-gcp
Data pipeline that retrieves data from the PlayStation API and loads it into BigQuery
bigquery cicd data-engineering google-cloud python
Last synced: 13 Jan 2025
https://github.com/ruru-lyy/nyc-taxi-service-pipeline
In this project, I built a data pipeline using Mage.ai for ETL, GCP for storage, BigQuery for querying, and Looker Studio for analytics. This project helped me learn how to process, store, and visualize data effectively using modern tools.
bigquery data-engineering data-modeling etl-pipeline looker mage-ai python
Last synced: 23 Jan 2025
https://github.com/edumoraes1/spam_count_sfmc
SQL query counting email sends and spam over the last 365 days
bigquery marketing-cloud salesforce sql
Last synced: 31 Dec 2024
https://github.com/yoshiyukikato/nightharbor-bigquery-reporter
A nightharbor reporter for GCP BigQuery
Last synced: 23 Jan 2025
https://github.com/victorcezeh/data-engineering-final-semester-portfolio
This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.
bigquery docker gcs-bucket postgresql python
Last synced: 17 Nov 2024
https://github.com/vidyadnina/cyclistic-sql-tableau-project
Trip data analysis for a bike-sharing service company using SQL and Tableau.
bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql
Last synced: 21 Jan 2025
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 14 Dec 2024
https://github.com/lupusruber/music_analytics
This project processes real-time music event data using Kafka, Apache Spark on Google Cloud Dataproc, and stores the transformed data in BigQuery for analytics, all orchestrated by Airflow and managed with Terraform.
bigquery data-proc dimensional-modeling gcp-project kafka spark-structured-streaming
Last synced: 02 Feb 2025
https://github.com/drvipulasharma/e-commerce-data-analysis-sql-big---query
E-Commerce-Data-Analysis-SQL-Big-Query
Last synced: 23 Jan 2025
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 18 Dec 2024
https://github.com/xennis/particulate-matter-sensor-storage
Store the particulate matter data from a luftdaten.info sensor in BigQuery
bigquery cloud-function luftdaten particulate-matter sensor-data
Last synced: 18 Nov 2024
https://github.com/lixx21/airflow-dbt-gcp
A comprehensive data pipeline leveraging Airflow, DBT, Google Cloud Platform (GCP), and Docker to extract, transform, and load data seamlessly from a staging layer to a data warehouse and data mart.
airflow bigquery data-engineer dbt gcp
Last synced: 29 Jan 2025
https://github.com/oguzgn/fully-automated-performance-marketing-dashboard
This project integrates data from multiple ad platforms with Google Analytics to track marketing campaigns. It uses a structured naming system and UTM tags. Data is visualized in Looker Studio dashboards to analyze campaign performance and ad spend.
bigquery data-analysis data-engineering data-modeling marketing-analytics marketing-automation marketing-data-science marketingdata sql
Last synced: 29 Jan 2025
https://github.com/gabrieladados/people-analytics
People Analytics: Insights for Talent Retention
bigquery figma people-analytics sql tableau
Last synced: 29 Jan 2025
https://github.com/pratshrestha/cochin-traders---sql--sales-analysis
Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include
bigquery customer-engagement employee-performance inventory-management sales-trends sql
Last synced: 21 Jan 2025
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 25 Dec 2024
https://github.com/brpy/nyc-trips
Data engineering | Zoomcamp journey on nyc trip data with gcp stack
Last synced: 22 Dec 2024
https://github.com/ayresgneto/use-case-gcp-etl
ELT pipeline on GCP. Technologies used: PostgreSQL, GCP Storage, Airflow (local), PySpark (local), BigQuery
airflow big-data bigquery data data-engineering etl gcp pipeline postgresql programming-oriented-object pyspark python spark
Last synced: 21 Jan 2025
https://github.com/tosh2230/cdc-rds-bq
Change data capture from Amazon RDS to Google BigQuery
bigquery changedatacapture rds
Last synced: 21 Jan 2025
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 21 Jan 2025
https://github.com/codingsancho/fastapi-bigquery
Learning exercise: Python backend with FastAPI and BigQuery, React-JS frontend.
bigquery fastapi javascript python react
Last synced: 20 Dec 2024
https://github.com/adindasarianti/rakamin_kf_analytics
This repository contains my project as a Big Data Analytics intern at Kimia Farma, where I analyzed the performance of Kimia Farma from 2020 to 2023
bigquery dataanalytics lookerstudio
Last synced: 02 Jan 2025
https://github.com/phukon/package-insights
PyPI package reports and insights. The data was ingested from a publicly available source using BigQuery and then transformed.
Last synced: 27 Jan 2025
https://github.com/dobsontom/basket-abandonment
Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.
adobe-campaign bigquery ga4 gcp sql
Last synced: 21 Nov 2024
https://github.com/niteshchawla/nc-sql-business-case
A Leading Retail chain brand and a prominent retailer in the United States. It makes itself a preferred shopping destination by offering outstanding value, inspiration, innovation and an exceptional guest experience that no other retailer can deliver.
bigquery retail sql supermarket
Last synced: 21 Jan 2025
https://github.com/erik-ingwersen-ey/iowa_sales_forecast
Iowa Liquor Sales Forecast Model
arima bigquery bigquery-ml google-cloud sales-forecast
Last synced: 21 Nov 2024
https://github.com/hrialan/dataform-prune
An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.
automation bigquery data-analytics dataform
Last synced: 21 Nov 2024
https://github.com/sahilmb/employee-churn-da
A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret and Colab
bigquery looker-studio pycaret
Last synced: 21 Nov 2024
https://github.com/ket0825/v1-gcp-preview
GCP repo for the Preview service / Manage GCP src for preview services
bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub
Last synced: 21 Nov 2024
https://github.com/ka-zo/booking-data-analysis
Booking data analysis
airline-booking apache-beam bigquery google-cloud looker-studio python3
Last synced: 10 Feb 2025
https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito
This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.
bigquery data data-analysis etl-pipeline tableau
Last synced: 21 Nov 2024
https://github.com/themihirmathur/uber-data-analytics
The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).
bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python
Last synced: 21 Nov 2024
https://github.com/thanhloc81/sql-project-bicycles-practise
✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules
adventureworks bicycle bigquery google-cloud sql
Last synced: 21 Jan 2025
https://github.com/davidkhala/gcp-collections
Notebooks for GCP services
bigquery bq databricks datastore firestore google-cloud-platform
Last synced: 21 Nov 2024
https://github.com/abdelnaem2002/ecommerce-analysis-dbt
Ecommerce Analysis Using dbt
bigquery dbt dbt-cloud github looker-studio sql
Last synced: 29 Jan 2025
https://github.com/tharun2806/end-to-end-internship-data-analysis
Internship Dataset Analysis is an end-to-end project analyzing an internship dataset obtained from Kaggle. The project involves cleaning and preprocessing the data using Excel and SQL, followed by exploratory data analysis (EDA). The analysis includes statistical, sectoral and geospatial insights, visualized through an interactive Tableau dashboard
bigquery data-analysis data-cleaning data-preprocessing data-visualization exploratory-data-analysis geospatial-analysis microsoft-excel reporting sectoral-analysis statistical-analysis tableau-public
Last synced: 07 Feb 2025
https://github.com/newtonmunene99/sec-filings
Simple Go app that crawls SEC EDGAR filings and loads indices into Google BigQuery
bigquery cloudstorage gcp golang
Last synced: 21 Jan 2025
https://github.com/ivanildobarauna/ivanildobarauna
Special Repository to Make README
ai airflow big-data bigquery data-engineering gcp python
Last synced: 22 Jan 2025
https://github.com/juldrixx/bigquery-avro-schema-converter
Website to convert a schema from one format to another between BigQuery and Avro
avro avro-schema bigquery bigquery-schema converter schema
Last synced: 22 Jan 2025
https://github.com/branb97/jobstreet-data-eng-project
Building a data pipeline to deliver job listing data from Jobstreet for analysis.
airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql
Last synced: 22 Jan 2025
https://github.com/ivdatahub/pypi-package-stats
Project that ingests PyPI package data from BigQuery and sends it to Datadog for analysis and insights with dashboards, monitors, and more
bigquery cloud data-engineering data-warehouse gcp software-engineering
Last synced: 21 Nov 2024
https://github.com/push-protocol/push-google-bigquery
The Power of Web3 Big Data: A Guide to Using Google BigQuery and Push Protocol for Data Communication and Analysis
bigquery data push push-notifications web3
Last synced: 31 Jan 2025
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 10 Feb 2025