Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Client libraries are available for widely used languages such as Python, Java, JavaScript, and Go. BigQuery also supports federated queries, which let it read data from external sources.
📖 A highly rated, comprehensive reference is « Google BigQuery: The Definitive Guide ». Another worthwhile read is the inside story told by BigQuery’s founding product manager in an article marking its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2024-11-12 00:03:12 UTC
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 12 Oct 2024
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 28 Sep 2024
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 01 Nov 2024
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 13 Oct 2024
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 01 Nov 2024
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 12 Oct 2024
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
The ML project uses linear regression to predict the trip time of a bike rental for a new prediction system in a new mobile application. The ML datasets were collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 12 Oct 2024
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 12 Oct 2024
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 12 Oct 2024
https://github.com/poogles/pytest-bq
pytest fixtures for a local BigQuery instance, suitable for local development.
bigquery bigquery-emulator pytest
Last synced: 12 Oct 2024
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 12 Oct 2024
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 13 Oct 2024
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
DataTalks.Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 11 Oct 2024
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 13 Oct 2024
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 12 Oct 2024
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage in the context of Google Cloud Platform (GCP) based triggering of resources, applied to Cloud Functions. Both resource creation and destruction are done through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 12 Oct 2024
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions to the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/romange/puma
BigQuery-like engine for processing structured JSON-like records
Last synced: 13 Oct 2024
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 13 Oct 2024
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 15 Oct 2024
https://github.com/coatless/bigquery-reddit-ask-your-advisor
Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")
ask-your-advisor bigquery r reddit
Last synced: 11 Oct 2024
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 21 Oct 2024
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 27 Oct 2024
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 03 Aug 2024
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A Terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 27 Oct 2024
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 05 Nov 2024
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 07 Nov 2024
https://github.com/tuancamtbtx/gcp-udfs-example
Google BigQuery JavaScript UDF Examples
bigquery gcp javascript nodejs npm udf
Last synced: 09 Nov 2024
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 31 Oct 2024
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 08 Nov 2024
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job that subscribes to a Pub/Sub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 10 Nov 2024
https://github.com/tomgorb/project-template-for-production
Project template to (help) put a machine/deep learning algorithm into production
Last synced: 11 Nov 2024
https://github.com/chukwuemekaaham/uber-gcp-etl-project
Data Engineering Zoomcamp Final Project
bigquery cloud-storage csv docker-compose gcp jupyter-notebook looker-studio mageai python spark spreadsheets terraform
Last synced: 11 Nov 2024
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 12 Nov 2024
https://github.com/dataform-co/bigquery-ml-pipeline
An example of a machine learning pipeline on BigQuery ML using Dataform
bigquery bigquery-ml dataform machine-learning-pip sql
Last synced: 13 Nov 2024
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
analytics bigquery dbt engineering sql
Last synced: 13 Nov 2024
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 13 Nov 2024
https://github.com/tatamiya/new-books-notification
Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and post notifications about them to Slack
bigquery cloudrun-jobs gcs golang slack
Last synced: 13 Nov 2024
https://github.com/zeinhasan/etl-using-airflow
Extract Transform Load Using Airflow
Last synced: 13 Oct 2024
https://github.com/garbetjie/phpunit-bigquery-schema
A BigQuery schema validator constraint for PHPUnit
Last synced: 14 Oct 2024
https://github.com/nghiant3110/google_fiber_bi_5
This is a BI capstone project based on the Google Fiber dataset from the Google BI course
bigquery google-sheets looker-studio sql
Last synced: 06 Nov 2024
https://github.com/ymyzk/bq-globalip
Record the current global IPv4 address to a BigQuery table.
Last synced: 14 Oct 2024
https://github.com/nghiant3110/b2b_crm_3
This is a DA project based on the B2B Sales CRM dataset from Maven Analytics
bigquery google-sheets looker-studio sql
Last synced: 06 Nov 2024
https://github.com/nghiant3110/e_com_1
This is a DA project based on the e-commerce dataset (Thelook_ecom) in Google BigQuery
Last synced: 06 Nov 2024
https://github.com/nghiant3110/google_analytic_4
This is a DA project based on the GA4 sample dataset in BigQuery
bigquery google-analytics looker-studio sql
Last synced: 06 Nov 2024
https://github.com/mdornseif/datastore-to-bigquery
The missing Data Transfer Tool: Dump Google Cloud Datastore contents and load them into BigQuery.
backup bigquery bigquery-backup cloud datastore google
Last synced: 12 Oct 2024
https://github.com/ankita-selokar/fitbit-for-her-crafting-fitbit-s-strategy-for-women
This project analyzes smart device usage data to uncover trends and insights, guiding Fitbit by Google’s product and marketing strategies for their new women-focused product launch. It combines competitive market analysis with customer behavior insights to inform key decisions.
bigquery excel powerbi spreadsheet sql
Last synced: 23 Oct 2024
https://github.com/anpandu/ps2bq
Stream-insert GCP Pub/Sub messages into a BigQuery table.
Last synced: 12 Oct 2024
https://github.com/siriospa/gcp-helpers-bigquery
Helpers for Google Cloud BigQuery.
bigquery gcp google-cloud-platform sirio
Last synced: 12 Oct 2024
https://github.com/victorcezeh/data-engineering-final-semester-portfolio
This GitHub repository serves as a comprehensive platform for managing and showcasing my data engineering projects and assessments throughout my final semester at Alt School Africa. Designed to foster collaboration, organization, and continuous improvement, this repository is the backbone of my academic journey in data engineering.
bigquery docker gcs-bucket postgresql python
Last synced: 12 Oct 2024
https://github.com/spacepatcher/google-workspace-gmail-collector
👁 App for collecting Gmail logs from your Google Workspace account and sending them to Kafka
bigquery gmail google-workspace security soc
Last synced: 23 Oct 2024
https://github.com/davelester/gharchive-bigquery-examples
Examples Using BigQuery to Analyze GH Archive Data
Last synced: 15 Oct 2024
https://github.com/vaibhavs10/ml-on-gcp
The repository walks through a data-scientist-focused way of building and deploying machine learning models on Google Cloud
aiplatform bigquery googlecloudplatform ml
Last synced: 25 Oct 2024
https://github.com/denny-b-justin/purdue
The internship broadly aimed to understand whether topics and events are covered differently in different countries and how they affect stock market returns. The provided dataset is a post-processed set of news articles, so it already reflects topic modelling and sentiment analysis.
big-data bigquery finance gdelt-events python
Last synced: 12 Oct 2024
https://github.com/plishka/blockchain_analysis
Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)
bigquery blockchain data-cleaning scraping-websites sql tableau
Last synced: 12 Oct 2024
https://github.com/raqssoriano/hha504_assignment_nosql_dbs
This task is part of my assignment focused on creating and configuring databases in different platforms, such as GCP's BigQuery, MongoDB Atlas, and Redis Cloud.
bigquery mongodb-atlas mongodbcompass redis redisinsight
Last synced: 31 Oct 2024
https://github.com/acardosolima/crypto-ethereum-tokens
This project aims to create a data pipeline using Airflow to ingest a dataset from Google BigQuery into a PostgreSQL database. The stack runs in a local environment using Kubernetes.
airflow bigquery postgresql python
Last synced: 13 Oct 2024
https://github.com/patriciavalentine/loan-data-queries
In this project, I analyzed a vehicle loan dataset using BigQuery to identify demographic, financial, and loan patterns. Through SQL queries, I extracted insights such as credit scores and loan distribution by region, and explored high-risk profiles. The findings are visualized in Looker Studio to help inform strategic decisions.
asset-finance bigquery loan-data looker-studio
Last synced: 21 Oct 2024
https://github.com/akansharajput280799/strategic-analysis-of-retail-brand-in-south-america-using-sql
Leveraged BigQuery and MySQL to analyze 100K records for a retail brand in South America, covering sales optimization, trend identification, and customer satisfaction, and to provide insights and recommendations for growing its user base and improving its services
bigquery data-analysis data-science database database-schema google-bigquery mysql-server sql
Last synced: 21 Oct 2024
https://github.com/aazuspan/landsat-bigquery
Summarizing 51 years of Landsat data using Earth Engine and BigQuery
bigquery google-earth-engine landsat
Last synced: 12 Oct 2024
https://github.com/janmin123/cyclistic
Capstone project for Google/Coursera Data Analytics Course
analysis bigquery sql tableau visualization
Last synced: 21 Oct 2024
https://github.com/yasarsultan/taxi-trip-analysis
The NYC Taxi Trip Batch Data Pipeline automates processing of large-scale trip data using Apache Spark and Airflow, integrating AWS S3 and Google BigQuery for storage and analytics. It features scalable, containerized workflows with robust data validation.
airflow aws-s3 bash-script batch-processing bigquery data-lake data-warehouse docker python3 spark
Last synced: 12 Nov 2024
https://github.com/kmohamedalie/bigquery-intro
Coursera BigQuery Introduction using Covid19 dataset
bigquery coursera covid-19 datavisualization looker-studio sql
Last synced: 12 Oct 2024
https://github.com/mikeghen/metadata
Pulls data from Socrata open data portals
Last synced: 07 Nov 2024
https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito
This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.
bigquery data data-analysis etl-pipeline tableau
Last synced: 12 Oct 2024
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 12 Oct 2024
https://github.com/marcopellegrinoit/web-traffic-time-series-predictions
Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Additional modeling available with ARIMA, LSTM, and Facebook Prophet.
arima bigquery gcp lstm prophet-model time-series vertex-ai
Last synced: 12 Oct 2024
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 26 Oct 2024
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 12 Oct 2024
https://github.com/hckhanh/pg2bigquery
A CLI tool to convert queries from PostgreSQL to BigQuery
big bigquery google pg pgsql postgres postgres-tool postgresql postgresql-database postgressql query query-parser querybuilder sql sql-toolkit sql-tools tool toolbox toolkit utility
Last synced: 14 Oct 2024
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 12 Oct 2024
https://github.com/shahardekel/diabetes-analysis
bigquery cognos-dashboard python sql
Last synced: 31 Oct 2024
https://github.com/ket0825/v1-gcp-preview
GCP repo for the Preview service / Manage GCP src for preview services
bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub
Last synced: 12 Oct 2024
https://github.com/vidyadnina/other-sql-projects-and-queries
Other SQL projects and queries.
Last synced: 12 Oct 2024
https://github.com/sahilmb/employee-churn-da
A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret, and Colab
bigquery looker-studio pycaret
Last synced: 12 Oct 2024
https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp
This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.
big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql
Last synced: 13 Oct 2024
https://github.com/lawal-hash/olistelt
An end-to-end ELT data pipeline for the Brazilian Olist e-commerce dataset using the modern data stack
airflow bigquery dbt dbt-core docker postgresql sql
Last synced: 12 Oct 2024
https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate
This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.
bigquery data-science data-visualization spreadsheet sql tableau
Last synced: 12 Oct 2024
https://github.com/erik-ingwersen-ey/iowa_sales_forecast
Iowa Liquor Sales Forecast Model
arima bigquery bigquery-ml google-cloud sales-forecast
Last synced: 12 Oct 2024
https://github.com/kevin-rsj/real-estate-investments
Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.
bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization
Last synced: 29 Oct 2024
https://github.com/lisabensoussan/bigdataminig_finalassignment
This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.
bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization
Last synced: 12 Oct 2024
https://github.com/karencofre/marketing-segmentacion-en-powerbi
Hypothesis-testing project in Power BI and Python
bigquery google-colab powerbi python sql statsmodels
Last synced: 12 Oct 2024
https://github.com/dobsontom/basket-abandonment
Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.
adobe-campaign bigquery ga4 gcp sql
Last synced: 12 Oct 2024
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 13 Oct 2024
https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study
This capstone project was done as part of the Google Data Analytics Professional Certificate course.
bigquery data-analysis sql tableau
Last synced: 12 Oct 2024
https://github.com/tirendazacademy/hands-on-data-science-with-gcp
Google BigQuery Tutorial
big-data big-data-analytics bigdata bigquery bigquery-ml bigqueryml cloud-computing data-analysis data-analytics data-engineering data-science dataanalysis dataengineering google-bigquery google-cloud-platform machienlearning machine-learning
Last synced: 08 Nov 2024
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 30 Oct 2024
https://github.com/phstudy/zetasketch-bigquery-example
An example demonstrating how to use ZetaSketch with BigQuery
Last synced: 12 Oct 2024
https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance
Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.
bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse
Last synced: 12 Oct 2024
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 30 Oct 2024
https://github.com/mutaharshaik/airflow_retail_project
Airflow retail project using pipeline with BigQuery, dbt, Soda
airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda
Last synced: 13 Oct 2024
https://github.com/cyber-programmer/web-traffic-analytics-ml-model
This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, and exploratory data analysis.
bigquery logistic-regression machine-learning
Last synced: 12 Oct 2024