Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery lets companies handle large amounts of data without managing infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data from external sources.
📖 A highly rated book on the subject is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another worthwhile read is the inside story of BigQuery, told by its founding product manager in an article marking its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2024-11-15 00:03:21 UTC
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 07 Nov 2024
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 28 Sep 2024
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 13 Oct 2024
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 31 Oct 2024
https://github.com/romange/puma
BigQuery-like engine for processing structured JSON-like records
Last synced: 13 Oct 2024
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 27 Oct 2024
https://github.com/tomgorb/project-template-for-production
project template to (help) put a Machine/Deep learning algorithm into production
Last synced: 11 Nov 2024
https://github.com/icarusso/bigqueryexporter
Export query data from Google BigQuery to a local machine
Last synced: 12 Oct 2024
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 13 Oct 2024
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 13 Oct 2024
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 12 Oct 2024
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 01 Nov 2024
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
This ML project uses linear regression to predict the trip time of a bike rental for a new prediction system in a new mobile application. The ML datasets were collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 12 Oct 2024
https://github.com/alterra-greeve/de-capstone
Capstone Project SIB Batch 6 x Alterra Academy - Data Engineer
bigquery cloud-function data-engineering docker googlefirebase looker-studio python
Last synced: 12 Oct 2024
https://github.com/chukwuemekaaham/uber-gcp-etl-project
Data Engineering Zoomcamp Final Project
bigquery cloud-storage csv docker-compose gcp jupyter-notebook looker-studio mageai python spark spreadsheets terraform
Last synced: 11 Nov 2024
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 01 Nov 2024
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 03 Aug 2024
https://github.com/tuancamtbtx/gcp-udfs-example
Google BigQuery Javascript UDF Function Examples
bigquery gcp javascript nodejs npm udf
Last synced: 09 Nov 2024
https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl
Google Cloud Function that runs an ETL job to collect Stripe API data and transform it for import into BigQuery.
bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api
Last synced: 15 Nov 2024
https://github.com/analyticace/data-engineering-projects
Collection of Open Source Data Engineering Projects
aws big-data bigquery data docker engineering etl oracle-database pipeline sql
Last synced: 05 Nov 2024
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 12 Nov 2024
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement functionality.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 15 Oct 2024
https://github.com/dataform-co/bigquery-ml-pipeline
An example of a machine learning pipeline on BigQuery ML using Dataform
bigquery bigquery-ml dataform machine-learning-pip sql
Last synced: 13 Nov 2024
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 12 Oct 2024
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images, saving the resulting files to a GCS bucket and the data to BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 08 Nov 2024
https://github.com/stkchan/web-scraping-with-selenium
bigquery pandas python selenium-webdriver webscraping
Last synced: 12 Oct 2024
https://github.com/jaehyeon-kim/dbt-cicd-demo
DBT CI/CD Demo
bigquery cicd dataengineering dbt gcp github-actions
Last synced: 13 Oct 2024
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 08 Nov 2024
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
analytics bigquery dbt engineering sql
Last synced: 13 Nov 2024
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 13 Nov 2024
https://github.com/tatamiya/new-books-notification
Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and post notifications about them to Slack
bigquery cloudrun-jobs gcs golang slack
Last synced: 13 Nov 2024
https://github.com/pedrocarmona/big_query_adapter
An ActiveRecord Google BigQuery adapter
activerecord bigquery gem ruby-on-rails
Last synced: 13 Oct 2024
https://github.com/sigpwned/jdbq
JDBI-inspired Database Access Framework for Java + BigQuery
bigquery data-access-framework data-access-layer data-access-library data-lake java persistence persistence-framework persistence-layer
Last synced: 12 Oct 2024
https://github.com/pmhalvor/whale-speech
A pipeline to map whale sightings to hydrophone audio
beam bigquery gcs mle model-as-a-service python tensorflow2
Last synced: 21 Oct 2024
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/miguelapp10/api_simpliroute_urbano
Extract data from the SimpliRoute and Urbano API for a specific date range and process it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 12 Oct 2024
https://github.com/rsachdeva/illuminatingdeposits-gcp-trigger
Terraform usage on Google Cloud Platform (GCP) for trigger-based resources applied to Cloud Functions. Both resource creation and destruction are handled through Terraform.
bigquery bigquery-table cloud-events functions-framework gcp go golang golangci-lint google-cloud google-cloud-function-pubsub-trigger google-cloud-functions google-cloud-pubsub google-cloud-sdk google-cloud-storage google-cloud-terraform sendgrid terraform
Last synced: 12 Oct 2024
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 12 Oct 2024
https://github.com/essien1990/etl_pipeline_airflow
Creating pipelines using Python 3 and Apache Airflow to load tables into a Google BigQuery data warehouse
airflow airflow-dags airflow-operators bash bigquery bq datawarehouse etl-pipeline python3
Last synced: 12 Oct 2024
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 12 Oct 2024
https://github.com/phstudy/zetasketch-bigquery-example
An example demonstrating how to use ZetaSketch with BigQuery
Last synced: 12 Oct 2024
https://github.com/acardosolima/crypto-ethereum-tokens
This project aims to create a data pipeline using Airflow to ingest a dataset from Google BigQuery into a PostgreSQL database. The stack runs in a local environment using Kubernetes.
airflow bigquery postgresql python
Last synced: 13 Oct 2024
https://github.com/aazuspan/landsat-bigquery
Summarizing 51 years of Landsat data using Earth Engine and BigQuery
bigquery google-earth-engine landsat
Last synced: 12 Oct 2024
https://github.com/antbit96/dataform_poc
Template for basic data preparation
bigquery bigquery-dataform data-preparation
Last synced: 26 Oct 2024
https://github.com/vaishnavipaithane/bellabeat-data-analysis-case-study
This capstone project was done as part of the Google Data Analytics Professional Certificate course.
bigquery data-analysis sql tableau
Last synced: 12 Oct 2024
https://github.com/mlund2k/project-1-baseball-performance-vs.-attendance
Project assets for my first exploratory data analysis: Baseball Performance vs. Attendance.
bigquery data-analysis data-cleaning data-visualization excel rstudio sql tableau tidyverse
Last synced: 12 Oct 2024
https://github.com/mutaharshaik/airflow_retail_project
Airflow retail project using pipeline with BigQuery, dbt, Soda
airflow astro-cli astro-sdk bigquery datamodeling dbt docker etl-pipeline gcp snowflake soda
Last synced: 13 Oct 2024
https://github.com/cyber-programmer/web-traffic-analytics-ml-model
This Jupyter Notebook focuses on classifying website visitors using logistic regression. The project leverages Google Analytics sample data and BigQuery for data analysis and feature engineering. It provides a comprehensive workflow that includes data import, preprocessing, exploratory data analysis.
bigquery logistic-regression machine-learning
Last synced: 12 Oct 2024
https://github.com/nghiant3110/firebase_6
This is a DA project based on the Firebase Sample dataset on Big Query
bigquery firebase looker-studio sql
Last synced: 12 Oct 2024
https://github.com/prathmeshyelne/etl-pipeline-for-employee-data-using-data-fusion-airflow
This repository contains code and configuration files for an Extract, Transform, Load (ETL) project using Google Cloud Data Fusion for data extraction, Apache Airflow/Composer for orchestration, and Google BigQuery for data loading.
airflow bigquery dataengineering etl gcp googlecloudplatform
Last synced: 12 Oct 2024
https://github.com/vidyadnina/cyclistic-sql-tableau-project
Trip data analysis for a bike-sharing service company using SQL and Tableau.
bigquery dashboard data-analysis data-analytics-sql data-cleaning data-visualization sql
Last synced: 12 Oct 2024
https://github.com/kevin-rsj/real-estate-investments
Scoring system that ranks French cities for second-home investment by risk profile (high, moderate, and low). It evaluates key ratios in areas such as demand, availability, infrastructure, demographics, and prices.
bigquery data-analytics looker-studio numpy pandas python sklearn-library sql visualization
Last synced: 29 Oct 2024
https://github.com/vikasgupta1812/google_cloudml_scripts
https://goo.gl/dFjFQf
bigquery google-cloud-ml python-language tensorflow-tutorial
Last synced: 26 Sep 2024
https://github.com/khanovico/energy-data-analysis
A cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. I built a Docker image so the app can run in cross-platform environments.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 10 Oct 2024
https://github.com/moeabbas6/dbt_analytics_engine
An end-to-end project using dbt to demonstrate data transformations, testing, and visualization with Google BigQuery and Looker Studio. It showcases a complete data pipeline from extraction/generation to deployment.
analytics-engineering bigquery data data-pipeline data-transformation data-visualization dbt testing
Last synced: 12 Oct 2024
https://github.com/kartikeya443/automated-data-pipeline-gcp
This project showcases the integration of various Google Cloud Platform services to build an efficient and automated data pipeline for sales data.
bigquery cloud data-engineering flask gcp google-cloud-platform looker-studio pipeline python sql
Last synced: 12 Oct 2024
https://github.com/mchmarny/stocker
Using Twitter sentiment and stock market price signal correlation to predict the next day's closing price
bigquery ml prediction regression-models
Last synced: 08 Nov 2024
https://github.com/yu-iskw/bigquery-lineage
Visualize BigQuery data lineage graph
bigquery data-governance data-management visualization
Last synced: 30 Oct 2024
https://github.com/yeha98555/google-maps-analysis-pipeline
Taiwan Travel Attractions Analysis Data Pipeline
airflow bigquery cloudfunctions docker gcp gcs googlemaps googlesheets python terraform
Last synced: 29 Sep 2024
https://github.com/mehmoodulhaq570/bigquery_machine_learning_project
This project develops a machine learning model to predict incident groups based on data from the London Fire Brigade service calls. Using Python and the Google Colab environment, the model utilizes a Gradient Boosting Classifier to categorize incidents, improving resource allocation and incident response within the London Fire Brigade.
bigquery bigquery-dataset cloud colabs database database-project google-colab ipnyb jupyter-notebook machine-learning prediction-algorithm prediction-model python
Last synced: 05 Nov 2024
https://github.com/anyesh/gbq-helpers
GBQ related helper functions and snippets.
Last synced: 12 Nov 2024
https://github.com/brpy/nyc-trips
Data engineering | Zoomcamp journey on nyc trip data with gcp stack
Last synced: 05 Nov 2024
https://github.com/walterowisk/sql-learn
SQL Learning
beekeeper bigquery data-analytics dbeaver mysql sql
Last synced: 15 Nov 2024
https://github.com/hayashi-yudai/cloudfunc_login
Example of authentication function for login with Cloud Functions and BigQuery
bigquery gcp-cloud-functions golang server
Last synced: 15 Nov 2024
https://github.com/night-fury-me/real-time-vehicle-data-processing
A repository that contains implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.
bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing
Last synced: 13 Oct 2024
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 31 Oct 2024
https://github.com/knands42/data-ingestion
Data Ingestion project to evaluate my Kotlin skill using concurrency
bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow
Last synced: 31 Oct 2024
https://github.com/alexgenovese/machine-learning-bigquery-gcp
These SQL queries are based on an available ecommerce dataset containing millions of Google Analytics records for the Google Merchandise Store, loaded into BigQuery.
bigquery google google-cloud-platform purchase sql visitors
Last synced: 07 Nov 2024
https://github.com/scraly/flume-bigquery-sink
An Apache Flume Sink implementation to publish data to Google BigQuery
Last synced: 06 Nov 2024
https://github.com/scraly/bigquery
Google BigQuery AaaS tools, tips and fun
Last synced: 06 Nov 2024
https://github.com/squidmin/java11-spring-gradle-bigquery-reference
Java v11 ⋅ Spring v2 ⋅ Gradle ⋅ BigQuery
bigquery gradle gradle-java java java-gradle java11 java11-spring-boot spring spring-boot-2 spring-mvc spring-rest
Last synced: 13 Oct 2024
https://github.com/victorcezeh/end-to-end-elt-pipeline
An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.
airflow bigquery dbt-core docker docker-compose postgresql python
Last synced: 13 Oct 2024
https://github.com/galois1915/google-ml-engineer
This program provides the skills you need to advance your career and provides training to support your preparation for the industry-recognized Google Cloud Professional Machine Learning Engineer certification.
api automl bigquery keras mlops-workflow tensorflow2 vertex-ai
Last synced: 13 Oct 2024
https://github.com/prashhhant213/strategic-analysis-of-retail-brand-in-south-america-using-sql
Used BigQuery and MySQL to analyze 100K records for a retail brand in South America, covering sales optimization, trend identification, and customer satisfaction, and to provide insights and recommendations for growing its user base and improving its services.
bigquery database mysql-server sql
Last synced: 07 Nov 2024
https://github.com/panagiotischaviaropoulos/google-data-analytics-case-study
bigquery data-visualization sql
Last synced: 13 Oct 2024
https://github.com/juldrixx/bigquery-avro-schema-converter
Website to convert a schema from one format to another between BigQuery and Avro
avro avro-schema bigquery bigquery-schema converter schema
Last synced: 13 Oct 2024
https://github.com/yandex-cloud-examples/yc-bigquery-to-object-storage
Export data from Google BigQuery via Google Storage to Yandex Cloud Object Storage.
bigquery object-storage python3 yandex-cloud yandexcloud
Last synced: 07 Nov 2024
https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020
Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).
bigquery data data-analysis data-visualization python sql tableau
Last synced: 07 Nov 2024
https://github.com/djdhairya/uber-data-analytics
Mage Vm
aiml api bigdata bigquery deep-learning docker google-maps-api ml python3 sql ssh vmware
Last synced: 10 Nov 2024
https://github.com/themihirmathur/uber-data-analytics
The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).
bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python
Last synced: 12 Oct 2024
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 13 Oct 2024
https://github.com/azapeti/bigquery-python-bash-automation
Since you're using the free version, you can only get data from your website through the Google Analytics API for the last 60 days. I would like to demonstrate in this repository how to run BigQuery queries in Python and automate it using bash and crontab for collecting historical data.
analytics automation bash bigquery cronjob crontab ga4 python python3
Last synced: 13 Oct 2024
https://github.com/andre-gitdev/stocks-functions
This project is for EDA related to stock trading.
alpaca alpaca-trading-api bigquery google-cloud portfolio-optimization robinhood-api robinhood-portfolio stock-analysis stock-data stock-price-prediction stocks-api stocks-trading
Last synced: 13 Oct 2024
https://github.com/edumoraes1/spam_count_sfmc
SQL query counting email sends and spam over the last 365 days
bigquery marketing-cloud salesforce sql
Last synced: 08 Nov 2024
https://github.com/manesioz/airflow-without-code
Dynamically generate DAGs to ingest SQL files into BigQuery with one line of "code"
airflow airflow-plugin bigquery python sql
Last synced: 09 Nov 2024
https://github.com/karencofre/riesgorelativo-lookerstudio
Data analysis and predictive analysis project in Looker Studio and Google Colab
bigquery data-analysis data-science machine-learning matplotlib python sklearn sql
Last synced: 13 Oct 2024
https://github.com/ivanildobarauna/ivanildobarauna
Special Repository to Make README
ai airflow big-data bigquery data-engineering gcp python
Last synced: 13 Oct 2024
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin to post BigQuery query results
Last synced: 12 Oct 2024
https://github.com/davidkhala/gcp-collections
Notebooks for GCP services
bigquery bq databricks datastore firestore google-cloud-platform
Last synced: 12 Oct 2024