Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
BigQuery
Google BigQuery enables companies to handle large amounts of data without having to manage infrastructure. Google’s documentation describes it as a « serverless architecture (that) lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery’s scalable, distributed analysis engine lets you query terabytes in seconds and petabytes in minutes. » Its client libraries support widely used languages such as Python, Java, JavaScript, and Go. Federated queries are also supported, so BigQuery can read data directly from external sources.
📖 A highly rated canonical book on it is « Google BigQuery: The Definitive Guide », a comprehensive reference. Another enriching read on the subject is the inside story told in the article by the founding product manager of BigQuery celebrating its 10th anniversary.
- GitHub: https://github.com/topics/bigquery
- Wikipedia: https://en.wikipedia.org/wiki/BigQuery/
- Repo: https://github.com/GoogleCloudPlatform/bigquery-utils/
- Released: May 19, 2010
- Related Topics: cloud-computing
- Aliases: bq
- Last updated: 2024-11-15 00:03:21 UTC
https://github.com/elithrar/finding-bugs-with-bigquery
A talk on using BigQuery, the GitHub Public Data & some elbow grease to find bugs in OSS projects.
big-data bigquery bugs github golang open-source
Last synced: 13 Oct 2024
https://github.com/chandanpasunoori/event-sync
Event Sync syncs events from multiple sources to multiple destinations. It is targeted at ad hoc events where the sources support acknowledgement.
bigquery golang-tools google-cloud-platform pubsub
Last synced: 15 Oct 2024
https://github.com/tuancamtbtx/gcp-udfs-example
Google BigQuery JavaScript UDF Examples
bigquery gcp javascript nodejs npm udf
Last synced: 09 Nov 2024
https://github.com/ostrokach/uniparc_xml_parser
UniParc dataset describing ~300 million protein sequences converted into relational tables accessible through Google BigQuery (and as Parquet files).
bigquery bioinformatics csv-files parquet-files protein-domains protein-sequences
Last synced: 12 Oct 2024
https://github.com/misicode/Kaggle-Intro_to_SQL
Solutions of the exercises from the course "Introduction to SQL with BigQuery" by @Kaggle.
bigquery kaggle kaggle-intro-to-sql sql
Last synced: 23 Oct 2024
https://github.com/anilkhichar/bq-table-copy-automation
Copy a table from one dataset to another in Google BigQuery using a Bash script
automation bash bash-script big-query bigquery bigquery-cp gcp google
Last synced: 07 Nov 2024
https://github.com/benitomartin/benitomartin
Personal profile 😎
anaconda artificial-intelligence aws bash-script bigquery data-science gcp lambda-functions large-language-models linux machine-learning python pytorch retrieval-augmented-generation sagemaker scikit-learn tensorflow terraform
Last synced: 08 Nov 2024
https://github.com/teraearlywine/sample_sql
The following repo contains samples of SQL code that can be referenced by future clients or employers.
Last synced: 12 Oct 2024
https://github.com/cch0/data-engineering-zoomcamp-2024-project
2024 project
bigquery cicd cloud-storage-application cloudstorage gcp mage pipelines terraform
Last synced: 01 Nov 2024
https://github.com/seandavi/aisr-data-warehouse
Animal Image Shared Resource PACS/Viewer
api bigquery clinical-information-system dicom dicom-files gcp image-analysis pacs radiology
Last synced: 05 Nov 2024
https://github.com/sigpwned/jdbq
JDBI-inspired Database Access Framework for Java + BigQuery
bigquery data-access-framework data-access-layer data-access-library data-lake java persistence persistence-framework persistence-layer
Last synced: 12 Oct 2024
https://github.com/ackeecz/terraform-gcp-dataflow_pubsub_to_bq
Dataflow job that subscribes to a PubSub subscription. It takes messages from the subscription and pushes them into a BigQuery table.
bigquery dataflow pubsub terraform-module
Last synced: 10 Nov 2024
https://github.com/coatless/bigquery-reddit-ask-your-advisor
Analysis code that counts instances of a phrase on Reddit (e.g. "ask your advisor")
ask-your-advisor bigquery r reddit
Last synced: 11 Oct 2024
https://github.com/justinjsd/analytics-engineer-bootcamp
This repository serves as a collection of my work and learnings throughout the bootcamp, focusing on developing skills in analytics engineering, particularly using dbt.
analytics bigquery dbt engineering sql
Last synced: 05 Nov 2024
https://github.com/kellyjadams/bigquery-python-weekly-report
A script to automate a weekly report that runs BigQuery in Python.
Last synced: 13 Oct 2024
https://github.com/greenpeace/gpes-old-en-petitions-api-emulator
Emulates the deprecated EN petition's API. Useful if you have legacy microsites with petitions.
bigquery mysql petitions sqlite3
Last synced: 03 Aug 2024
https://github.com/misszeferino/sql-projects
bigquery data-analysis mysql queries sql sqlite3
Last synced: 12 Oct 2024
https://github.com/alimarzouk/paris-aq
ELTL pipeline to monitor air quality in the Paris Île-de-France area
airflow airquality big-data bigquery dataengineering gcs spark
Last synced: 13 Oct 2024
https://github.com/tomgorb/project-template-for-production
Project template to (help) put a machine/deep learning algorithm into production
Last synced: 11 Nov 2024
https://github.com/justinbeckwith/bisquick
🥞Synchronize your GitHub issues with BigQuery. Do neat stuff.
Last synced: 01 Nov 2024
https://github.com/yu-iskw/terraform-google-copy-bq-datasets
A terraform module to copy BigQuery datasets across regions
bigquery data-engineering google-cloud terraform
Last synced: 27 Oct 2024
https://github.com/nguyendangxuanlinh/newyorkbike-rental-trip-time-prediction-model-googlebigquery
This ML project uses linear regression to predict the trip time of a bike rental for a new prediction system in a new mobile application. The ML datasets were collected and stored in a BigQuery public dataset.
bigquery linear-regression machine-learning
Last synced: 12 Oct 2024
https://github.com/gdbecker/dbtlabslearning
Learn the foundational steps of transforming data in dbt Cloud. Start by connecting dbt Cloud to a data warehouse and Git repository, then explore key concepts like modeling, sources, testing, documentation, and deployment. Get hands-on by building a model and running tests in dbt Cloud.
analytics-engineering bigquery dbt dbt-cloud jinja macros models packages sql testing
Last synced: 13 Oct 2024
https://github.com/chukwuemekaaham/uber-gcp-etl-project
Data Engineering Zoomcamp Final Project
bigquery cloud-storage csv docker-compose gcp jupyter-notebook looker-studio mageai python spark spreadsheets terraform
Last synced: 11 Nov 2024
https://github.com/mlabarrere/pygquery
🐷 Multithread your data with Google BigQuery
bigquery dataframe google-bigquery multithreading pandas python
Last synced: 12 Oct 2024
https://github.com/esanchezros/bigquery-maven-plugin
Maven plugin for managing BigQuery datasets, tables and views
bigquery java maven maven-plugin
Last synced: 28 Sep 2024
https://github.com/miguelapp10/api_simpliroute_urbano
Extracts data from the SimpliRoute and Urbano API for a specific date range and processes it for analysis and storage in Google BigQuery
api-client bigquery pandas python
Last synced: 12 Oct 2024
https://github.com/morphl-ai/morphl-model-publishers-churning-users-bigquery
BigQuery connector, pre-processor and model for predicting churning users for digital publishers using Google Analytics 360
bigquery google-analytics machine-learning morphl-platform pipeline preprocessor pyspark
Last synced: 12 Nov 2024
https://github.com/squidmin/java17-spring-gradle-bigquery-reference
Java v17 ⋅ Spring v3 ⋅ Gradle ⋅ BigQuery
bigquery gradle java java-17-gradle java17 java17-spring-boot spring-boot-3
Last synced: 27 Oct 2024
https://github.com/dataform-co/bigquery-ml-pipeline
An example of a machine learning pipeline on BigQuery ML using Dataform
bigquery bigquery-ml dataform machine-learning-pip sql
Last synced: 13 Nov 2024
https://github.com/shinichi-takii/atom-language-sql-bigquery
BigQuery SQL language support in Atom
atom atom-package bigquery grammar snippets sql syntax-highlighting
Last synced: 31 Oct 2024
https://github.com/chukwuemekaaham/data-engineering-zoomcamp
Datatalks Club Free Data Engineering Zoomcamp Project
bigquery dbt docker-compose duckdb gcp gcp-cloud-storage github-actions jupyter-notebook kafka linux looker-studio mageai pandas postgresql prefect python redpanda risingwave spark terraform
Last synced: 11 Oct 2024
https://github.com/mattwelke/packt-book-bot
Bot that tweets and logs the Packt free eBook of the day in BigQuery daily.
bigquery bot ebooks ibm-cloud-functions java openwhisk
Last synced: 13 Oct 2024
https://github.com/justinjsd/analytics-engineering
📊 A repository focusing on analytics engineering, particularly using dbt on the Northwind Sample dataset
analytics bigquery dbt engineering sql
Last synced: 13 Nov 2024
https://github.com/samedhi/gaend
Convert GAE Models into endpoints
bigquery elasticsearch google-app-engine restful taskqueue
Last synced: 13 Nov 2024
https://github.com/tatamiya/new-books-notification
Fetch new books from [版元ドットコム](https://www.hanmoto.com/) and post notifications about them to Slack
bigquery cloudrun-jobs gcs golang slack
Last synced: 13 Nov 2024
https://github.com/poogles/pytest-bq
pytest fixtures for a local BigQuery instance, suitable for local development.
bigquery bigquery-emulator pytest
Last synced: 12 Oct 2024
https://github.com/fpopic/bigquery-schema-select
(Script) Generates SQL query that selects all fields (recursively for nested fields) from the provided BigQuery schema file.
bigquery bigquery-schema scala sql
Last synced: 12 Oct 2024
https://github.com/xlfe/pyjdbq
The easiest way to ship journald logs to Google BigQuery
bigquery journald journald-logs logging security
Last synced: 12 Oct 2024
https://github.com/moh-ayman/stripeapi-to-bq---cfunc-etl
Google Cloud Function that runs an ETL job to collect Stripe API data and transform it for import into BigQuery.
bigquery dataengineering etl-pipeline gcp gcp-cloud-functions pandas-dataframe python stripe-api
Last synced: 15 Nov 2024
https://github.com/mchmarny/sbomer
Generates daily SBOM and vulnerability reports for container images, saving the resulting files to a GCS bucket and the data to BigQuery tables.
bigquery gcp gcs grype report sbom syft vex vulnerability
Last synced: 08 Nov 2024
https://github.com/hayashi-yudai/cloudfunc_login
Example of authentication function for login with Cloud Functions and BigQuery
bigquery gcp-cloud-functions golang server
Last synced: 15 Nov 2024
https://github.com/victorcezeh/end-to-end-elt-pipeline
An end-to-end ELT project using the Brazilian E-Commerce dataset from Kaggle. This project demonstrates the use of Python, PostgreSQL, Docker, Docker Compose, Airflow, dbt, and BigQuery to ingest, transform, and analyze data, providing insights into sales, delivery times, and order distributions.
airflow bigquery dbt-core docker docker-compose postgresql python
Last synced: 13 Oct 2024
https://github.com/squidmin/java11-spring-gradle-bigquery-reference
Java v11 ⋅ Spring v2 ⋅ Gradle ⋅ BigQuery
bigquery gradle gradle-java java java-gradle java11 java11-spring-boot spring spring-boot-2 spring-mvc spring-rest
Last synced: 13 Oct 2024
https://github.com/knands42/data-ingestion
Data Ingestion project to evaluate my Kotlin skill using concurrency
bigquery golang google-cloud-platform google-storage gradle-kotlin-dsl kotlin kotlin-flow
Last synced: 31 Oct 2024
https://github.com/adadalshabab/data-engineering-gcp-project
An end-to-end modern data engineering project, including deployment of ETL pipeline on Google Cloud Platform, using BigQuery for data analysis and leveraging Looker to generate an insight dashboard.
bigquery data data-science data-visualization databases dataengineering-a engineering etl-pipeline looker-studio powerbi
Last synced: 31 Oct 2024
https://github.com/night-fury-me/real-time-vehicle-data-processing
A repository that contains implementation of a Real-Time Vehicle Data Processing Pipeline that efficiently manages and analyzes vehicle data through a cohesive system.
bigquery cpp data-engineering data-streaming flink grpc kafka python real-time-data-processing
Last synced: 13 Oct 2024
https://github.com/panagiotischaviaropoulos/google-data-analytics-case-study
bigquery data-visualization sql
Last synced: 13 Oct 2024
https://github.com/juldrixx/bigquery-avro-schema-converter
Website to convert a schema from one format to another between BigQuery and Avro
avro avro-schema bigquery bigquery-schema converter schema
Last synced: 13 Oct 2024
https://github.com/mchmarny/stocker
Using the correlation between Twitter sentiment and stock market price signals to predict the next day's closing price
bigquery ml prediction regression-models
Last synced: 08 Nov 2024
https://github.com/simhayn/genomics-cannabis-bigquery
BigQuery's Cannabis_Genomics Dataset Exploration using SQL in a Python Environment
big-data bigquery bioinformatics exploratory-data-analysis genomics python sql
Last synced: 13 Oct 2024
https://github.com/azapeti/bigquery-python-bash-automation
With the free version, you can only get the last 60 days of data from your website through the Google Analytics API. This repository demonstrates how to run BigQuery queries in Python and automate them with bash and crontab to collect historical data.
analytics automation bash bigquery cronjob crontab ga4 python python3
Last synced: 13 Oct 2024
https://github.com/khanovico/energy-data-analysis
A cloud model that analyzes a real-world dataset with BigQuery and other big-data analysis tools. Includes a Docker image for running the app across platforms.
big-data-processing bigquery docker google-app-engine jupyter-notebook mlflow python scikit-learn seaborn xgboost
Last synced: 10 Oct 2024
https://github.com/andre-gitdev/stocks-functions
This project is for EDA related to stock trading.
alpaca alpaca-trading-api bigquery google-cloud portfolio-optimization robinhood-api robinhood-portfolio stock-analysis stock-data stock-price-prediction stocks-api stocks-trading
Last synced: 13 Oct 2024
https://github.com/vikasgupta1812/google_cloudml_scripts
https://goo.gl/dFjFQf
bigquery google-cloud-ml python-language tensorflow-tutorial
Last synced: 26 Sep 2024
https://github.com/karencofre/riesgorelativo-lookerstudio
Data analysis and predictive analysis project in Looker Studio and Google Colab
bigquery data-analysis data-science machine-learning matplotlib python sklearn sql
Last synced: 13 Oct 2024
https://github.com/ivanildobarauna/ivanildobarauna
Special Repository to Make README
ai airflow big-data bigquery data-engineering gcp python
Last synced: 13 Oct 2024
https://github.com/syou6162/mackerel-plugin-bigquery-query-result-importer
Mackerel plugin to post BigQuery query results
Last synced: 12 Oct 2024
https://github.com/davidkhala/gcp-collections
Notebooks for GCP services
bigquery bq databricks datastore firestore google-cloud-platform
Last synced: 12 Oct 2024
https://github.com/nlgtuankiet/bq-noti
BigQuery notification
bigquery bq notification notifier
Last synced: 12 Oct 2024
https://github.com/vedantwalia/google-data-analytics-capstone-case-study
This is a repository of my work on data analysis as a part of the Google Data Analytics Capstone
bigquery data data-viz datavisualization-project divvy-bikes google googledataanalytics sql tableau tableau-public
Last synced: 12 Oct 2024
https://github.com/rubnsbarbosa/nasa-asteroids-extractor
ETL asteroids data extractor using some Google Cloud services
bigquery bucket cloud-storage google-cloud nasa-api-neows
Last synced: 12 Oct 2024
https://github.com/george-nyamao/gcp_etl_project
An ETL pipeline that moves an uploaded flat file from GCS, masks PII, stores the data in BigQuery, and creates a report in Looker.
airflow bigquery cloudcomposer data-fusion gcs-bucket looker python3 wrangler
Last synced: 12 Oct 2024
https://github.com/hrialan/dataform-prune
An open-source tool for automating the cleanup of outdated objects in Dataform configurations, optimizing data workflows with seamless CI/CD integration.
automation bigquery data-analytics dataform
Last synced: 12 Oct 2024
https://github.com/branb97/jobstreet-data-eng-project
Building a data pipeline to deliver job listing data from Jobstreet for analysis.
airflow bigquery data-engineering etl-pipeline google-cloud looker-studio python sql
Last synced: 13 Oct 2024
https://github.com/robinnoiret/importcsv_zendeskbigquery
This project involves developing a Python script to import a CSV export from Zendesk into BigQuery. It is not intended for recurring use, but to enable an initial dump of historical data.
bigquery connector export-csvfile json zendesk
Last synced: 13 Oct 2024
https://github.com/thanhloc81/sql-project-bicycles-practise
✨ Utilizing SQL to extract data following a simulated task involving the Sales and Product modules
adventureworks bicycle bigquery google-cloud sql
Last synced: 12 Oct 2024
https://github.com/tomgorb/some-data-monitoring
fully functional DAG using Airflow 2 and minikube (locally) to help monitor GCP billing
airflow2 bigquery gcp minikube
Last synced: 12 Oct 2024
https://github.com/phstudy/zetasketch-bigquery-example
An example demonstrating how to use ZetaSketch with BigQuery
Last synced: 12 Oct 2024
https://github.com/yu-iskw/homebrew-bigquery-to-datastore
A homebrew tap for bigquery-to-datastore
bigquery google-datastore homebrew
Last synced: 30 Oct 2024
https://github.com/pratshrestha/cochin-traders---sql--sales-analysis
Cochin Traders imports and exports specialty foods globally. This project analyzes sales and operational data to enhance business efficiency, supply chain management, and sales performance. Key areas of focus include
bigquery customer-engagement employee-performance inventory-management sales-trends sql
Last synced: 12 Oct 2024
https://github.com/tirendazacademy/hands-on-data-science-with-gcp
Google BigQuery Tutorial
big-data big-data-analytics bigdata bigquery bigquery-ml bigqueryml cloud-computing data-analysis data-analytics data-engineering data-science dataanalysis dataengineering google-bigquery google-cloud-platform machienlearning machine-learning
Last synced: 08 Nov 2024
https://github.com/ka-zo/booking-data-analysis
Booking data analysis
airline-booking apache-beam bigquery google-cloud looker-studio python3
Last synced: 12 Oct 2024
https://github.com/janaom/gcp-de-project-data-pipeline-with-cloud-run-functions-airflow-biggueryml
Build a data pipeline on Google Cloud using an event-driven architecture, leveraging GCS, Cloud Run functions, and BigQuery. Explore both VM and Composer options for Airflow management, and utilize Logging & Monitoring for pipeline health. Discover how SQL-based BigQuery ML can be used for initial ML implementation in specific scenarios.
airflow bigquery bigqueryml cloud-functions cloud-run-functions composer data-engineering-project google-cloud-platform
Last synced: 13 Oct 2024
https://github.com/dobsontom/basket-abandonment
Data pipeline for detecting and responding to basket abandonment using BigQuery and Adobe Campaign.
adobe-campaign bigquery ga4 gcp sql
Last synced: 12 Oct 2024
https://github.com/karencofre/marketing-segmentacion-en-powerbi
Hypothesis testing project in Power BI and Python
bigquery google-colab powerbi python sql statsmodels
Last synced: 12 Oct 2024
https://github.com/lisabensoussan/bigdataminig_finalassignment
This repository contains solutions for the final assignment of the Big Data Mining course (52002/52019), focusing on querying large datasets with BigQuery, network analysis with Python, and distributed data processing with Apache Spark.
bigquery community-detection data-cleaning dataframe exploratory-data-analysis pagerank rdd sql text-analysis visualization
Last synced: 12 Oct 2024
https://github.com/erik-ingwersen-ey/iowa_sales_forecast
Iowa Liquor Sales Forecast Model
arima bigquery bigquery-ml google-cloud sales-forecast
Last synced: 12 Oct 2024
https://github.com/shaheerazam-dev/cyclistic-case-study-google-data-analytics-certificate
This case study simulates the real-world experience of a junior data analyst at Cyclistic, a fictional company. We will leverage the data analysis process framework (Ask, Prepare, Process, Analyze, Share, Act) to address critical business questions and provide data-driven insights to guide strategic decision-making.
bigquery data-science data-visualization spreadsheet sql tableau
Last synced: 12 Oct 2024
https://github.com/lawal-hash/olistelt
An end-to-end ELT data pipeline of the Brazilian olist e-commerce dataset using the modern data stack
airflow bigquery dbt dbt-core docker postgresql sql
Last synced: 12 Oct 2024
https://github.com/sahilmb/employee-churn-da
A data analysis project on employee churn rate using Google BigQuery, Looker, PyCaret, and Colab
bigquery looker-studio pycaret
Last synced: 12 Oct 2024
https://github.com/vidyadnina/other-sql-projects-and-queries
Other SQL projects and queries.
Last synced: 12 Oct 2024
https://github.com/ket0825/v1-gcp-preview
GCP repo for the Preview service / Manage GCP src for preview services
bigquery cloud-functions cloud-run cloudbuild gcp logging pubsub
Last synced: 12 Oct 2024
https://github.com/manuelandersen/football-pipeline
DE Zoomcamp 2024 Final Project 🧙
bigquery data-engineering data-lake data-warehouse dbt dbt-cloud etl-pipeline google-cloud looker-studio mageai python
Last synced: 12 Oct 2024
https://github.com/chdl17/nyc_green_taxis_peak_hour_analysis
This project analyzes GCP BigQuery data and uses Looker Studio to build a Peak Hour Analysis.
bigquery gcp google-cloud-platform looker-studio sql
Last synced: 12 Oct 2024
https://github.com/marcopellegrinoit/web-traffic-time-series-predictions
Forecast Web Traffic Demand Time Series with ARIMA+ BigQuery and Looker Studio. Additional modeling available with ARIMA, LSTM, and Facebook Prophet.
arima bigquery gcp lstm prophet-model time-series vertex-ai
Last synced: 12 Oct 2024
https://github.com/shubhammohanty680/uber_data_analysis
bigquery data-analysis gcp-compute gcp-project looker-studio mageai python
Last synced: 12 Oct 2024
https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito
This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.
bigquery data data-analysis etl-pipeline tableau
Last synced: 12 Oct 2024
https://github.com/kmohamedalie/bigquery-intro
Coursera BigQuery Introduction using Covid19 dataset
bigquery coursera covid-19 datavisualization looker-studio sql
Last synced: 12 Oct 2024
https://github.com/hanif-syazul/analyzing-kimia-farma-sales-performance-with-gcp
This repository contains the final project for the Rakamin Big Data Analytics Internship. It includes a complete dashboard of Kimia Farma's sales performance analysis from 2020 to 2023.
big-data-analytics bigquery internship-project kimia-farma looker-studio rakamin sql
Last synced: 13 Oct 2024
https://github.com/plishka/blockchain_analysis
Cryptocurrency On-Chain Analysis (Bitcoin Blockchain)
bigquery blockchain data-cleaning scraping-websites sql tableau
Last synced: 12 Oct 2024
https://github.com/denny-b-justin/purdue
The internship broadly aimed to understand whether topics/events are covered differently in different countries and how that coverage affects stock market returns. The provided dataset is a post-processed set of news articles, so it already reflects topic modelling and sentiment analysis.
big-data bigquery finance gdelt-events python
Last synced: 12 Oct 2024