Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with bigquery

A curated list of projects in awesome lists tagged with bigquery.

https://github.com/hasura/graphql-engine

Blazing fast, instant realtime GraphQL APIs on your DB with fine-grained access control; also triggers webhooks on database events.

access-control api automatic-api bigquery graphql graphql-api graphql-server haskell hasura mongodb postgres rest-api sql-server subgraph supergraph

Last synced: 16 Dec 2024

https://github.com/getredash/redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

analytics athena bi bigquery business-intelligence dashboard databricks hacktoberfest javascript mysql postgresql python redash redshift spark spark-sql visualization

Last synced: 16 Dec 2024

https://github.com/cube-js/cube

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics

analytics bigquery cube databricks headless-bi hive microservice mysql postgresql presto rust semantic-layer serverless snowflake sql

Last synced: 16 Dec 2024

https://github.com/beekeeper-studio/beekeeper-studio

Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, MacOS, and Windows.

bigquery cassandra cockroachdb database electron firebird linux-app mac-app mariadb mssql mysql postgresql sql sql-server sqlite windows-app

Last synced: 16 Dec 2024

https://github.com/airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake

Last synced: 16 Dec 2024

https://github.com/apache/doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 16 Dec 2024

https://github.com/apache/incubator-doris

Apache Doris is an easy-to-use, high performance and unified analytics database.

bigquery database dbt delta-lake elt etl hadoop hive hudi iceberg lakehouse olap query-engine real-time redshift snowflake spark sql

Last synced: 14 Dec 2024

https://github.com/oceanbase/oceanbase

OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.

analytics bigquery cloud-native cpp database distributed-database distributed-transactions hacktoberfest htap mysql mysql-compatibility mysql-database oceanbase olap oltp paxos scalable sql vector-database

Last synced: 16 Dec 2024

https://github.com/jitsucom/jitsu

Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.

bigquery clickhouse data-collection data-connectors data-integration golang postgres redshift snowflake

Last synced: 16 Dec 2024

https://github.com/hvf/franchise

🍟 a notebook sql client. what you get when you have a lot of sequels.

bigquery database mysql postgresql sql

Last synced: 20 Dec 2024

https://github.com/HVF/franchise

🍟 a notebook sql client. what you get when you have a lot of sequels.

bigquery database mysql postgresql sql

Last synced: 29 Oct 2024

https://github.com/briefercloud/briefer

Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.

analytics bi bigquery briefer business-intelligence businessintelligence dashboard data-analysis data-visualization jupyter notebook postgres postgresql reporting visualization

Last synced: 17 Dec 2024

https://github.com/blockchain-etl/ethereum-etl

Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ

aws bigquery blockchain-analytics csv erc20 erc20-tokens erc721 ethereum etl export gcp google-cloud sql transaction

Last synced: 16 Dec 2024

https://github.com/googlecloudplatform/professional-services

Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools

Last synced: 17 Dec 2024

https://github.com/GoogleCloudPlatform/professional-services

Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools

Last synced: 25 Oct 2024

https://github.com/bruin-data/ingestr

ingestr is a CLI tool to copy data between any databases with a single command seamlessly.

bigquery copy-database data-ingestion data-integration data-pipeline duckdb ingestion-pipeline mssql postgresql snowflake

Last synced: 17 Dec 2024

https://github.com/spotify/scio

A Scala API for Apache Beam and Google Cloud Dataflow.

batch beam bigquery data dataflow google-cloud ml scala scio streaming

Last synced: 17 Dec 2024

https://github.com/peerdb-io/peerdb

A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage.

bigquery cdc clickhouse cloud-native distributed-systems etl eventhubs kafka postgres postgresql realtime rust s3 snowflake sql stream-processing

Last synced: 19 Dec 2024

https://github.com/PeerDB-io/peerdb

A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage.

bigquery cdc clickhouse cloud-native distributed-systems etl eventhubs kafka postgres postgresql realtime rust s3 snowflake sql stream-processing

Last synced: 31 Oct 2024

https://github.com/Canner/WrenAI

🚀 Open-source SQL AI Agent for Text-to-SQL. Supporting PostgreSQL, DuckDB, MySQL, MS SQL, ClickHouse, Trino, JSON, CSV, Parquet data sources, and more! 🚀

agent ai bigquery duckdb fastapi gpt hacktoberfest llm nextjs nlp openai postgresql python rag sql sqlai text-to-sql text2sql typescript

Last synced: 25 Nov 2024

https://github.com/canner/wrenai

🚀 Open-source SQL AI Agent for Text-to-SQL. Supporting PostgreSQL, DuckDB, MySQL, MS SQL, ClickHouse, Trino, JSON, CSV, Parquet data sources, and more! 🚀

agent ai bigquery duckdb fastapi gpt hacktoberfest llm nextjs nlp openai postgresql python rag sql sqlai text-to-sql text2sql typescript

Last synced: 19 Dec 2024

https://github.com/elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake

Last synced: 17 Dec 2024

https://github.com/swirlai/swirl-search

SWIRL AI Connect: AI infrastructure software that powers your Search & Retrieval Augmented Generation (RAG) applications. Simplify and enhance your AI pipelines with seamless integration of large language models (LLMs) and data sources.

ai-search bigquery django federated-query federated-search gpt large-language-models metasearch python rag relevancy retrieval-augmented-generation search search-engine unified-search

Last synced: 19 Dec 2024

https://github.com/evgskv/logica

Logica is a logic programming language that compiles to SQL. It runs on DuckDB, Google BigQuery, PostgreSQL and SQLite.

bigquery datalog language logic-programming logica postgresql presto prolog prolog-implementation sql sqlite trino

Last synced: 17 Dec 2024

https://github.com/EvgSkv/logica

Logica is a logic programming language that compiles to SQL. It runs on Google BigQuery, PostgreSQL and SQLite.

bigquery datalog language logic-programming logica postgresql presto prolog prolog-implementation sql sqlite trino

Last synced: 27 Oct 2024

https://github.com/googlecloudplatform/bigquery-utils

Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.

bigquery data-warehouse google-cloud-platform sql utilities

Last synced: 20 Dec 2024

https://github.com/GoogleCloudPlatform/bigquery-utils

Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.

bigquery data-warehouse google-cloud-platform sql utilities

Last synced: 13 Nov 2024

https://github.com/goccy/bigquery-emulator

BigQuery emulator server implemented in Go

bigquery emulator gcp go golang google-cloud google-cloud-platform

Last synced: 19 Dec 2024

https://github.com/raystack/optimus

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows

Last synced: 20 Dec 2024

https://github.com/httparchive/almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community

bigquery http-archive web-almanac

Last synced: 20 Dec 2024

https://github.com/HTTPArchive/almanac.httparchive.org

HTTP Archive's annual "State of the Web" report made by the web community

bigquery http-archive web-almanac

Last synced: 16 Nov 2024

https://github.com/artie-labs/transfer

Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time.

apache-kafka bigquery cdc change-data-capture data-integration data-pipelines database debezium elt golang kafka redshift snowflake

Last synced: 20 Dec 2024

https://github.com/dbt-checkpoint/dbt-checkpoint

🎣 List of `pre-commit` hooks to ensure the quality of your `dbt` projects.

bigquery business-intelligence dbt pre-commit pre-commit-hook quality-assurance snowflake sql

Last synced: 20 Dec 2024

https://github.com/r-dbi/bigrquery

An interface to Google's BigQuery from R.

bigquery database r

Last synced: 18 Dec 2024

https://github.com/googleapis/nodejs-bigquery

Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.

bigquery database nodejs sql

Last synced: 17 Dec 2024

https://github.com/tylertreat/bigquery-python

Simple Python client for interacting with Google BigQuery.

bigquery google-bigquery python

Last synced: 21 Dec 2024

https://github.com/googleapis/python-bigquery-pandas

Google BigQuery connector for pandas

bigquery data pandas

Last synced: 17 Dec 2024

https://github.com/ofek/pypinfo

Easily view PyPI download statistics via Google's BigQuery.

bigquery pypi python statistics

Last synced: 20 Dec 2024

https://github.com/harisekhon/sql-scripts

100+ SQL Scripts - PostgreSQL, MySQL, Oracle, Google BigQuery, MariaDB, AWS Athena. DBA, Analytics, DevOps, performance engineering. Google BigQuery ML machine learning classification.

athena aws aws-athena bigquery bigquery-ml dba devops gcp google-bigquery google-cloud-sql google-cloudsql-mysql machine-learning mariadb mysql oracle performance postgres postgresql rds sql

Last synced: 21 Dec 2024

https://github.com/basedosdados/sdk

⚙️ Maintenance code for the datalake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/

bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia

Last synced: 15 Dec 2024

https://github.com/basedosdados/mais

⚙️ Maintenance code for the datalake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/

bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia

Last synced: 13 Oct 2024

https://github.com/HariSekhon/SQL-scripts

100+ SQL Scripts - PostgreSQL, MySQL, Oracle, Google BigQuery, MariaDB, AWS Athena. DBA, Analytics, DevOps, performance engineering. Google BigQuery ML machine learning classification.

athena aws aws-athena bigquery bigquery-ml dba devops gcp google-bigquery google-cloud-sql google-cloudsql-mysql machine-learning mariadb mysql oracle performance postgres postgresql rds sql

Last synced: 07 Nov 2024

https://github.com/googleclouddataproc/spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

bigquery bigquery-storage-api google-bigquery google-cloud google-cloud-dataproc spark

Last synced: 19 Dec 2024

https://github.com/tellery/tellery

Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.

analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql

Last synced: 15 Dec 2024

https://github.com/GoogleCloudDataproc/spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

bigquery bigquery-storage-api google-bigquery google-cloud google-cloud-dataproc spark

Last synced: 30 Sep 2024

https://github.com/astronomer/astro-sdk

Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows

Last synced: 20 Dec 2024

https://github.com/spotify/ratatool

A tool for data sampling, data generation, and data diffing

avro bigquery parquet protobuf scala scalacheck

Last synced: 21 Dec 2024

https://github.com/machine-learning-apps/Issue-Label-Bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 25 Oct 2024

https://github.com/machine-learning-apps/issue-label-bot

Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"

bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow

Last synced: 29 Sep 2024

https://github.com/mprove-io/mprove

Open Source Self-service Business Intelligence with Version Control 🎉

analytics bigquery business-intelligence clickhouse dashboard data-visualization looker metrics postgresql snowflake

Last synced: 16 Dec 2024

https://github.com/raystack/firehose

Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.

apache-kafka bigquery dataops firehose influxdb kafka postgresql prometheus sink streaming

Last synced: 17 Dec 2024

https://github.com/scale8/scale8-tag-manager-and-analytics

Website analytics, JavaScript error tracking + analytics, tag manager, data ingest endpoint creation (tracking pixels). GDPR + CCPA compliant.

advertising analytics app bigquery charts clickhouse cloud cmp gdpr google-analytics google-tag-manager marketing metrics privacy scale8 statistics tag-manager typescript website

Last synced: 29 Sep 2024

https://github.com/googleclouddataproc/hadoop-connectors

Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.

bigquery google-cloud-dataproc hadoop hadoop-filesystem hadoop-hcfs

Last synced: 18 Dec 2024

https://github.com/GoogleCloudDataproc/hadoop-connectors

Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.

bigquery google-cloud-dataproc hadoop hadoop-filesystem hadoop-hcfs

Last synced: 25 Oct 2024

https://wix.github.io/quix

Quix Notebook Manager

athena bigquery notebook-manager presto trino

Last synced: 01 Nov 2024

https://github.com/wix-incubator/quix

Quix Notebook Manager

athena bigquery notebook-manager presto trino

Last synced: 15 Dec 2024

https://github.com/yoshidan/google-cloud-rust

Google Cloud Client Libraries for Rust.

bigquery gcp gcs google-cloud-platform pubsub rust spanner

Last synced: 19 Dec 2024

https://github.com/datacoves/dbt-coves

CLI tool for dbt users to simplify creation of staging models (yml and sql) files

analytics bigquery datacoves dbt elt etl jinja python redshift snowflake sql

Last synced: 20 Dec 2024

https://github.com/bxparks/bigquery-schema-generator

Generates the BigQuery schema from newline-delimited JSON or CSV data records.

bigquery bigquery-schema google-bigquery python3

Last synced: 15 Dec 2024

https://github.com/googlecloudplatform/data-analytics-golden-demo

An end-to-end demo of Google Cloud's data and analytics stack.

bigdata bigquery composer dataflow dataproc gcp

Last synced: 18 Dec 2024

https://github.com/cuebook/CueObserve

Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases

anomaly anomaly-detection bigquery datawarehouse prophet-facebook redshift root-cause-analysis snowflake sql timeseries-analysis timeseries-forecasting

Last synced: 14 Nov 2024

https://github.com/cuebook/cueobserve

Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases

anomaly anomaly-detection bigquery datawarehouse prophet-facebook redshift root-cause-analysis snowflake sql timeseries-analysis timeseries-forecasting

Last synced: 19 Dec 2024

https://github.com/thinkingmachines/geomancer

Automated feature engineering for geospatial data

bigquery feature-engineering geospatial machine-learning openstreetmap

Last synced: 29 Sep 2024

https://github.com/googlecloudplatform/fraudfinder

Fraudfinder: A comprehensive lab series on how to build a real-time fraud detection system on Google Cloud

bigquery bigquery-ml dataflow google-cloud-platform machine-learning mlops mlpipelines vertex-ai

Last synced: 17 Dec 2024

https://github.com/digitalghost-dev/premier-league

A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.

bigquery cloud-run data-engineer data-pipeline data-visualization docker firestore go google-cloud prefect python streamlit

Last synced: 26 Sep 2024

https://github.com/lots-of-things/gpt2-bert-reddit-bot

a bot that generates realistic replies using a combination of pretrained GPT-2 and BERT models

bert bigquery colab-notebook gpt-2 praw

Last synced: 20 Dec 2024

https://github.com/cartodb/analytics-toolbox-core

A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities

analytics-toolbox bigquery carto databricks geospatial gis postgres redshift snowflake sql

Last synced: 17 Dec 2024

https://github.com/xnuinside/simple-ddl-parser

Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, BigQuery, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc. & table properties, types, domains, etc.

bigquery columns ddl ddl-parser ddls hive hql mssql mysql oracle-database oracle-db parser postgresql redshift schemas snowflake sql sql-parser tsql types

Last synced: 20 Dec 2024

https://github.com/omnata-labs/dbt-ml-preprocessing

A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.

bigquery dbt redshift scikit-learn snowflake

Last synced: 19 Dec 2024

https://github.com/mara/mara-example-project-2

An example mini data warehouse for python project stats, template for new projects

bigquery data-integration etl pypi sql

Last synced: 20 Dec 2024

https://github.com/googlecloudplatform/cortex-data-foundation

Data Foundation - Google Cloud Cortex Framework

airflow bigquery cloud google googlecloud salesforce sap

Last synced: 20 Dec 2024

https://github.com/google/starthinker

Reference framework for building data workflows provided by Google. Accelerates authentication, logging, scheduling, and deployment of solutions using GCP. To borrow a tagline: "The framework for professionals with deadlines."

airflow app-engine automation bigquery cloud-functions cm360 colab-notebook data-science django dv360 google-ads google-analytics logger python scheduler ui workflows

Last synced: 29 Sep 2024