Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with bigquery
A curated list of projects in awesome lists tagged with bigquery.
https://github.com/hasura/graphql-engine
Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.
access-control api automatic-api bigquery graphql graphql-api graphql-server haskell hasura mongodb postgres rest-api sql-server subgraph supergraph
Last synced: 16 Dec 2024
https://github.com/getredash/redash
Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
analytics athena bi bigquery business-intelligence dashboard databricks hacktoberfest javascript mysql postgresql python redash redshift spark spark-sql visualization
Last synced: 16 Dec 2024
https://github.com/cube-js/cube
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
analytics bigquery cube databricks headless-bi hive microservice mysql postgresql presto rust semantic-layer serverless snowflake sql
Last synced: 16 Dec 2024
https://github.com/beekeeper-studio/beekeeper-studio
Modern and easy-to-use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, macOS, and Windows.
bigquery cassandra cockroachdb database electron firebird linux-app mac-app mariadb mssql mysql postgresql sql sql-server sqlite windows-app
Last synced: 16 Dec 2024
https://github.com/airbytehq/airbyte
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake
Last synced: 16 Dec 2024
https://github.com/oceanbase/oceanbase
OceanBase is an enterprise distributed relational database with high availability, high performance, horizontal scalability, and compatibility with SQL standards.
analytics bigquery cloud-native cpp database distributed-database distributed-transactions hacktoberfest htap mysql mysql-compatibility mysql-database oceanbase olap oltp paxos scalable sql vector-database
Last synced: 16 Dec 2024
https://github.com/growthbook/growthbook
Open Source Feature Flagging and A/B Testing Platform
ab-testing abtest abtesting analytics bigquery clickhouse continuous-delivery data-analysis data-engineering data-science experimentation feature-flagging feature-flags mixpanel redshift remote-config snowflake split-testing statistics
Last synced: 16 Dec 2024
https://github.com/cloudquery/cloudquery
The open source high performance ELT framework powered by Apache Arrow
airbyte attack-surface-management aws azure bigquery cspm data data-analysis data-collection data-engineering data-integration elt etl etl-framework gcp github-api go google kubernetes sql
Last synced: 16 Dec 2024
https://github.com/ibis-project/ibis
the portable Python dataframe library
bigquery clickhouse database datafusion duckdb impala mssql mysql pandas polars postgresql pyarrow pyspark python snowflake sql sqlite trino
Last synced: 16 Dec 2024
https://github.com/rudderlabs/rudder-server
Privacy- and security-focused Segment alternative, in Golang and React
bigquery cdp customer-data customer-data-lake customer-data-pipeline customer-data-platform data-engineering data-integration data-pipeline data-synchronization data-warehouse elt etl event-streaming privacy redshift segment-alternative snowflake warehouse-management warehouse-native
Last synced: 16 Dec 2024
https://github.com/jitsucom/jitsu
Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
bigquery clickhouse data-collection data-connectors data-integration golang postgres redshift snowflake
Last synced: 16 Dec 2024
https://github.com/hvf/franchise
🍟 a notebook sql client. what you get when you have a lot of sequels.
bigquery database mysql postgresql sql
Last synced: 20 Dec 2024
https://github.com/HVF/franchise
🍟 a notebook sql client. what you get when you have a lot of sequels.
bigquery database mysql postgresql sql
Last synced: 29 Oct 2024
https://github.com/briefercloud/briefer
Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.
analytics bi bigquery briefer business-intelligence businessintelligence dashboard data-analysis data-visualization jupyter notebook postgres postgresql reporting visualization
Last synced: 17 Dec 2024
https://github.com/k1low/tbls
tbls is a CI-friendly tool for documenting a database, written in Go.
bigquery continuous-integration database-document database-schema documentation-tool dynamodb er-diagram excel hacktoberfest mariadb markdown mermaid mysql plantuml postgresql redshift snowflake spanner sqlite sqlserver
Last synced: 17 Dec 2024
https://github.com/k1LoW/tbls
tbls is a CI-friendly tool for documenting a database, written in Go.
bigquery continuous-integration database-document database-schema documentation-tool dynamodb er-diagram excel hacktoberfest mariadb markdown mermaid mysql plantuml postgresql redshift snowflake spanner sqlite sqlserver
Last synced: 29 Oct 2024
https://github.com/blockchain-etl/ethereum-etl
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery https://goo.gl/oY5BCQ
aws bigquery blockchain-analytics csv erc20 erc20-tokens erc721 ethereum etl export gcp google-cloud sql transaction
Last synced: 16 Dec 2024
https://github.com/googlecloudplatform/professional-services
Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.
bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools
Last synced: 17 Dec 2024
https://github.com/GoogleCloudPlatform/professional-services
Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.
bigquery examples gke google-cloud-compute google-cloud-dataflow google-cloud-ml google-cloud-platform solutions tools
Last synced: 25 Oct 2024
https://github.com/bruin-data/ingestr
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
bigquery copy-database data-ingestion data-integration data-pipeline duckdb ingestion-pipeline mssql postgresql snowflake
Last synced: 17 Dec 2024
https://github.com/peerdb-io/peerdb
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
bigquery cdc clickhouse cloud-native distributed-systems etl eventhubs kafka postgres postgresql realtime rust s3 snowflake sql stream-processing
Last synced: 19 Dec 2024
https://github.com/PeerDB-io/peerdb
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
bigquery cdc clickhouse cloud-native distributed-systems etl eventhubs kafka postgres postgresql realtime rust s3 snowflake sql stream-processing
Last synced: 31 Oct 2024
https://github.com/Canner/WrenAI
🚀 Open-source SQL AI Agent for Text-to-SQL. Supporting PostgreSQL, DuckDB, MySQL, MS SQL, ClickHouse, Trino, JSON, CSV, Parquet data sources, and more! 🚀
agent ai bigquery duckdb fastapi gpt hacktoberfest llm nextjs nlp openai postgresql python rag sql sqlai text-to-sql text2sql typescript
Last synced: 25 Nov 2024
https://github.com/canner/wrenai
🚀 Open-source SQL AI Agent for Text-to-SQL. Supporting PostgreSQL, DuckDB, MySQL, MS SQL, ClickHouse, Trino, JSON, CSV, Parquet data sources, and more! 🚀
agent ai bigquery duckdb fastapi gpt hacktoberfest llm nextjs nlp openai postgresql python rag sql sqlai text-to-sql text2sql typescript
Last synced: 19 Dec 2024
https://github.com/elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake
Last synced: 17 Dec 2024
https://github.com/swirlai/swirl-search
SWIRL AI Connect: AI infrastructure software that powers your Search & Retrieval Augmented Generation (RAG) applications. Simplify and enhance your AI pipelines with seamless integration of large language models (LLMs) and data sources.
ai-search bigquery django federated-query federated-search gpt large-language-models metasearch python rag relevancy retrieval-augmented-generation search search-engine unified-search
Last synced: 19 Dec 2024
https://github.com/evgskv/logica
Logica is a logic programming language that compiles to SQL. It runs on DuckDB, Google BigQuery, PostgreSQL and SQLite.
bigquery datalog language logic-programming logica postgresql presto prolog prolog-implementation sql sqlite trino
Last synced: 17 Dec 2024
https://github.com/EvgSkv/logica
Logica is a logic programming language that compiles to SQL. It runs on Google BigQuery, PostgreSQL and SQLite.
bigquery datalog language logic-programming logica postgresql presto prolog prolog-implementation sql sqlite trino
Last synced: 27 Oct 2024
https://github.com/Multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Data Activation
bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript
Last synced: 02 Nov 2024
https://github.com/multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Data Activation
bigquery cdp customer-data-platform data-activation data-engineering data-pipeline data-warehouse databricks dbt etl hacktoberfest open-source postresql react redshift reverse-etl ruby self-hosted snowflake typescript
Last synced: 17 Dec 2024
https://github.com/googlecloudplatform/dataflowtemplates
Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
apache-beam bigquery bigtable dataflow-templates google-cloud-dataflow google-cloud-spanner google-cloud-storage
Last synced: 18 Dec 2024
https://github.com/GoogleCloudPlatform/DataflowTemplates
Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
apache-beam bigquery bigtable dataflow-templates google-cloud-dataflow google-cloud-spanner google-cloud-storage
Last synced: 05 Nov 2024
https://github.com/googlecloudplatform/bigquery-utils
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
bigquery data-warehouse google-cloud-platform sql utilities
Last synced: 20 Dec 2024
https://github.com/GoogleCloudPlatform/bigquery-utils
Useful scripts, UDFs, views, and other utilities for migration and data warehouse operations in BigQuery.
bigquery data-warehouse google-cloud-platform sql utilities
Last synced: 13 Nov 2024
https://github.com/scratchdata/scratchdata
Scratch is a swiss army knife for big data.
bigquery clickhouse data-warehouse duckdb hacktoberfest motherduck olap redshift snowflake
Last synced: 20 Dec 2024
https://github.com/madnight/githut
GitHub Language Statistics
bigquery dataset functional-reactive-programming github-language-statistics github-pages-website jamstack languages programming-languages react react-hooks serverless sql-query statistics
Last synced: 15 Dec 2024
https://github.com/goccy/bigquery-emulator
BigQuery emulator server implemented in Go
bigquery emulator gcp go golang google-cloud google-cloud-platform
Last synced: 19 Dec 2024
https://github.com/raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
airflow analytics analytics-engineering automation bigquery business-intelligence data-modelling data-pipelines data-transformation data-warehouse dataops elt etl golang workflows
Last synced: 20 Dec 2024
https://github.com/canner/vulcan-sql
Data API Framework for AI Agents and Data Apps
ai ai-agent analytics api-builder bigquery clickhouse data-lake data-warehouse database duckdb ksqldb postgresql reporting restful-api snowflake spreadsheet sql typescript vulcan-sql vulcansql
Last synced: 19 Dec 2024
https://github.com/Canner/vulcan-sql
Data API Framework for AI Agents and Data Apps
ai ai-agent analytics api-builder bigquery clickhouse data-lake data-warehouse database duckdb ksqldb postgresql reporting restful-api snowflake spreadsheet sql typescript vulcan-sql vulcansql
Last synced: 07 Nov 2024
https://github.com/unytics/bigfunctions
Supercharge BigQuery with BigFunctions
bigquery data data-analytics data-engineering data-visualization data-warehouse
Last synced: 19 Dec 2024
https://github.com/httparchive/almanac.httparchive.org
HTTP Archive's annual "State of the Web" report made by the web community
bigquery http-archive web-almanac
Last synced: 20 Dec 2024
https://github.com/HTTPArchive/almanac.httparchive.org
HTTP Archive's annual "State of the Web" report made by the web community
bigquery http-archive web-almanac
Last synced: 16 Nov 2024
https://github.com/artie-labs/transfer
Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time.
apache-kafka bigquery cdc change-data-capture data-integration data-pipelines database debezium elt golang kafka redshift snowflake
Last synced: 20 Dec 2024
https://github.com/dbt-checkpoint/dbt-checkpoint
🎣 List of `pre-commit` hooks to ensure the quality of your `dbt` projects.
bigquery business-intelligence dbt pre-commit pre-commit-hook quality-assurance snowflake sql
Last synced: 20 Dec 2024
https://github.com/ploomber/jupysql
Better SQL in Jupyter. 📊
bigquery clickhouse data-engineering data-science duckdb hive jupyter mysql polars postgres presto python redshift snowflake spark-sql sql sqlite trino tsql
Last synced: 29 Sep 2024
https://github.com/synmetrix/synmetrix
Synmetrix – production-ready open source semantic layer on Cube
big-data bigquery business-intelligence clickhouse cube cubejs data-engineering databricks dremio druid firebolt llm prestodb redshift semantic-layer snowflake vertica
Last synced: 21 Dec 2024
https://github.com/r-dbi/bigrquery
An interface to Google's BigQuery from R.
Last synced: 18 Dec 2024
https://github.com/mlcraft-io/mlcraft
Synmetrix – production-ready open source semantic layer on Cube
big-data bigquery business-intelligence clickhouse cube cubejs data-engineering databricks dremio druid firebolt llm prestodb redshift semantic-layer snowflake vertica
Last synced: 09 Nov 2024
https://github.com/googleapis/nodejs-bigquery
Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
Last synced: 17 Dec 2024
https://github.com/tylertreat/bigquery-python
Simple Python client for interacting with Google BigQuery.
bigquery google-bigquery python
Last synced: 21 Dec 2024
https://github.com/googleapis/python-bigquery-pandas
Google BigQuery connector for pandas
Last synced: 17 Dec 2024
https://github.com/ofek/pypinfo
Easily view PyPI download statistics via Google's BigQuery.
bigquery pypi python statistics
Last synced: 20 Dec 2024
https://github.com/harisekhon/sql-scripts
100+ SQL Scripts - PostgreSQL, MySQL, Oracle, Google BigQuery, MariaDB, AWS Athena. DBA, Analytics, DevOps, performance engineering. Google BigQuery ML machine learning classification.
athena aws aws-athena bigquery bigquery-ml dba devops gcp google-bigquery google-cloud-sql google-cloudsql-mysql machine-learning mariadb mysql oracle performance postgres postgresql rds sql
Last synced: 21 Dec 2024
https://github.com/basedosdados/sdk
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/
bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia
Last synced: 15 Dec 2024
https://github.com/basedosdados/mais
⚙️ Maintenance code for the data lake (metadata and access packages) | 📖 Docs: https://basedosdados.github.io/mais/
bigquery dados-abertos data-science govtech hacktoberfest hacktoberfest2022 open-data python r sql transparencia
Last synced: 13 Oct 2024
https://github.com/HariSekhon/SQL-scripts
100+ SQL Scripts - PostgreSQL, MySQL, Oracle, Google BigQuery, MariaDB, AWS Athena. DBA, Analytics, DevOps, performance engineering. Google BigQuery ML machine learning classification.
athena aws aws-athena bigquery bigquery-ml dba devops gcp google-bigquery google-cloud-sql google-cloudsql-mysql machine-learning mariadb mysql oracle performance postgres postgresql rds sql
Last synced: 07 Nov 2024
https://github.com/googleclouddataproc/spark-bigquery-connector
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
bigquery bigquery-storage-api google-bigquery google-cloud google-cloud-dataproc spark
Last synced: 19 Dec 2024
https://github.com/tellery/tellery
Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.
analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql
Last synced: 15 Dec 2024
https://github.com/GoogleCloudDataproc/spark-bigquery-connector
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
bigquery bigquery-storage-api google-bigquery google-cloud google-cloud-dataproc spark
Last synced: 30 Sep 2024
https://github.com/astronomer/astro-sdk
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
airflow apache-airflow bigquery dags data-analysis data-science elt etl gcs pandas postgres python s3 snowflake sql sqlite workflows
Last synced: 20 Dec 2024
https://github.com/spotify/ratatool
A tool for data sampling, data generation, and data diffing
avro bigquery parquet protobuf scala scalacheck
Last synced: 21 Dec 2024
https://github.com/machine-learning-apps/Issue-Label-Bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow
Last synced: 25 Oct 2024
https://github.com/machine-learning-apps/issue-label-bot
Code For The Issue Label Bot, an App that automatically labels issues using machine learning, available on the GitHub Marketplace. This is also code for the blog article: "How to automate tasks on GitHub with machine learning for fun and profit"
bigquery bootstrap data-science deep-learning end-to-end-application flask gcp-cloud gharchive github-api-v3 github-app keras kubernetes machine-learning machine-learning-tutorials nlp production-machine-learning tensorflow
Last synced: 29 Sep 2024
https://github.com/googlecloudplatform/security-analytics
Community Security Analytics provides a set of community-driven audit & threat queries for Google Cloud
audit-logs bigquery chronicle cloud-security-command-center gcp google-cloud log-analytics logging network-analysis network-logs security security-operations threat-detection
Last synced: 15 Dec 2024
https://github.com/data-drift/data-drift
Metrics Observability & Troubleshooting
analytics bigquery context data-diffing data-governance data-lineage data-monitoring data-observability data-quality data-reliability data-version-control dbt dbt-metrics dbt-packages drill-down metrics reconciliation redshift semantic-layer snowflake
Last synced: 18 Dec 2024
https://github.com/mprove-io/mprove
Open Source Self-service Business Intelligence with Version Control 🎉
analytics bigquery business-intelligence clickhouse dashboard data-visualization looker metrics postgresql snowflake
Last synced: 16 Dec 2024
https://github.com/raystack/firehose
Firehose is an extensible, no-code, and cloud-native service to load real-time streaming data from Kafka to data stores, data lakes, and analytical storage systems.
apache-kafka bigquery dataops firehose influxdb kafka postgresql prometheus sink streaming
Last synced: 17 Dec 2024
https://github.com/scale8/scale8-tag-manager-and-analytics
Website analytics, JavaScript error tracking + analytics, tag manager, data ingest endpoint creation (tracking pixels). GDPR + CCPA compliant.
advertising analytics app bigquery charts clickhouse cloud cmp gdpr google-analytics google-tag-manager marketing metrics privacy scale8 statistics tag-manager typescript website
Last synced: 29 Sep 2024
https://github.com/GoogleCloudPlatform/security-analytics
Community Security Analytics provides a set of community-driven audit & threat queries for Google Cloud
audit-logs bigquery chronicle cloud-security-command-center gcp google-cloud log-analytics logging network-analysis network-logs security security-operations threat-detection
Last synced: 02 Nov 2024
https://github.com/googleclouddataproc/hadoop-connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
bigquery google-cloud-dataproc hadoop hadoop-filesystem hadoop-hcfs
Last synced: 18 Dec 2024
https://github.com/GoogleCloudDataproc/hadoop-connectors
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
bigquery google-cloud-dataproc hadoop hadoop-filesystem hadoop-hcfs
Last synced: 25 Oct 2024
https://wix.github.io/quix
Quix Notebook Manager
athena bigquery notebook-manager presto trino
Last synced: 01 Nov 2024
https://github.com/wix-incubator/quix
Quix Notebook Manager
athena bigquery notebook-manager presto trino
Last synced: 15 Dec 2024
https://github.com/yoshidan/google-cloud-rust
Google Cloud Client Libraries for Rust.
bigquery gcp gcs google-cloud-platform pubsub rust spanner
Last synced: 19 Dec 2024
https://github.com/doitintl/bigquery-grafana
Google BigQuery Datasource Plugin for Grafana. (NO LONGER MAINTAINED)
bigquery bigquery-datasource google-bigquery google-cloud-platform grafana grafana-bigquery grafana-bigquery-datasource grafana-datasource metrics monitoring typescript
Last synced: 29 Sep 2024
https://github.com/lynnlangit/gcp-essentials
Sample code and notes for my GCP courses on LinkedIn Learning
bigquery gce gcloud gcp gcs gemini gke google-cloud google-cloud-functions google-cloud-platform google-cloud-run google-cloud-storage tensorflow vertex-ai
Last synced: 15 Dec 2024
https://github.com/bxparks/bigquery-schema-generator
Generates the BigQuery schema from newline-delimited JSON or CSV data records.
bigquery bigquery-schema google-bigquery python3
Last synced: 15 Dec 2024
https://github.com/cuebook/CueObserve
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
anomaly anomaly-detection bigquery datawarehouse prophet-facebook redshift root-cause-analysis snowflake sql timeseries-analysis timeseries-forecasting
Last synced: 14 Nov 2024
https://github.com/cuebook/cueobserve
Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases
anomaly anomaly-detection bigquery datawarehouse prophet-facebook redshift root-cause-analysis snowflake sql timeseries-analysis timeseries-forecasting
Last synced: 19 Dec 2024
https://github.com/thinkingmachines/geomancer
Automated feature engineering for geospatial data
bigquery feature-engineering geospatial machine-learning openstreetmap
Last synced: 29 Sep 2024
https://github.com/googleapis/python-bigquery-dataframes
BigQuery DataFrames
bigquery data-science machine-learning python
Last synced: 20 Dec 2024
https://github.com/googlecloudplatform/fraudfinder
Fraudfinder: A comprehensive lab series on how to build a real-time fraud detection system on Google Cloud
bigquery bigquery-ml dataflow google-cloud-platform machine-learning mlops mlpipelines vertex-ai
Last synced: 17 Dec 2024
https://github.com/digitalghost-dev/premier-league
A Data Engineering project. Repository for backend infrastructure and Streamlit app files for a Premier League Dashboard.
bigquery cloud-run data-engineer data-pipeline data-visualization docker firestore go google-cloud prefect python streamlit
Last synced: 26 Sep 2024
https://github.com/lots-of-things/gpt2-bert-reddit-bot
a bot that generates realistic replies using a combination of pretrained GPT-2 and BERT models
bert bigquery colab-notebook gpt-2 praw
Last synced: 20 Dec 2024
https://github.com/cartodb/analytics-toolbox-core
A set of UDFs and Procedures to extend BigQuery, Snowflake, Redshift, Postgres and Databricks with Spatial Analytics capabilities
analytics-toolbox bigquery carto databricks geospatial gis postgres redshift snowflake sql
Last synced: 17 Dec 2024
https://github.com/tuva-health/tuva
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
analytics-engineering bigquery data-analytics data-governance data-lineage data-pipelines data-warehouse dbt dbt-packages healthcare healthcare-analysis healthcare-data open-source redshift snowflake sql terminology
Last synced: 17 Dec 2024
https://github.com/xnuinside/simple-ddl-parser
Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, BigQuery, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc. & table properties, types, domains, etc.
bigquery columns ddl ddl-parser ddls hive hql mssql mysql oracle-database oracle-db parser postgresql redshift schemas snowflake sql sql-parser tsql types
Last synced: 20 Dec 2024
https://github.com/omnata-labs/dbt-ml-preprocessing
A SQL port of python's scikit-learn preprocessing module, provided as cross-database dbt macros.
bigquery dbt redshift scikit-learn snowflake
Last synced: 19 Dec 2024
https://github.com/mara/mara-example-project-2
An example mini data warehouse for python project stats, template for new projects
bigquery data-integration etl pypi sql
Last synced: 20 Dec 2024
https://github.com/spotify/magnolify
A collection of Magnolia add-on modules
avro bigquery bigtable cats datastore guava magnolia neo4j parquet protobuf scala scalacheck tensorflow
Last synced: 15 Dec 2024
https://github.com/googlecloudplatform/cortex-data-foundation
Data Foundation - Google Cloud Cortex Framework
airflow bigquery cloud google googlecloud salesforce sap
Last synced: 20 Dec 2024
https://github.com/google/starthinker
Reference framework for building data workflows provided by Google. Accelerates authentication, logging, scheduling, and deployment of solutions using GCP. To borrow a tagline: "The framework for professionals with deadlines."
airflow app-engine automation bigquery cloud-functions cm360 colab-notebook data-science django dv360 google-ads google-analytics logger python scheduler ui workflows
Last synced: 29 Sep 2024
https://github.com/googlecloudplatform/public-datasets-pipelines
Cloud-native, data onboarding architecture for Google Cloud Datasets
airflow bigquery cloud-composer cloud-native cloud-storage data-architecture data-engineering data-pipelines datasets google-cloud open-data
Last synced: 21 Dec 2024