Projects in Awesome Lists tagged with hive
A curated list of projects in awesome lists tagged with hive.
https://github.com/CodePhiliaX/Chat2DB
🔥🔥🔥 AI-driven database tool and SQL client; the hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.
ai bi chatgpt clickhouse clickhouse-client database datagrip db2 dbeaver gpt hive mysql navicat oracle postgresql redis redis-client sqlserver text2sql
Last synced: 17 Aug 2025
https://github.com/codephiliax/chat2db
🔥🔥🔥 AI-driven database tool and SQL client; the hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.
ai bi chatgpt clickhouse clickhouse-client database datagrip db2 dbeaver gpt hive mysql navicat oracle postgresql redis redis-client sqlserver text2sql
Last synced: 14 May 2025
https://github.com/cube-js/cube
📊 Cube’s universal semantic layer platform is the next evolution of OLAP technology for AI, BI, spreadsheets, and embedded analytics
analytics bigquery cube databricks headless-bi hive microservice mysql postgresql presto rust semantic-layer serverless snowflake sql
Last synced: 09 Sep 2025
https://github.com/cube-js/cube.js
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
analytics bigquery cube databricks headless-bi hive microservice mysql postgresql presto rust semantic-layer serverless snowflake sql
Last synced: 19 Mar 2025
https://github.com/tencent/apijson
🏆 Real-time, coding-free, powerful and secure ORM 🚀 Provides APIs and docs without backend coding; the frontend (client) can customize the data and structure of the returned JSON
baas clickhouse crud databricks elasticsearch hadoop hive influxdb low-code lowcode milvus nocode oracle postgresql postgresql-database serverless snowflake sqlserver tdengine tidb
Last synced: 13 May 2025
https://github.com/Tencent/APIJSON
🏆 Real-time, coding-free, powerful and secure ORM 🚀 Provides APIs and docs without backend coding; the frontend (client) can customize the data and structure of the returned JSON
baas clickhouse crud databricks elasticsearch hadoop hive influxdb low-code lowcode milvus nocode oracle postgresql postgresql-database serverless snowflake sqlserver tdengine tidb
Last synced: 01 Apr 2025
https://github.com/codePhiliaX/Chat2DB
🔥🔥🔥 AI-driven database tool and SQL client; the hottest GUI client, supporting MySQL, Oracle, PostgreSQL, DB2, SQL Server, SQLite, H2, ClickHouse, and more.
ai bi chatgpt clickhouse clickhouse-client database datagrip db2 dbeaver gpt hive mysql navicat oracle postgresql redis redis-client sqlserver text2sql
Last synced: 26 Sep 2025
https://github.com/trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
analytics big-data data-science database databases datalake delta-lake distributed-database distributed-systems hadoop hive iceberg java jdbc presto prestodb query-engine sql trino
Last synced: 12 Nov 2025
https://github.com/isar/hive
Lightweight and blazing fast key-value database written in pure Dart.
dart database encryption flutter hive key-value nosql
Last synced: 14 May 2025
https://github.com/liyupi/sql-generator
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (logic-heavy, UI-light) that is good for practice.
ant-design bigdata hive javascript json monaco-editor mysql spark sql typescript vite vue vue3
Last synced: 14 May 2025
https://github.com/apache/linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
application-manager context-service engine hive hive-table impala jdbc jobserver linkis livy presto pyspark resource-manager rest-api scriptis spark sql storage thrift-server udf
Last synced: 11 May 2025
https://github.com/webankfintech/dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
airflow atlas azkaban dataworks davinci dolphinscheduler flink governance griffin hadoop hive hue kettle linkis spark supperset tableau visualis workflow zeppelin
Last synced: 14 May 2025
https://github.com/WeBankFinTech/DataSphereStudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
airflow atlas azkaban dataworks davinci dolphinscheduler flink governance griffin hadoop hive hue kettle linkis spark supperset tableau visualis workflow zeppelin
Last synced: 14 Mar 2025
https://github.com/luckyzxl2016/movie_recommend
A Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and the Spark recommendation system
hadoop hive mysql nginx scala scrapy spark-mllib spark-streaming ssm-maven
Last synced: 15 May 2025
https://github.com/LuckyZXL2016/Movie_Recommend
A Spark-based movie recommendation system, including a crawler project, a website, an admin back-end system, and the Spark recommendation system
hadoop hive mysql nginx scala scrapy spark-mllib spark-streaming ssm-maven
Last synced: 26 Mar 2025
https://github.com/geekyouth/szt-bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
cdh6 clickhouse docker elasticsearch flink hadoop hbase hive kafka kibana kylin mongodb mysql phoenix redis scala spark springboot szt-bigdata zookeeper
Last synced: 14 Apr 2025
https://github.com/geekyouth/SZT-bigdata
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
cdh6 clickhouse docker elasticsearch flink hadoop hbase hive kafka kibana kylin mongodb mysql phoenix redis scala spark springboot szt-bigdata zookeeper
Last synced: 28 Mar 2025
https://github.com/apache/kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
data-lake hacktoberfest hadoop hive jdbc kubernetes spark spark-sql sql thrift
Last synced: 13 May 2025
https://github.com/pinterest/querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
analyses celery charting flask hive metastore notebook presto typescript
Last synced: 14 May 2025
https://github.com/qihoo360/quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Last synced: 14 Apr 2025
https://github.com/Qihoo360/Quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Last synced: 27 Mar 2025
https://github.com/dropbox/PyHive
Python interface to Hive and Presto. 🐝
dbapi hive hiveserver2 presto python sqlalchemy
Last synced: 04 Apr 2025
https://github.com/dropbox/pyhive
Python interface to Hive and Presto. 🐝
dbapi hive hiveserver2 presto python sqlalchemy
Last synced: 14 May 2025
https://github.com/obenner/data-engineering-interview-questions
More than 2000 data engineering interview questions.
airflow avro aws azure cassandra data-engineering data-structures flink flume hadoop hadoop-hdfs hbase hive impala interview interview-questions kafka nifi spark sql
Last synced: 14 May 2025
https://github.com/docs4dev/docs4dev
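Documentation for commonly used back-end development frameworks, with Chinese translations. Covers the Spring family (Spring, Spring Boot, Spring Cloud, Spring Security, Spring Session), big data (Apache Hive, HBase, Apache Flume), logging (Log4j2, Logback), HTTP servers (NGINX, Apache), Python, databases (OpenTSDB, MySQL, PostgreSQL), and more: the latest official documentation and the corresponding Chinese translations.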
apache apache-flume hbase hive log4j2 mysql nginx opentsdb postgresql python spring spring-batch spring-boot spring-cloud
Last synced: 25 Oct 2025
https://github.com/OBenner/data-engineering-interview-questions
More than 2000 data engineering interview questions.
airflow avro aws azure cassandra data-engineering data-structures flink flume hadoop hadoop-hdfs hbase hive impala interview interview-questions kafka nifi spark sql
Last synced: 10 Apr 2025
https://github.com/dtstack/taier
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
azkaban chunjun cronjob-scheduler dag data-schedule distributed-schedule-system flink hadoop hive job-scheduler scheduler spark task-schedule workflow-scheduling-system
Last synced: 15 May 2025
https://github.com/DTStack/Taier
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
azkaban chunjun cronjob-scheduler dag data-schedule distributed-schedule-system flink hadoop hive job-scheduler scheduler spark task-schedule workflow-scheduling-system
Last synced: 27 Mar 2025
https://github.com/devlive-community/datacap
DataCap is integrated software for data transformation, integration, and visualization. It supports a variety of data sources, file types, big-data databases, relational databases, NoSQL databases, etc. The software can manage multiple data sources and perform various operations and conversions on the data under each source ...
clickhouse database db2 dremio druid elasticsearch h2 hive ignite kylin kyuubi monetdb mongodb mysql phoenix postgresql presto redis sqlserver trino
Last synced: 14 May 2025
https://github.com/macbre/sql-metadata
Uses tokenized query returned by python-sqlparse and generates query metadata
database hive hiveql metadata mysql-query parser python-package python3-library sql sql-parser sqlparse
Last synced: 13 May 2025
https://github.com/nielsbasjes/yauaa
Yet Another UserAgent Analyzer
analyzer apache-beam apache-flink apache-hive client-hints flink hive java nifi-processor nifi-processors parse snowflake snowplow snowplowanalytics trino-plugin user-agent user-agent-analysis user-agent-parser useragent-parser useragentparser
Last synced: 13 May 2025
https://github.com/ploomber/jupysql
Better SQL in Jupyter. 📊
bigquery clickhouse data-engineering data-science duckdb hive jupyter mysql polars postgres presto python redshift snowflake spark-sql sql sqlite trino tsql
Last synced: 04 Oct 2025
https://github.com/WeBankFinTech/Scriptis
Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
errorcode hive hive-table hql hue ide linkis pyspark resouce-management scala spark sql udf zeppelin
Last synced: 15 Jul 2025
https://github.com/WeBankFinTech/WeDataSphere
WeDataSphere is a financial grade, one-stop big data platform suite.
analytics bigdata data-analysis datafabric datagovernance dataspherestudio exchangis flink hadoop hive ide linkis prophecis qualitis schedulis scriptis spark streamis visualis
Last synced: 27 Mar 2025
https://github.com/yanagishima/yanagishima
Web UI for Trino, Hive and SparkSQL
elasticsearch hive spark trino
Last synced: 27 Mar 2025
https://github.com/turboway/spiderman
A general-purpose distributed crawler framework based on scrapy-redis
hbase hive kafka rdbm scapy-redis scrapy spiderman
Last synced: 04 Apr 2025
https://github.com/TurboWay/spiderman
A general-purpose distributed crawler framework based on scrapy-redis
hbase hive kafka rdbm scapy-redis scrapy spiderman
Last synced: 25 Mar 2025
https://github.com/running-elephant/moonbox
Moonbox is a DVtaaS (Data Virtualization as a Service) Platform
data-virtualization hive kudu moonbox spark virtual-database
Last synced: 04 Apr 2025
https://github.com/ErfanRht/MovieLab
An open source movie tracker and movie finder.
android dart flutter flutter-app flutter-apps getx hive imdb imdb-api movie-database movie-tracker movie-tracking movies
Last synced: 05 May 2025
https://github.com/ThreatLabz/ransomware_notes
An Archive of Ransomware Notes Past and Present Collected by Zscaler ThreatLabz
akira alphv blackbasta blackcat blacksuit cactus clop darkangels hive karakurt lockbit mallox malware malware-research medusa notes qilin ransomhub ransomware revil
Last synced: 10 Apr 2025
https://github.com/openhive-network/hive
Fast. Scalable. Powerful. The Blockchain for Web3
blockchain cryptocurrency dapps decentralization decentralized dpos fork hive openhive p2p platform social-network steem web3
Last synced: 09 May 2025
https://github.com/dtstack/dt-sql-parser
SQL Parsers for BigData, built with antlr4.
antlr4 autocompletion bigdata flink hive impala mysql parser postgresql spark sql sql-validation trino
Last synced: 14 May 2025
https://github.com/helicalinsight/helicalinsight
Helical Insight software is the world’s first open source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.
amazon-redshift big-data business-intelligence dashboard data-analysis data-visualization druid graph-database hive mongodb mysql neo4j nosql oracle-database postgresql rdbms reporting sql-editor sqllite
Last synced: 06 Apr 2025
https://github.com/DTStack/dt-sql-parser
SQL Parsers for BigData, built with antlr4.
antlr4 autocompletion bigdata flink hive impala mysql parser postgresql spark sql sql-validation trino
Last synced: 01 Apr 2025
https://github.com/ExpediaGroup/waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
federation hive hive-metastore metastore oss-portal-listed
Last synced: 04 Apr 2025
https://github.com/HiveRunner/HiveRunner
An Open Source unit test framework for Hive queries based on JUnit 4 and 5
hive hive-sql junit klarna-featured test-framework testing
Last synced: 13 May 2025
https://github.com/ecency/ecency-mobile
Ecency Mobile - reimagined social blogging, contribute and get rewarded (for Android and iOS)
android blockchain crypto ecency epoint esteem hive hiveio ios mobile react react-native reward rewarding social-media
Last synced: 20 Jun 2025
https://github.com/deandreamatias/tv-randshow
App to choose a random TV show episode - Made with #Flutter
android dart flare-animation flutter flutter-app flutter-apps hive sqflite streaming tmdb tv-randshow
Last synced: 12 Apr 2025
https://github.com/HariSekhon/HAProxy-configs
80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Kubernetes, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
apache-drill cassandra cloudera elasticsearch hacktoberfest hadoop haproxy hbase hive influxdb mapr mysql nosql opentsdb postgresql presto prometheus redis solrcloud zookeeper
Last synced: 07 Apr 2025
https://github.com/harisekhon/haproxy-configs
80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Kubernetes, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
apache-drill cassandra cloudera elasticsearch hacktoberfest hadoop haproxy hbase hive influxdb mapr mysql nosql opentsdb postgresql presto prometheus redis solrcloud zookeeper
Last synced: 09 Apr 2025
https://github.com/isxcode/spark-yun
Ultra-Lightweight AI-Powered Big Data Center | 至轻云 (Zhiqingyun), an ultra-lightweight intelligent big data center
apache cdh data-analysis docker flink hadoop hive kubernetes saas spark
Last synced: 01 Sep 2025
https://github.com/qihoo360/xsql
Unified SQL Analytics Engine Based on SparkSQL
datasource elasticsearch federation hive spark sql
Last synced: 09 Apr 2025
https://github.com/damoonrashidi/bitalarm
An app to keep track of different cryptocurrencies, written in dart + flutter
cryptocurrencies dart flutter flutter-provider hacktoberfest hacktoberfest2020 hive
Last synced: 03 May 2025
https://github.com/xnuinside/simple-ddl-parser
Simple DDL Parser to parse SQL (HQL, TSQL, AWS Redshift, BigQuery, Snowflake and other dialects) ddl files to json/python dict with full information about columns: types, defaults, primary keys, etc. & table properties, types, domains, etc.
bigquery columns ddl ddl-parser ddls hive hql mssql mysql oracle-database oracle-db parser postgresql redshift schemas snowflake sql sql-parser tsql types
Last synced: 16 May 2025
https://github.com/anfeichtinger/flutter_production_boilerplate
A flutter boilerplate project containing bloc, lints, hive, easy_translations and more!
bloc boilerplate flutter flutter-apps flutter-demo flutter-examples hive lints localization
Last synced: 01 May 2025
https://github.com/yaooqinn/spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo has been contributed to Apache Kyuubi | The project has migrated to Apache Kyuubi
acl hive ranger ranger-hive-plugin spark
Last synced: 13 Apr 2025
https://github.com/singgel/springboot-templates
Spring Boot integrations with Dubbo and Netty; NoSQL templates for Redis and MongoDB; MQ templates for Kafka, RocketMQ, and RabbitMQ; and Solr, SolrCloud, and Elasticsearch query engines
dubbo elasticsearch hbase hive kafka logback lucene mongodb mybatis netty participle rabbitmq redis rocketmq solr spring-boot springboot starter swagger zookeeper
Last synced: 03 Apr 2025
https://github.com/elias8/last_fm
A simple app to demonstrate a testable, maintainable, and scalable architecture for flutter. flutter_bloc, get_it, hive, and REST API are some of the tech stacks used in this project.
bloc clean-architecture dart dependency-injection flutter hive lastfm layered-architecture music rest-api test
Last synced: 12 Sep 2025
https://github.com/zuinnote/hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
bigdata bitcoin blockchain cryptoledger ethereum flink hadoop hive spark
Last synced: 13 Apr 2025
https://github.com/vb10/architecture_template_v2
"Flutter Architecture Template v2"
arcihtecture auto-route bloc easy-localization flutter hive strcutre
Last synced: 05 Aug 2025
https://github.com/yahoo/maha
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
analytics api-framework big-data druid druid-lookups druid-manager hive oracle postgresql presto scala sql star-schema
Last synced: 06 Apr 2025
https://github.com/smart-data-lake/smart-data-lake
Smart Automation Tool for building modern Data Lakes and Data Pipelines
data-lake data-pipelines deltalake hadoop hive scala smart-data-lake spark transform-data
Last synced: 13 Apr 2025
https://github.com/233zzh/TitanDataOperationSystem
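The best big data project. "Titan Data Operation System" is a full-stack, closed-loop system: a web system for data visualization, log ingestion via flume-kafka-flume, a data warehouse designed in Hive, Spark code for transforming warehouse tables and migrating ADS-layer tables to MySQL, and Azkaban for scheduling timed jobs. Technologies used: Java/Scala, Hadoop, Spark, Hive, Kafka, Flume, Azkaban, SpringBoot, Bootstrap, Echart, etc.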
azkaban flume hadoop hive kafka spark
Last synced: 27 Mar 2025
https://github.com/snowch/movie-recommender-demo
This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of Jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (Kafka) to push application events to a topic, where they are consumed by a Spark Streaming job running on IBM BigInsights (Hadoop).
alternating-least-squares biginsights bluemix bokeh cloudant collaborative-filtering dsx hadoop hive ibm-biginsights ibm-bluemix jupyter-notebook kafka machine-learning messagehub notebook python-flask-application redis spark spark-streaming
Last synced: 22 Apr 2025
https://github.com/taogeyt/pyetl
Python ETL framework
csv data-analytics data-pipeline data-platform db es etl etl-process excel export hive mysql oracle python sql sqlserver
Last synced: 30 Oct 2025
https://github.com/apache/doris-website
Apache Doris Website
analytics apache big-data data-warehousing database datalake dbms distributed-system doris hadoop hive hudi iceberg mpp olap ssb tpch vectorized
Last synced: 15 May 2025
https://github.com/harisekhon/devops-perl-tools
25+ DevOps CLI Tools - Anonymizer, SQL ReCaser (MySQL, PostgreSQL, AWS Redshift, Snowflake, Apache Drill, Hive, Impala, Cassandra CQL, Microsoft SQL Server, Oracle, Couchbase N1QL, Dockerfiles), Hadoop HDFS & Hive tools, Solr/SolrCloud CLI, Nginx stats & HTTP(S) URL watchers for load-balanced web farms, Linux tools etc.
anonymize apache-drill cassandra couchbase docker hacktoberfest hadoop hbase hdfs hive kerberos linux mysql neo4j nginx recaser solr solrcloud sql
Last synced: 13 Jun 2025
https://github.com/expediagroup/circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
big-data bigquery hive hive-metastore hive-table replicate-data replication s3
Last synced: 21 Aug 2025
https://github.com/ExpediaGroup/circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
big-data bigquery hive hive-metastore hive-table replicate-data replication s3
Last synced: 13 May 2025
https://github.com/groupon/luigi-warehouse
A luigi powered analytics / warehouse stack
aws etl google-sheets hive luigi mysql postgresql python redshift salesforce spark teradata typeform workflow
Last synced: 23 Jul 2025
https://github.com/cdapio/hadoop_cookbook
Cookbook to install Hadoop 2.0+ using Chef
chef chef-cookbook cookbooks hadoop hadoop-cookbook hbase hive spark zookeeper
Last synced: 07 Sep 2025