https://github.com/dharmeshkakadia/awesome-hive

# awesome-hive
Everything about [Apache Hive](https://hive.apache.org) that is awesome

* [UDFs](#udfs-user-defined-functions)
* [Execution Engines](#execution-engines)
* [Client Libraries](#client-libraries)
* [UI](#ui)
* [Tools](#tools)
* [Metastore Tools](#metastore-tools)
* [Data Replication Tools](#data-replication-tools)
* [Deployment](#deployment)
* [Testing](#testing)
* [Unsorted](#unsorted)
* [Integrations](#integrations)
* [Benchmarks](#benchmarks)
* [Benchmark Kits](#benchmark-kits)
* [Cloud offerings](#cloud-offerings)
* [Resources](#resources)

## UDFs (User Defined Functions)
* http://nexr.github.io/hive-udf/
* https://github.com/Netflix/Surus
* https://github.com/deanwampler/HiveUDFs
* https://github.com/karthkk/udfs
* https://github.com/myui/hivemall
* https://github.com/ThinkBigAnalytics/Hive-Extensions-from-Think-Big-Analytics
* https://github.com/twitter/elephant-bird
* https://github.com/lovelysystems/ls-hive
* https://github.com/klout/brickhouse
* https://github.com/stewi2/hive-udfs

## Execution Engines
* [Tez](https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez)
* MapReduce
* [LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP)
* [Spark](https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
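
Hive picks its engine through the `hive.execution.engine` property, which accepts `mr`, `tez`, or `spark`. LLAP has no engine value of its own and is typically enabled alongside Tez. As a minimal sketch, the hypothetical helper `engine_statement` below (not part of Hive or of any project listed here) builds the `SET` statement a session would issue to switch engines.

```python
# Values accepted by hive.execution.engine (MapReduce is spelled "mr").
ENGINES = {"mapreduce": "mr", "tez": "tez", "spark": "spark"}


def engine_statement(name: str) -> str:
    """Return the HiveQL SET statement that selects an execution engine.

    Hypothetical helper, for illustration only.
    """
    key = name.strip().lower()
    if key not in ENGINES:
        raise ValueError(f"unknown engine: {name!r}")
    return f"SET hive.execution.engine={ENGINES[key]};"


if __name__ == "__main__":
    print(engine_statement("Tez"))
```

The returned statement can be run in Beeline or sent through any of the client libraries before the query it should apply to.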

## Storage Handlers
* [Kafka](https://github.com/HiveKa/HiveKa)
* [HBase](https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration)
* [JDBC](https://issues.apache.org/jira/browse/HIVE-1555)
* [Google Spreadsheet](https://github.com/balshor/gdata-storagehandler)
* [VoltDB](https://issues.voltdb.com/browse/ENG-10736?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel)
* [Phoenix](https://phoenix.apache.org/hive_storage_handler.html)
* [Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/hadoop/current/hive.html)
* [Solr](https://github.com/chimpler/hive-solr)
* [MongoDB](https://github.com/yc-huang/Hive-mongo)
* [Azure Tables](https://github.com/mooso/azure-tables-hadoop)
* [Cassandra](https://issues.apache.org/jira/browse/HIVE-1434)
* [HyperTable](https://code.google.com/archive/p/hypertable/wikis/HiveExtension.wiki)
* [Kinesis](https://github.com/qubole/kinesis-storage-handler)
* [Oracle NoSQL](https://github.com/vilcek/HiveKVStorageHandler2)
* [MySQL](https://github.com/sjywying/hive-mysql-storage-handler)
* [Omniture](https://github.com/bartekdobija/hive-omniture-storage-handler)
* [Vertica](https://github.com/bryanherger/VerticaHiveStorageHandler)
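
A storage handler is attached to a table with the `STORED BY` clause. As a rough sketch, the hypothetical helper `storage_handler_ddl` below assembles such a statement; the table, column, and property values mirror the example in the HBase integration page linked above, and other handlers expect their own class name and properties.

```python
def storage_handler_ddl(table, columns, handler_class,
                        serde_properties=None, table_properties=None):
    """Build a CREATE TABLE ... STORED BY statement.

    Hypothetical helper for illustration; values are not escaped,
    so it should only be fed trusted input.
    """
    def render(props):
        return ", ".join(f'"{k}" = "{v}"' for k, v in props.items())

    column_list = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    lines = [
        f"CREATE TABLE {table} ({column_list})",
        f"STORED BY '{handler_class}'",
    ]
    if serde_properties:
        lines.append(f"WITH SERDEPROPERTIES ({render(serde_properties)})")
    if table_properties:
        lines.append(f"TBLPROPERTIES ({render(table_properties)})")
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(storage_handler_ddl(
        "hbase_table_1",
        [("key", "int"), ("value", "string")],
        "org.apache.hadoop.hive.hbase.HBaseStorageHandler",
        serde_properties={"hbase.columns.mapping": ":key,cf1:val"},
        table_properties={"hbase.table.name": "xyz"},
    ))
```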

## Client Libraries
* JDBC
* ODBC
* [Python](https://github.com/dropbox/PyHive)
* [Clojure](https://github.com/bmuller/clive)
* [Go Client for Hive Metastore](https://github.com/akolb1/gometastore)
* [hclient - standalone Thrift HMS client and benchmarking tools](https://github.com/akolb1/hclient)
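
For a sense of what client access looks like, here is a minimal sketch using PyHive's DB-API interface. The helper `fetch_rows` is hypothetical, and the host and port are assumptions (a local HiveServer2 on its default port, 10000). The `connect` parameter lets any DB-API 2.0 connection factory stand in for Hive, which is handy for trying the function without a cluster.

```python
def fetch_rows(sql, connect=None):
    """Run `sql` and return all rows from a DB-API 2.0 connection.

    Hypothetical helper for illustration only.
    """
    if connect is None:
        # Imported lazily so the sketch loads without PyHive installed.
        from pyhive import hive

        def connect():
            # Assumed endpoint: a local HiveServer2 on its default port.
            return hive.connect(host="localhost", port=10000)

    connection = connect()
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()


if __name__ == "__main__":
    print(fetch_rows("SHOW TABLES"))
```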

## UI

## Tools

### Metastore Tools
* [Drone Fly - distributed Hive metastore events forwarder service](https://github.com/ExpediaGroup/drone-fly)
* [Shunting Yard - real-time data replication between Hive Metastores](https://github.com/ExpediaGroup/shunting-yard)

### Data Replication Tools
* [Warehouse replicator](https://github.com/airbnb/reair)
* [Circus Train - dataset replication tool between clusters and clouds](https://github.com/HotelsDotCom/circus-train)
* [Dumping Machine](https://github.com/grupozap/dumping-machine)

### Deployment
* [KDP - Kubernetes-Data-Platform](https://github.com/smartcitiesdata/kdp)
* [Hive Metastore on Staroid (managed Kubernetes)](https://github.com/open-datastudio/hive-metastore)

### Testing
* [Datafaker](https://github.com/gangly/datafaker)
* [HiveRunner](https://github.com/klarna/HiveRunner)
* [Beeju - JUnit rules for testing Metastore Thrift API](https://github.com/HotelsDotCom/beeju)

### Unsorted
* [Database manager for Hive](https://github.com/flaminem/flamy)
* [Waggle Dance - Hive federation service](https://github.com/HotelsDotCom/waggle-dance)

## Integrations
* [apiary](https://github.com/ExpediaGroup/apiary)
* [beekeeper - service for automatically managing and cleaning up unreferenced data](https://github.com/ExpediaGroup/beekeeper)
* [cube.js](https://github.com/cube-js/cube.js)
* [Quicksql](https://github.com/Qihoo360/Quicksql)

## Benchmarks
* [TPC-H](https://github.com/dharmeshkakadia/tpch-datagen-as-hive-query)
* [TPC-DS](https://github.com/dharmeshkakadia/tpcds-datagen-as-hive-query)
* [TPCx-BB/BigBench](http://www.tpc.org/tpcx-bb/default5.asp)

## Benchmark Kits
* [HivePerformanceAutomation](https://github.com/hdinsight/HivePerformanceAutomation)
* [hive-testbench](https://github.com/hortonworks/hive-testbench)

## Cloud offerings
* [Azure HDInsight](https://azure.microsoft.com/en-us/services/hdinsight/)
* [Amazon EMR](https://aws.amazon.com/emr/)
* [Google Dataproc](https://cloud.google.com/dataproc/)

## Resources
* https://cwiki.apache.org/confluence/display/Hive/Presentations

## Books

## Mailing Lists

## Conferences