https://github.com/HiveKa/HiveKa
Kafka as Hive Storage
- Host: GitHub
- URL: https://github.com/HiveKa/HiveKa
- Owner: HiveKa
- License: apache-2.0
- Created: 2014-11-06T18:53:58.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2014-11-13T04:32:08.000Z (over 10 years ago)
- Last Synced: 2024-05-07T18:22:22.845Z (12 months ago)
- Language: Java
- Size: 375 KB
- Stars: 67
- Watchers: 14
- Forks: 21
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-hive - Kafka
README
# Hive Storage Handler for Kafka
HiveKa is a storage handler for Apache Hive that adds support for querying data in Apache Kafka. It lets Kafka users inspect data ingested by Kafka without writing complex Kafka consumers. Hive makes it possible to run complex analytical queries across various data sources, such as HDFS, Solr, and HBase; HiveKa extends this support to Kafka.
Visit our [website](http://hiveka.weebly.com/).
To create a Kafka table in Hive, run:
```
create external table test_kafka (a int, b string)
stored by 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
tblproperties(
  'kafka.service.uri'='hivekafka-1.ent.cloudera.com:9092',
  'kafka.whitelist.topics'='test4',
  'kafka.avro.schema.file'='/tmp/test.avsc'
);
```
To generate Avro byte data into a topic, run our DemoProducer and pass the topic, the number of messages, and a Kafka broker as parameters.
For example:
```
java -classpath "/opt/cloudera/parcels/CDH/lib/avro/*:hive-kafka-1.0-SNAPSHOT.jar:/usr/lib/hive/*" org.apache.hadoop.hive.kafka.demoproducer.DemoProducer test4 10 hivekafka-1:9092
```
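To give a sense of what "Avro byte data" looks like on the wire, here is a minimal sketch of Avro's binary encoding for a record with an int field `a` and a string field `b`, matching the columns of the example table. It is not the DemoProducer's source, which is not shown here. The class and method names (`AvroRecordSketch`, `encodeRecord`) are hypothetical, and it assumes each message carries only the raw record body with no container-file header. It uses only the Java standard library; a real producer would normally use the Avro and Kafka client libraries instead.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: hand-rolled Avro binary encoding
// for a record shaped like {a: int, b: string}.
public class AvroRecordSketch {

    // Avro writes int/long as a zig-zag value in variable-length (7 bits per byte) form.
    public static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);          // zig-zag: small magnitudes become small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits, continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                      // final byte, continuation bit clear
    }

    // Avro writes a string as its UTF-8 byte length (as a long) followed by the bytes.
    public static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // A record is the concatenation of its fields in schema order: a, then b.
    public static byte[] encodeRecord(int a, String b) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, a);
        writeString(out, b);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = encodeRecord(1, "hi");
        StringBuilder hex = new StringBuilder();
        for (byte x : payload) {
            hex.append(String.format("%02x ", x));
        }
        System.out.println(hex.toString().trim()); // prints "02 04 68 69"
    }
}
```

The field order in the encoded bytes must match the field order in the schema file referenced by `kafka.avro.schema.file`, since Avro's binary format carries no field names.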