https://github.com/mooso/azure-tables-hadoop
A Hadoop input format and a Hive storage handler so that you can access data stored in Windows Azure Storage tables from within a Hadoop (or HdInsight) cluster.
https://github.com/mooso/azure-tables-hadoop
Last synced: 5 months ago
JSON representation
A Hadoop input format and a Hive storage handler so that you can access data stored in Windows Azure Storage tables from within a Hadoop (or HdInsight) cluster.
- Host: GitHub
- URL: https://github.com/mooso/azure-tables-hadoop
- Owner: mooso
- License: apache-2.0
- Created: 2013-05-20T03:45:31.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2014-10-20T19:03:21.000Z (over 10 years ago)
- Last Synced: 2024-05-29T17:16:00.743Z (11 months ago)
- Language: Java
- Homepage:
- Size: 360 KB
- Stars: 25
- Watchers: 11
- Forks: 15
- Open Issues: 3
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- awesome-hive - Azure Tables
README
# azure-tables-hadoop
## Description
A Hadoop input format and a Hive storage handler so that you can access data stored in Windows Azure Storage tables from within a Hadoop (or HdInsight) cluster.
## Usage
### From Hive
Make sure the jar is in hive.aux.jars.path (add it in your hive-site.xml). Example:
hive.aux.jars.path
file:///c:/azure-tables-hadoop/target/microsoft-hadoop-azure-0.0.1.jar
Create the table like this:CREATE EXTERNAL TABLE az_table(intField int, stringField string)
STORED BY 'com.microsoft.hadoop.azure.hive.AzureTableHiveStorageHandler'
TBLPROPERTIES(
"azure.table.name" = "",
"azure.table.account.uri" = "http://.table.core.windows.net",
"azure.table.storage.key" = ""
);#### Caveats
* Column names should match (case doesn't matter)
* Only {string, int, bigint, double, boolean} types are supported
* Storing data into the table is not supported### From Map-Reduce
There's an example in the code, SixCounter, that you can use to guide you. Short story: use AzureTableConfiguration.configureInputTable() to configure the input table, and set your input format as AzureTableInputFormat.