https://github.com/implydata/druid-hadoop-inputformat

Hadoop InputFormat for http://druid.io/

## Druid Hadoop InputFormat

This is a Hadoop InputFormat that can be used to load Druid data from deep storage.

### Installation

To install this library, run `mvn install`. You can then include it in projects with Maven by using the dependency:

```xml
<dependency>
  <groupId>io.imply</groupId>
  <artifactId>druid-hadoop-inputformat</artifactId>
  <version>0.1-SNAPSHOT</version>
</dependency>
```

### Example

Here's an example of creating an RDD in Spark:

```java
final JobConf jobConf = new JobConf();
final String coordinatorHost = "localhost:8081";
final String dataSource = "wikiticker";
final List<Interval> intervals = null; // null to include all time
final DimFilter filter = null; // null to include all rows
final List<String> columns = null; // null to include all columns

DruidInputFormat.setInputs(
    jobConf,
    coordinatorHost,
    dataSource,
    intervals,
    filter,
    columns
);

// jsc is an existing JavaSparkContext
final JavaPairRDD<NullWritable, InputRow> rdd = jsc.newAPIHadoopRDD(
    jobConf,
    DruidInputFormat.class,
    NullWritable.class,
    InputRow.class
);
```
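
In the example above, `null` means "no restriction" for `intervals`, `filter`, and `columns`. The following is a minimal plain-Java sketch of that convention for the `columns` argument only. `ColumnProjection`, its `project` method, and the use of a plain `Map` in place of Druid's `InputRow` are all hypothetical. The sketch illustrates the semantics and is not how the library is implemented.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of druid-hadoop-inputformat.
final class ColumnProjection {
    private ColumnProjection() {}

    /**
     * Returns a copy of row restricted to the requested columns.
     * A null columns list means "no restriction", mirroring the
     * convention used for the arguments in the example above.
     */
    static Map<String, Object> project(Map<String, Object> row, List<String> columns) {
        if (columns == null) {
            return new LinkedHashMap<>(row); // null: keep every column
        }
        final Map<String, Object> projected = new LinkedHashMap<>();
        for (String column : columns) {
            if (row.containsKey(column)) {
                projected.put(column, row.get(column)); // keep only requested columns
            }
        }
        return projected;
    }
}
```

Calling `ColumnProjection.project(row, null)` returns every entry of `row`. Passing a list such as `Arrays.asList("page")` keeps only that column and silently skips names the row does not contain.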