https://github.com/implydata/druid-hadoop-inputformat

Hadoop InputFormat for http://druid.io/

Host: GitHub
URL: https://github.com/implydata/druid-hadoop-inputformat
Owner: implydata
License: apache-2.0
Created: 2016-10-26T14:56:49.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2016-10-26T14:57:15.000Z (over 9 years ago)
Last Synced: 2024-04-14T20:22:57.915Z (about 2 years ago)
Language: Java
Size: 11.7 KB
Stars: 10
Watchers: 6
Forks: 7
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

## Druid Hadoop InputFormat

This is a Hadoop InputFormat that can be used to load Druid data from deep storage.

### Installation

To install this library, run `mvn install`. You can then include it in projects with Maven by using the dependency:

```xml
<dependency>
  <groupId>io.imply</groupId>
  <artifactId>druid-hadoop-inputformat</artifactId>
  <version>0.1-SNAPSHOT</version>
</dependency>
```

### Example

Here's an example of creating an RDD in Spark:

```java
final JobConf jobConf = new JobConf();
final String coordinatorHost = "localhost:8081";
final String dataSource = "wikiticker";
final List<Interval> intervals = null; // null to include all time
final DimFilter filter = null;         // null to include all rows
final List<String> columns = null;     // null to include all columns

// Store the Druid input settings in the job configuration.
DruidInputFormat.setInputs(
    jobConf,
    coordinatorHost,
    dataSource,
    intervals,
    filter,
    columns
);

// jsc is an existing JavaSparkContext.
final JavaPairRDD<NullWritable, InputRow> rdd = jsc.newAPIHadoopRDD(
    jobConf,
    DruidInputFormat.class,
    NullWritable.class,
    InputRow.class
);
```
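
The example above passes `null` for `intervals` and `columns`, which selects everything. The following is a minimal sketch, using only the JDK, of preparing narrower values: a list of column names and a one-day time range written as an ISO-8601 `start/end` string. The class name `DruidInputArgs`, the helper `dayInterval`, the column names, and the date are all hypothetical. It is also an assumption, not something this README confirms, that `intervals` holds Joda-Time `Interval` objects; if it does, a string in this form could be converted with Joda-Time's `Interval.parse`.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;

public class DruidInputArgs {
    // Builds an ISO-8601 "start/end" string covering one UTC day.
    // The end instant is midnight of the following day.
    static String dayInterval(String isoDate) {
        final LocalDate day = LocalDate.parse(isoDate);
        final Instant start = day.atStartOfDay(ZoneOffset.UTC).toInstant();
        final Instant end = day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant();
        return start + "/" + end;
    }

    public static void main(String[] args) {
        // Hypothetical column names; substitute the ones in your datasource.
        final List<String> columns = Arrays.asList("page", "user", "added");
        // Illustrative date only.
        final String interval = dayInterval("2015-09-12");
        System.out.println(columns);
        System.out.println(interval);
    }
}
```

Restricting columns and time ranges up front may reduce how much data has to be read from deep storage, though the README does not describe how the InputFormat applies them.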
