https://github.com/mnubo/flink-elasticsearch-source-connector

Allow to pipe the result of an Elasticsearch query into a Flink data set

Host: GitHub
URL: https://github.com/mnubo/flink-elasticsearch-source-connector
Owner: mnubo
License: apache-2.0
Created: 2016-05-24T14:03:14.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2016-05-30T15:53:09.000Z (over 8 years ago)
Last Synced: 2024-03-27T03:11:11.709Z (8 months ago)
Language: Scala
Size: 45.9 KB
Stars: 8
Watchers: 10
Forks: 7
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Apache Flink source connector for Elasticsearch

Pipes the result of an Elasticsearch query into a Flink data set. Supports Scala and Java tuples, case classes, POJOs, and a variable-length result set called `DataRow`.

## Usage:

### build.sbt

    libraryDependencies += "com.mnubo" %% "flink-elasticsearch-source-connector" % "1.0.0-flink1"

### then:

    import com.mnubo.flink.streaming.connectors.DataRow
    import com.mnubo.flink.streaming.connectors.elasticsearch.ElasticsearchDataset
    import org.apache.flink.api.scala._

    // Index to read from, and the cluster nodes to contact over HTTP
    val esIndexName = "my_es_index"
    val esNodeHostNames = Set("es_node_1", "es_node_2", "es_node_3")
    val esHttpPort = 9200

    // The query must list the fields to retrieve
    val esQuery = """{"fields": ["some_string","some_boolean","some_long","some_date","sub_doc.sub_doc_id"]}"""

    val dataSet = ElasticsearchDataset.fromElasticsearchQuery[DataRow](
      ExecutionEnvironment.getExecutionEnvironment,
      esIndexName,
      esQuery,
      esNodeHostNames,
      esHttpPort
    )

    // Group rows by sub-document id, sum the field at index 2, and print the result
    dataSet
      .groupBy("sub_doc.sub_doc_id")
      .sum(2)
      .print()

The Elasticsearch query must contain a `fields` field.
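One way to make sure the `fields` requirement is always met is to generate the query string from a list of field names instead of writing the JSON by hand. The sketch below uses only the Scala standard library. `FieldsQuery` is a hypothetical helper, not part of this connector, and it assumes field names contain no control characters.

```scala
// Hypothetical helper (not part of the connector): builds a query body
// that always carries the required `fields` array.
object FieldsQuery {
  // Escape backslashes and double quotes so each name is a valid JSON string.
  // Assumption: names contain no control characters, which are not escaped here.
  private def quote(name: String): String =
    "\"" + name.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Returns a query of the form {"fields": ["a","b"]}.
  // Rejects an empty list, since the query would then select nothing.
  def apply(fields: Seq[String]): String = {
    require(fields.nonEmpty, "at least one field name is required")
    fields.map(quote).mkString("""{"fields": [""", ",", "]}")
  }
}
```

For example, `FieldsQuery(Seq("some_string", "some_long"))` returns `{"fields": ["some_string","some_long"]}`, which can be passed as the `esQuery` argument in the usage example.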

Aggregations are not supported.

Tested with Elasticsearch 1.5.2, 1.7.5, and 2.3.3.