https://github.com/mnubo/flink-elasticsearch-source-connector
Pipes the result of an Elasticsearch query into a Flink data set.
- Host: GitHub
- URL: https://github.com/mnubo/flink-elasticsearch-source-connector
- Owner: mnubo
- License: apache-2.0
- Created: 2016-05-24T14:03:14.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2016-05-30T15:53:09.000Z (over 8 years ago)
- Last Synced: 2024-03-27T03:11:11.709Z (8 months ago)
- Language: Scala
- Size: 45.9 KB
- Stars: 8
- Watchers: 10
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Apache Flink source connector for Elasticsearch
Pipes the result of an Elasticsearch query into a Flink data set. Supports Scala and Java tuples, case classes, POJOs, and a variable-length result set called DataRow.
## Usage:
### build.sbt
    libraryDependencies += "com.mnubo" %% "flink-elasticsearch-source-connector" % "1.0.0-flink1"
### then:
    import com.mnubo.flink.streaming.connectors.DataRow
    import com.mnubo.flink.streaming.connectors.elasticsearch.ElasticsearchDataset
    import org.apache.flink.api.scala._

    val esIndexName = "my_es_index"
    val esNodeHostNames = Set("es_node_1", "es_node_2", "es_node_3")
    val esHttpPort = 9200
    // The query must list the fields to retrieve.
    val esQuery = """{"fields": ["some_string","some_boolean","some_long","some_date","sub_doc.sub_doc_id"]}"""

    val dataSet = ElasticsearchDataset.fromElasticsearchQuery[DataRow](
      ExecutionEnvironment.getExecutionEnvironment,
      esIndexName,
      esQuery,
      esNodeHostNames,
      esHttpPort
    )

    // Group rows by sub-document ID and sum the field at index 2
    // (some_long in the query above).
    dataSet
      .groupBy("sub_doc.sub_doc_id")
      .sum(2)

The Elasticsearch query must contain a `fields` field.
Aggregations are not supported.
Tested with Elasticsearch 1.5.2, 1.7.5, and 2.3.3.
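
Because every query needs a `fields` entry, it can help to build the query string from a list of field names instead of writing the JSON by hand. The sketch below is one possible way to do that. `EsQueries` and `fieldsQuery` are hypothetical names for illustration and are not part of the connector; the output follows the `{"fields": [...]}` shape used in the usage example above.

```scala
// Hypothetical helper, not part of the connector: builds a query body
// of the form {"fields": [...]} from a list of field names.
object EsQueries {
  // Escape backslashes and double quotes so each name is a valid JSON string.
  private def jsonString(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Fails fast on an empty list, since the query must name at least one field.
  def fieldsQuery(fieldNames: Seq[String]): String = {
    require(fieldNames.nonEmpty, "at least one field name is required")
    fieldNames.map(jsonString).mkString("""{"fields": [""", ",", "]}")
  }
}
```

The result can then be passed wherever `esQuery` is expected, for example `val esQuery = EsQueries.fieldsQuery(Seq("some_string", "some_long", "sub_doc.sub_doc_id"))`.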