https://github.com/datacleaner/extension_elasticsearch


ElasticSearch for DataCleaner
=======================

This is a DataCleaner (http://datacleaner.org) extension that uses the ElasticSearch (http://www.elasticsearch.org/) search engine to index and search reference data.

Currently the extension contains these DataCleaner components:

* ElasticSearch indexer (*Analyze* menu)

This component populates a new or existing search index with the records you feed into it. Each record becomes a document in the search index, and each column of the record must be mapped to a field in the index.
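
The column-to-field mapping can be pictured with a small Python sketch. This is not the extension's code: the function name `record_to_document`, the sample columns, and the field names are all hypothetical, and the sketch only builds a document in memory without contacting ElasticSearch.

```python
def record_to_document(record, column_to_field):
    """Build one search-index document from one record.

    Only columns named in the mapping are copied; each value is stored
    under the field name the column is mapped to.
    """
    missing = [col for col in column_to_field if col not in record]
    if missing:
        raise KeyError(f"record lacks mapped columns: {missing}")
    return {field: record[col] for col, field in column_to_field.items()}


# Hypothetical reference-data record and column-to-field mapping.
record = {"CUSTOMER_ID": 42, "FULL_NAME": "Jane Doe", "CITY": "Copenhagen"}
mapping = {"FULL_NAME": "name", "CITY": "city"}

document = record_to_document(record, mapping)
print(document)
```

Columns left out of the mapping (here `CUSTOMER_ID`) simply do not appear in the resulting document.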

* ElasticSearch document ID lookup (*Transform* menu)

Performs a document lookup for each record, based on ID. This transformation is the equivalent of looking up records in a database by their primary key.
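
The primary-key analogy can be illustrated with a plain dictionary standing in for the search index. Everything in this sketch is an assumption made for illustration (the `lookup_by_id` helper, the `CUSTOMER_ID` column, the sample documents); it shows the shape of the lookup, not how the extension talks to ElasticSearch.

```python
# In-memory stand-in for a search index: document ID -> document.
index = {
    "c-1": {"name": "Jane Doe", "city": "Copenhagen"},
    "c-2": {"name": "John Smith", "city": "Aarhus"},
}


def lookup_by_id(index, record, id_column):
    """Return the document whose ID equals the record's ID column, or None."""
    return index.get(record[id_column])


# One lookup per incoming record; an unknown ID yields None.
records = [{"CUSTOMER_ID": "c-2"}, {"CUSTOMER_ID": "c-9"}]
results = [lookup_by_id(index, r, "CUSTOMER_ID") for r in records]
print(results)
```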

* ElasticSearch full text search (*Transform* menu)

Performs a search against a search index for each record. The component can search across all fields or match against one specific field. The result of the transformation is a Document ID and a Document (represented as a map), which can be processed further by, for example, the built-in Data structures (*Transform* menu) components of DataCleaner.
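
As a rough illustration of the two search modes, the sketch below builds an ElasticSearch query DSL body with a `match` query for a single field and a `multi_match` query for all fields, then pulls the ID and document out of a response. Which query types the extension actually issues is not stated here, so treat the query choice, the `"*"` field wildcard (whose behaviour depends on the ElasticSearch version), the top-hit-only behaviour, and the helper names `build_search_body` and `first_hit` as assumptions. The sample response is hand-written, not real output.

```python
import json


def build_search_body(text, field=None):
    """Build a query DSL body: one field if given, otherwise all fields."""
    if field is None:
        # Assumption: a wildcard multi_match stands in for "all fields".
        query = {"multi_match": {"query": text, "fields": ["*"]}}
    else:
        query = {"match": {field: text}}
    # Assumption: only the best-scoring hit is wanted per record.
    return {"size": 1, "query": query}


def first_hit(response):
    """Return (document ID, document map) of the top hit, or (None, None)."""
    hits = response.get("hits", {}).get("hits", [])
    if not hits:
        return None, None
    return hits[0]["_id"], hits[0]["_source"]


# Hand-written sample in the shape of a search response.
sample_response = {
    "hits": {"hits": [{"_id": "c-1", "_source": {"name": "Jane Doe"}}]}
}

print(json.dumps(build_search_body("Jane", field="name")))
print(first_hit(sample_response))
```

The pair returned by `first_hit` mirrors the component's two outputs: an ID and a map that downstream components can unpack.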

Please feel free to fork, and to provide feedback in any form.