# ElasticSearch Extensions for datapackage-pipelines

## Install

```
# use pip install

pip install datapackage-pipelines-elasticsearch

# OR clone the repo and install it with pip

git clone https://github.com/frictionlessdata/datapackage-pipelines-elasticsearch.git
cd datapackage-pipelines-elasticsearch
pip install -e .
```

## Usage

You can use datapackage-pipelines-elasticsearch as a plugin for [dpp](https://github.com/frictionlessdata/datapackage-pipelines#datapackage-pipelines). In `pipeline-spec.yaml` it will look like this:

```yaml
...
- run: elasticsearch.dump.to_index
```

### ***`dump.to_index`***

Saves the datapackage to an ElasticSearch instance.

_Parameters_:

- `engine` - Connection string for the ElasticSearch instance (URL syntax).
  Also supports `env://<environment-variable>`, which indicates that the connection string should be fetched from the named environment variable.
  If not specified, defaults to `env://DPP_ELASTICSEARCH`.
  The environment variable should take the form `host:port` or a fully qualified URL (e.g. `https://user:pass@host:port` or `https://host:port`).
- `indexes` - Mapping between resources and indexes. Keys are index names; each value is a list of objects with the following attributes:
  - `resource-name` - Name of the resource that should be dumped to the index
  - `doc-type` - The document type to use when indexing documents
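
As a rough illustration of how these parameters fit together, a step in `pipeline-spec.yaml` might look like the sketch below. The pipeline id (`my-pipeline`), index name (`my-index`), resource name (`my-resource`) and document type (`my-doc-type`) are placeholders invented for this example, not names defined by the plugin; the earlier steps that load the resource are left elided.

```yaml
# pipeline-spec.yaml (fragment; all names below are hypothetical placeholders)
my-pipeline:
  pipeline:
    # ... earlier steps that add `my-resource` to the datapackage ...
    - run: elasticsearch.dump.to_index
      parameters:
        # Read the connection string from the DPP_ELASTICSEARCH environment
        # variable; this is also the documented default when `engine` is omitted.
        engine: env://DPP_ELASTICSEARCH
        indexes:
          # Key: name of the target index
          my-index:
            # Value: list of resources to dump into that index
            - resource-name: my-resource
              doc-type: my-doc-type
```

With a configuration like this, `DPP_ELASTICSEARCH` would need to be set before running the pipeline, for example to `localhost:9200` or to a full URL such as `https://host:port`.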