Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/slub/esmarc
marc21 -> rdf mapping tool
- Host: GitHub
- URL: https://github.com/slub/esmarc
- Owner: slub
- License: apache-2.0
- Created: 2020-01-28T13:22:24.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-12-14T15:02:15.000Z (about 1 year ago)
- Last Synced: 2024-04-14T22:49:12.618Z (8 months ago)
- Topics: json-ld, json-ld-context, marc21, python3, rdf, rdflib
- Language: Python
- Size: 319 KB
- Stars: 0
- Watchers: 6
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
README
# Installation
run:
```
pip3 install . --user
```

# esmarc.py
esmarc is a Python 3 tool that reads line-delimited MARC21 JSON from an Elasticsearch index, performs a mapping, and writes the output to a directory with one file per mapping type.
dependencies:
- python3-elasticsearch
- efre-lod-elasticsearch-tools

run:
```
$ esmarc.py
-h, --help show this help message and exit
-host HOST hostname or IP-Address of the ElasticSearch-node to use. If None we try to read ldj from stdin.
-port PORT Port of the ElasticSearch-node to use, default is 9200.
-type TYPE ElasticSearch Type to use
-index INDEX ElasticSearch Index to use
-id ID map single document, given by id
-help print this help
-prefix PREFIX Prefix to use for output data
-debug Dump processed Records to stdout (mostly used for debug-purposes)
-server SERVER use http://host:port/index/type/id?pretty syntax. overwrites host/port/index/id/pretty.
-pretty output tabbed json
-w W how many processes to use
-idfile IDFILE path to a file with IDs to process
-query QUERY prefilter the data based on an elasticsearch-query
```
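For example, a whole index can be mapped in one run, or line-delimited JSON can be piped in via stdin; the host, index, type, and output prefix below are placeholders, not values from the upstream documentation:

```
# map all records of a local Elasticsearch index with 4 worker processes
esmarc.py -host localhost -port 9200 -index source -type mrc -prefix output/ -w 4

# or read line-delimited MARC21 JSON from stdin
cat records.ldj | esmarc.py -prefix output/
```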
# entityfacts-bot.py
entityfacts-bot.py is a Python 3 program that enriches ("links") your data with additional identifiers from EntityFacts. The prerequisite is that your records contain a field with a GND identifier.
It connects to an Elasticsearch node and outputs the enriched data, which can be written back to the index using esbulk.
## Usage
```
./entityfacts-bot.py
-h, --help show this help message and exit
-host HOST hostname or IP-Address of the ElasticSearch-node to use, default is localhost.
-port PORT Port of the ElasticSearch-node to use, default is 9200.
-index INDEX ElasticSearch Search Index to use
-type TYPE ElasticSearch Search Index Type to use
-id ID retrieve single document (optional)
-searchserver SEARCHSERVER use http://host:port/index/type/id?pretty. overwrites host/port/index/id/pretty
-stdin get data from stdin
-pipeline output every record (even if not enriched) to put this script into a pipeline
```
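As a sketch, the script can either pull records straight from an index or sit in a pipeline reading from stdin; the index and type names below are placeholder assumptions:

```
# enrich all records of an index and write the result to a file
./entityfacts-bot.py -host localhost -port 9200 -index persons -type schemaorg > persons-enriched.ldj

# or read records from stdin and pass every record through, enriched or not
cat persons.ldj | ./entityfacts-bot.py -stdin -pipeline > persons-enriched.ldj
```

The output is line-delimited JSON, so it can be written back to the index with esbulk as noted above.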
## Requirements
python3-elasticsearch
e.g. (ubuntu)
```
sudo apt-get install python3-elasticsearch
```

# wikidata.py
wikidata.py is a Python 3 program that enriches ("links") your data with the corresponding Wikidata identifier. The prerequisite is that your records contain a field with a GND identifier. Support for other identifiers is planned.
It connects to an Elasticsearch node and outputs the enriched data, which can be written back to the index using esbulk.
## Usage
```
./wikidata.py
-h, --help show this help message and exit
-host HOST hostname or IP-Address of the ElasticSearch-node to use, default is localhost.
-port PORT Port of the ElasticSearch-node to use, default is 9200.
-index INDEX ElasticSearch Search Index to use
-type TYPE ElasticSearch Search Index Type to use
-id ID retrieve single document (optional)
-stdin get data from stdin
-pipeline output every record (even if not enriched) to put this script into a pipeline
-server SERVER use http://host:port/index/type/id?pretty. overwrites host/port/index/id/pretty
```
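A minimal pipeline sketch, chaining wikidata.py after the EntityFacts enrichment (file names are placeholders):

```
# add Wikidata identifiers to records that already carry a GND identifier
cat persons-enriched.ldj | ./wikidata.py -stdin -pipeline > persons-wikidata.ldj
```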