https://github.com/izuna385/jel
Japanese Entity Linker.
- Host: GitHub
- URL: https://github.com/izuna385/jel
- Owner: izuna385
- License: apache-2.0
- Created: 2021-03-08T16:14:17.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-07-25T04:01:46.000Z (over 3 years ago)
- Last Synced: 2024-12-12T10:23:08.057Z (about 1 month ago)
- Topics: allennlp, entity-linking, jel, natural-language-processing, python, pytorch, question-answering, transformers
- Language: Python
- Homepage: https://pypi.org/project/jel/
- Size: 46.9 MB
- Stars: 11
- Watchers: 1
- Forks: 1
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# jel: Japanese Entity Linker
* jel (Japanese Entity Linker) is a bi-encoder-based entity linker for Japanese.

# Usage
* Currently, the `link` and `question` methods are supported.

## `el.link`
* This returns each named entity found in the text, together with its candidate entities drawn from Wikipedia titles.
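Each item in the returned list describes one mention, with the keys `text`, `label`, `span`, and `predicted_normalized_entities` (a list of `[title, score]` pairs), as the example output in this section shows. A minimal sketch of picking the highest-scoring candidate per mention might look like the following. `top_candidates` is a hypothetical helper, not part of jel, and the sample data is copied by hand from that example output so the snippet runs without loading the model.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hand-copied subset of the README's example output (not a live call to jel).
sample_result: List[Dict[str, Any]] = [
    {
        "text": "東京都",
        "label": "GPE",
        "span": [3, 6],
        "predicted_normalized_entities": [
            ["東京都庁", 0.1084],
            ["東京", 0.0633],
            ["国家地方警察東京都本部", 0.0604],
            ["東京都", 0.0598],
        ],
    },
    {
        "text": "アップル",
        "label": "ORG",
        "span": [11, 15],
        "predicted_normalized_entities": [
            ["アップル", 0.2986],
            ["アップル インコーポレイテッド", 0.1792],
        ],
    },
]


def top_candidates(
    mentions: List[Dict[str, Any]],
) -> Dict[str, Optional[Tuple[str, float]]]:
    """Hypothetical helper: map each mention text to its best (title, score)."""
    best: Dict[str, Optional[Tuple[str, float]]] = {}
    for mention in mentions:
        candidates = mention.get("predicted_normalized_entities", [])
        if candidates:
            # Take the maximum by score instead of assuming the list is sorted.
            title, score = max(candidates, key=lambda pair: pair[1])
            best[mention["text"]] = (title, score)
        else:
            best[mention["text"]] = None
    return best


print(top_candidates(sample_result))
```

In the example output the top-ranked title for 東京都 is 東京都庁 rather than 東京都, so downstream code may want to inspect several candidates instead of relying only on the first.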
```python
from jel import EntityLinker
el = EntityLinker()
el.link('今日は東京都のマックにアップルを買いに行き、スティーブジョブスとドナルドに会い、堀田区に引っ越した。')
>> [
{
"text": "東京都",
"label": "GPE",
"span": [
3,
6
],
"predicted_normalized_entities": [
[
"東京都庁",
0.1084
],
[
"東京",
0.0633
],
[
"国家地方警察東京都本部",
0.0604
],
[
"東京都",
0.0598
],
...
]
},
{
"text": "アップル",
"label": "ORG",
"span": [
11,
15
],
"predicted_normalized_entities": [
[
"アップル",
0.2986
],
[
"アップル インコーポレイテッド",
0.1792
],
…
]
}
```

## `el.question`
* This returns candidate entities for any question, drawn from Wikipedia titles.
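The result is a list of `(title, score)` tuples. The following sketch keeps only the candidates whose score reaches a threshold. `filter_candidates` and the threshold value are illustrative assumptions, not part of jel, and the sample list is copied from the example output in this section rather than produced by a live call.

```python
from typing import List, Tuple

# Copied from the README's example output for the question method (not a live call).
sample_candidates: List[Tuple[str, float]] = [
    ("菅内閣", 0.05791765857101555),
    ("枢密院", 0.05592481946602986),
    ("党", 0.05430194711042564),
    ("総選挙", 0.052795400668513175),
]


def filter_candidates(
    candidates: List[Tuple[str, float]], min_score: float
) -> List[Tuple[str, float]]:
    """Hypothetical helper: keep candidates scoring at least min_score, best first."""
    kept = [(title, score) for title, score in candidates if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# The 0.055 threshold is an arbitrary value chosen for illustration.
print(filter_candidates(sample_candidates, min_score=0.055))
```

The scores in the example are close together, so a fixed threshold is only a starting point and would need tuning on real queries.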
```python
>>> el.question('日本の総理大臣は?')
[('菅内閣', 0.05791765857101555), ('枢密院', 0.05592481946602986), ('党', 0.05430194711042564), ('総選挙', 0.052795400668513175)]
```

## Setup
```
$ pip install jel
$ python -m spacy download ja_core_news_md
```

## Run as API
```
$ uvicorn jel.api.server:app --reload --port 8000 --host 0.0.0.0 --log-level trace
```

### Example
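The `/link` and `/question` endpoints used by the curl commands in this section can also be called from Python. The sketch below uses only the standard library and assumes the server is running on `localhost:8000`, as in the uvicorn command above. `build_request` and `post_sentence` are hypothetical helper names, and because the response format is not documented here, the response is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request
from typing import Any

# Assumption: host and port match the uvicorn command shown in this README.
BASE_URL = "http://localhost:8000"


def build_request(endpoint: str, sentence: str) -> urllib.request.Request:
    """Hypothetical helper: build a JSON POST request like the curl examples."""
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_sentence(endpoint: str, sentence: str, timeout: float = 30.0) -> Any:
    """Hypothetical helper: send the request and return the parsed JSON body."""
    request = build_request(endpoint, sentence)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the API server to be running):
#   post_sentence("link", "日本の総理は菅総理だ。")
#   post_sentence("question", "日本で有名な総理は?")
```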
```
# link
$ curl localhost:8000/link -X POST -H "Content-Type: application/json" \
-d '{"sentence": "日本の総理は菅総理だ。"}'

# question
$ curl localhost:8000/question -X POST -H "Content-Type: application/json" \
-d '{"sentence": "日本で有名な総理は?"}'
```

## Test
`$ python -m pytest`

## Notes
* Installing `faiss==1.5.3` from pip causes an error related to `_swigfaiss`.
* To solve this, see [this issue](https://github.com/facebookresearch/faiss/issues/821#issuecomment-573531694).

## LICENSE
Apache 2.0 License.

## CITATION
```
@INPROCEEDINGS{manabe2019chive,
author = {真鍋陽俊, 岡照晃, 海川祥毅, 髙岡一馬, 内田佳孝, 浅原正幸},
title = {複数粒度の分割結果に基づく日本語単語分散表現},
booktitle = "言語処理学会第25回年次大会(NLP2019)",
year = "2019",
pages = "NLP2019-P8-5",
publisher = "言語処理学会",
}
```