https://github.com/dcavar/nltk-json-nlp
NLTK wrapper to JSON-NLP.
- Host: GitHub
- URL: https://github.com/dcavar/nltk-json-nlp
- Owner: dcavar
- License: apache-2.0
- Created: 2019-03-26T21:37:44.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-25T17:42:00.000Z (about 7 years ago)
- Last Synced: 2025-04-09T17:16:10.660Z (about 1 year ago)
- Topics: json, natural-language-processing, nltk
- Language: Python
- Homepage: https://nlp-lab.org/nltkjsonnlp/
- Size: 45.9 KB
- Stars: 5
- Watchers: 4
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# NLTK-JSON-NLP
(C) 2019 by [Damir Cavar], [Oren Baldinger], Maanvitha Gongalla, Anurag Kumar, Murali Kammili, Boli Fang
Brought to you by the [NLP-Lab.org]!
## Introduction
[NLTK] wrapper to [JSON-NLP]. [NLTK] offers a wide variety of capabilities, but this wrapper
limits itself to [WordNet], [VerbNet], and [FrameNet]. Other packages, such as [spaCy] and
[Flair], annotate part-of-speech tags and dependency parses more accurately. See the
Unification section below for instructions on how to unify outputs from multiple packages.
## Microservice
The [JSON-NLP] repository provides a Microservice class, along with a pre-built [Flask] implementation. To run it, execute:

    python nltkjsonnlp/server.py

Since `server.py` extends the [Flask] app, a WSGI file would contain:

    from nltkjsonnlp.server import app as application
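For context, WSGI servers such as mod_wsgi look for a module-level callable named `application` by default, which is why the import renames `app`. The sketch below uses only the standard library to show what such a callable looks like and how it can be invoked without starting a server. Both `application` and `call_app` are hypothetical stand-ins for illustration; they are not the Flask app from `nltkjsonnlp.server`, and the JSON body is made up.

```python
import json
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hypothetical stand-in for a WSGI app; returns a fixed JSON body."""
    payload = {"status": "ok", "path": environ.get("PATH_INFO", "/")}
    body = json.dumps(payload).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI callable once, in-process, and collect its response."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys WSGI requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


if __name__ == "__main__":
    status, headers, body = call_app(application)
    print(status, body.decode("utf-8"))
```

Any object with this call signature can be served by a WSGI server, which is all the one-line WSGI file relies on.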
## Pipeline
[JSON-NLP] provides a simple `Pipeline` interface that we implement as `NltkPipeline`:

    import nltkjsonnlp
    pipeline = nltkjsonnlp.NltkPipeline()
    print(pipeline.process(text='I am a sentence.'))
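Going by the return annotation used in the Unification section, `process` returns an `OrderedDict`, so printing it directly shows a Python repr rather than JSON. A minimal sketch of serializing such a result with the standard `json` module follows. The `to_json` helper is hypothetical, and `result` is a made-up placeholder whose keys are assumptions for illustration, not real `NltkPipeline` output.

```python
import json
from collections import OrderedDict


def to_json(result, indent=2):
    """Serialize a pipeline result (any JSON-compatible mapping) to a string."""
    # ensure_ascii=False keeps non-ASCII text readable in the output
    return json.dumps(result, indent=indent, ensure_ascii=False)


# Placeholder standing in for the return value of pipeline.process(...)
result = OrderedDict([
    ("text", "I am a sentence."),
    ("tokens", ["I", "am", "a", "sentence", "."]),
])

if __name__ == "__main__":
    print(to_json(result))
```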
## Unification
To get the most out of this pipeline, unify its output with that of a more accurate and complete
pipeline such as [spaCy-NLP-Json]:

    from collections import OrderedDict

    # Module paths inferred from the names used below
    import nltkjsonnlp
    import pyjsonnlp.pipeline
    import pyjsonnlp.unification.unifier
    import spacynlpjson


    class UnifiedPipeline(pyjsonnlp.pipeline.Pipeline):
        def __init__(self):
            super().__init__()
            self.spacy = spacynlpjson.SpacyPipeline()
            self.nltk = nltkjsonnlp.NltkPipeline()

        def process(self, text='', coreferences=True, constituents=False, dependencies=True,
                    expressions=True, **kwargs) -> OrderedDict:
            # start with a spaCy parse
            spacy_json = self.spacy.process(text, spacy_model='en_core_web_md', constituents=False,
                                            coreferences=coreferences, dependencies=dependencies,
                                            expressions=False)
            # then get an NLTK parse
            nltk_json = self.nltk.process(text)
            # unify the parses: copy the token annotations from the NLTK output into the spaCy output
            return pyjsonnlp.unification.unifier.add_annotation_to_a_from_b(
                a=spacy_json, b=nltk_json, annotation='tokens')
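Conceptually, unification copies annotations that only one pipeline produces into the other pipeline's output. The sketch below illustrates that idea on plain dictionaries. The `merge_token_annotations` function, the `tokens` layout keyed by token id, and the `synsets` field are all assumptions made for illustration; they do not reflect the actual `pyjsonnlp` implementation or the JSON-NLP schema.

```python
from copy import deepcopy


def merge_token_annotations(a, b, key="tokens"):
    """Return a copy of `a` whose tokens gain any fields found only in `b`.

    Tokens are matched by id; fields already present in `a` are kept as-is.
    """
    merged = deepcopy(a)
    a_tokens = merged.get(key, {})
    for token_id, b_token in b.get(key, {}).items():
        a_token = a_tokens.get(token_id)
        if a_token is None:
            continue  # no matching token in `a`; skip rather than guess
        for field, value in b_token.items():
            # setdefault only writes fields that `a` does not already have
            a_token.setdefault(field, deepcopy(value))
    return merged


# Made-up inputs standing in for the two pipelines' outputs
spacy_like = {"tokens": {1: {"text": "dogs", "upos": "NOUN"},
                         2: {"text": "bark", "upos": "VERB"}}}
nltk_like = {"tokens": {1: {"text": "dogs", "synsets": ["dog.n.01"]},
                        2: {"text": "bark", "upos": "NOUN"}}}

unified = merge_token_annotations(spacy_like, nltk_like)

if __name__ == "__main__":
    for token_id, token in unified["tokens"].items():
        print(token_id, token)
```

In this sketch the first document wins on conflicting fields, mirroring the choice above to start from the spaCy parse and only add what NLTK contributes.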
[Damir Cavar]: http://damir.cavar.me/ "Damir Cavar"
[Oren Baldinger]: https://oren.baldinger.me/ "Oren Baldinger"
[NLP-Lab.org]: http://nlp-lab.org/ "NLP-Lab.org"
[JSON-NLP]: https://github.com/dcavar/JSON-NLP "JSON-NLP"
[Flair]: https://github.com/zalandoresearch/flair "Flair"
[spaCy]: https://spacy.io/ "spaCy"
[NLTK]: http://nltk.org/ "Natural Language Processing Toolkit"
[Polyglot]: https://github.com/aboSamoor/polyglot "Polyglot"
[Xrenner]: https://github.com/amir-zeldes/xrenner "Xrenner"
[CONLL-U]: https://universaldependencies.org/format.html "CONLL-U"
[spaCy-NLP-Json]: https://github.com/dcavar/spaCy-JSON-NLP "spaCy-JSON-NLP"
[WordNet]: https://wordnet.princeton.edu/ "Wordnet"
[FrameNet]: https://framenet.icsi.berkeley.edu/fndrupal/ "FrameNet"
[VerbNet]: https://verbs.colorado.edu/~mpalmer/projects/verbnet.html "VerbNet"