Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
A Python Wrapper for Google SyntaxNet
- Host: GitHub
- URL: https://github.com/livingbio/syntaxnet_wrapper
- Owner: livingbio
- License: mit
- Created: 2016-09-05T04:32:18.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2023-11-25T05:32:38.000Z (12 months ago)
- Last Synced: 2024-05-15T20:14:07.030Z (6 months ago)
- Topics: google-syntaxnet, nlp, python, python-wrapper, syntaxnet
- Language: Python
- Size: 72.3 KB
- Stars: 34
- Watchers: 12
- Forks: 10
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
# A Python Wrapper for Google SyntaxNet
## Installation
### Prerequisites
#### Install OpenJDK 8.
```shell-script
add-apt-repository -y ppa:openjdk-r/ppa
apt-get -y update
apt-get -y install openjdk-8-jdk
```

#### Install `bazel` and include `bazel` in `$PATH`.
**Note:** Only bazel 0.4.3 is known to work; bazel 0.4.4 may cause errors.
```shell-script
wget https://github.com/bazelbuild/bazel/releases/download/0.4.3/bazel-0.4.3-installer-linux-x86_64.sh
chmod +x bazel-0.4.3-installer-linux-x86_64.sh
./bazel-0.4.3-installer-linux-x86_64.sh --user
rm bazel-0.4.3-installer-linux-x86_64.sh
export PATH="$PATH:$HOME/bin"
```

#### Install system package dependencies.
```shell-script
apt-get -y install swig unzip
```

#### Install Python packages
**Note:** The current version of SyntaxNet must be used with TensorFlow r1.0.
```shell-script
pip install tensorflow protobuf asciitree mock
```

#### Start the installation
```shell-script
pip install git+ssh://[email protected]/livingbio/syntaxnet_wrapper.git#egg=syntaxnet_wrapper
```

#### If installation fails...
Execute [test.sh](https://github.com/livingbio/syntaxnet_wrapper/blob/master/syntaxnet_wrapper/test.sh); you should see the following output:
```
1 Bob _ PROPN NNP Number=Sing|fPOS=PROPN++NNP 2 nsubj _ _
2 brought _ VERB VBD Mood=Ind|Tense=Past|VerbForm=Fin|fPOS=VERB++VBD 0 ROOT _ _
3 the _ DET DT Definite=Def|PronType=Art|fPOS=DET++DT 4 det _ _
4 pizza _ NOUN NN Number=Sing|fPOS=NOUN++NN 2 dobj _ _
5 to _ ADP IN fPOS=ADP++IN 6 case _ _
6 Alice. _ PROPN NNP Number=Sing|fPOS=PROPN++NNP 2 nmod _ _
1 球 _ PROPN NNP fPOS=PROPN++NNP 4 nsubj _ _
2 從 _ ADP IN fPOS=ADP++IN 3 case _ _
3 天上 _ NOUN NN fPOS=NOUN++NN 4 nmod _ _
4 掉 _ VERB VV fPOS=VERB++VV 0 ROOT _ _
5 下來 _ VERB VV fPOS=VERB++VV 4 mark _ _
球 從天 上 掉 下 來
```

If the output is correct, the problem lies in the wrapper itself. If the output is wrong, the SyntaxNet compilation may have failed.
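The tab-separated lines above follow the CoNLL dependency format (token ID, form, lemma, coarse POS, fine POS, features, head index, relation). As a rough illustration only (this helper is not part of the wrapper's API, and the column layout is assumed from the sample output), a raw result can be split into structured rows like this:

```python
# Sketch: parse CoNLL-style output lines (as shown above) into dicts.
# The field order is assumed from the sample output; verify it against
# the actual wrapper output before relying on it.
FIELDS = ("id", "form", "lemma", "upos", "xpos", "feats", "head", "deprel")

def parse_conll(raw):
    rows = []
    for line in raw.strip().splitlines():
        if not line.strip():
            continue  # skip blank separator lines between sentences
        cols = line.split()
        rows.append(dict(zip(FIELDS, cols[:len(FIELDS)])))
    return rows

sample = "1 Bob _ PROPN NNP Number=Sing|fPOS=PROPN++NNP 2 nsubj"
print(parse_conll(sample)[0]["upos"])  # PROPN
```

The head field is a 1-based index into the sentence (0 marks the root), so the parsed rows can be walked to reconstruct the dependency tree.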
## Usage
```python
from syntaxnet_wrapper import tagger, parser

print tagger['en'].query('this is a good day', returnRaw=True)
# 1 this _ DET DT _ 0 _ _ _
# 2 is _ VERB VBZ _ 0 _ _ _
# 3 a _ DET DT _ 0 _ _ _
# 4 good _ ADJ JJ _ 0 _ _ _
# 5 day _ NOUN NN _ 0 _ _ _
tagger['en'].query('this is a good day')  # by default, returns split text

print parser['en'].query('Alice drove down the street in her car', returnRaw=True)
# 1 Alice _ NOUN NNP _ 2 nsubj _ _
# 2 drove _ VERB VBD _ 0 ROOT _ _
# 3 down _ ADP IN _ 2 prep _ _
# 4 the _ DET DT _ 5 det _ _
# 5 street _ NOUN NN _ 3 pobj _ _
# 6 in _ ADP IN _ 2 prep _ _
# 7 her _ PRON PRP$ _ 8 poss _ _
# 8 car _ NOUN NN _ 6 pobj _ _

# use the Chinese model
print tagger['zh'].query(u'今天 天氣 很 好', returnRaw=True)
# 1 今天 _ NOUN NN fPOS=NOUN++NN 0 _ _ _
# 2 天氣 _ NOUN NN fPOS=NOUN++NN 0 _ _ _
# 3 很 _ ADV RB fPOS=ADV++RB 0 _ _ _
# 4 好 _ ADJ JJ fPOS=ADJ++JJ 0 _ _ _

print parser['zh'].query(u'今天 天氣 很 好', returnRaw=True)
# 1 今天 _ NOUN NN fPOS=NOUN++NN 4 nmod:tmod _ _
# 2 天氣 _ NOUN NN fPOS=NOUN++NN 4 nsubj _ _
# 3 很 _ ADV RB fPOS=ADV++RB 4 advmod _ _
# 4 好 _ ADJ JJ fPOS=ADJ++JJ 0 ROOT _ _
```

### Language Selection
The default model is `'English-Parsey'`,
[announced by Google](https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html)
in May 2016.
The other models, including `'English'`, are trained on [Universal Dependencies](http://universaldependencies.org/) treebanks and were
[announced by Google](https://research.googleblog.com/2016/08/meet-parseys-cousins-syntax-for-40.html)
in August 2016.

```python
from syntaxnet_wrapper import language_code_to_model_name
language_code_to_model_name
# {'ar': 'Arabic',
# 'bg': 'Bulgarian',
# 'ca': 'Catalan',
# 'cs': 'Czech',
# 'da': 'Danish',
# 'de': 'German',
# 'el': 'Greek',
# 'en': 'English-Parsey',
# 'en-uni': 'English',
# 'es': 'Spanish',
# 'et': 'Estonian',
# 'eu': 'Basque',
# 'fa': 'Persian',
# 'fi': 'Finnish',
# 'fr': 'French',
# 'ga': 'Irish',
# 'gl': 'Galician',
# 'hi': 'Hindi',
# 'hr': 'Croatian',
# 'hu': 'Hungarian',
# 'id': 'Indonesian',
# 'it': 'Italian',
# 'iw': 'Hebrew',
# 'kk': 'Kazakh',
# 'la': 'Latin',
# 'lv': 'Latvian',
# 'nl': 'Dutch',
# 'no': 'Norwegian',
# 'pl': 'Polish',
# 'pt': 'Portuguese',
# 'ro': 'Romanian',
# 'ru': 'Russian',
# 'sl': 'Slovenian',
# 'sv': 'Swedish',
# 'ta': 'Tamil',
# 'tr': 'Turkish',
# 'zh': 'Chinese',
# 'zh-cn': 'Chinese',
# 'zh-tw': 'Chinese'}
```
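When the language of the input is not known in advance, it can help to resolve codes defensively. The helper below is hypothetical (it is not part of `syntaxnet_wrapper`), and it reproduces only a small subset of the mapping above for illustration:

```python
# Hypothetical helper (not in syntaxnet_wrapper): resolve a language
# code against the mapping above, falling back to the default model
# for unknown codes. Only a subset of the table is reproduced here.
language_code_to_model_name = {
    'en': 'English-Parsey',
    'en-uni': 'English',
    'de': 'German',
    'zh': 'Chinese',
    'zh-tw': 'Chinese',
}

def resolve_model(code, default='en'):
    # Normalize e.g. 'zh_TW' -> 'zh-tw', then fall back to the default.
    code = code.lower().replace('_', '-')
    if code in language_code_to_model_name:
        return language_code_to_model_name[code]
    return language_code_to_model_name[default]

print(resolve_model('de'))     # German
print(resolve_model('xx'))     # English-Parsey
```

The resolved name could then be used to pick the matching `tagger[...]`/`parser[...]` entry, with the default English-Parsey model as a safe fallback.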