https://github.com/stanfordnlp/nn-depparser
A re-implementation of nndep using PyTorch.
- Host: GitHub
- URL: https://github.com/stanfordnlp/nn-depparser
- Owner: stanfordnlp
- Created: 2016-04-27T05:43:33.000Z (about 9 years ago)
- Default Branch: main
- Last Pushed: 2024-10-17T20:24:36.000Z (8 months ago)
- Last Synced: 2024-12-30T20:16:04.298Z (6 months ago)
- Language: Python
- Size: 417 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# nn-depparser
A re-implementation of `nndep` using PyTorch.
Currently used for training CoreNLP dependency parsing models.
Requires Stanza for some features (auto-tagging with CoreNLP via server).
Originally by Danqi Chen. Leave a GitHub Issue if you have any questions!
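A typical environment setup might look like the following sketch (paths are placeholders, not from this repo; adjust them to wherever CoreNLP is unpacked):

```shell
# Hypothetical setup: put CoreNLP and the language-specific models jar
# on the CLASSPATH so the tagging server can find them.
export CLASSPATH="/path/to/stanford-corenlp/*:$CLASSPATH"

# Install (or upgrade) Stanza, which drives CoreNLP through its server interface.
pip install --upgrade stanza
```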
## Example Usage
Train a model:
```bash
python train.py -l universal -d /path/to/data --train_file it-train.conllu --dev_file it-dev.conllu --test_file it-test.conllu --embedding_file /path/to/it-embeddings.txt --embedding_size 100 --random_seed 21 --learning_rate .005 --l2_reg .01 --epsilon .001 --optimizer adamw --save_path /path/to/experiment-dir --job_id experiment-name --corenlp_tags --corenlp_tag_lang italian --n_epoches 2000
```

Note that the above command will automatically tag the input data with the CoreNLP tagger.
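The `--train_file`, `--dev_file`, and `--test_file` inputs are CoNLL-U files, and the tagging step rewrites their POS columns. For reference, each CoNLL-U token line carries ten tab-separated fields; a minimal reader, with a hypothetical Italian example line (not taken from the repo's data):

```python
# The ten standard CoNLL-U fields, in column order.
FIELDS = ["ID", "FORM", "LEMMA", "UPOS", "XPOS", "FEATS",
          "HEAD", "DEPREL", "DEPS", "MISC"]

def parse_token_line(line):
    """Split one tab-separated CoNLL-U token line into a field dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError("CoNLL-U token lines have exactly 10 fields")
    return dict(zip(FIELDS, values))

token = parse_token_line("1\tIl\til\tDET\tRD\tDefinite=Def|Gender=Masc\t2\tdet\t_\t_")
print(token["FORM"], token["UPOS"], token["HEAD"])  # Il DET 2
```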
Thus you need to have CoreNLP and the Italian models (for this example) in your CLASSPATH,
and you need a recent version of Stanza installed.

Why is this done? When CoreNLP runs a dependency parser, it relies on part-of-speech tags,
so for optimal performance the training and development data need to carry the same
predicted tags that CoreNLP will use at parse time.

Convert a trained model to CoreNLP format:
```bash
python gen_model.py -o /path/to/italian-corenlp-parser.txt /path/to/experiment-dir/experiment-name
```
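One way to sanity-check the converted model is to load it into the CoreNLP pipeline via the standard `depparse.model` property. This invocation is a sketch, not documented by this repo; it assumes CoreNLP and the Italian models jar are on your CLASSPATH:

```shell
# Hypothetical check of the converted model (paths are placeholders).
# -depparse.model points the pipeline at the freshly converted parser file.
java -mx4g edu.stanford.nlp.pipeline.StanfordCoreNLP \
  -props StanfordCoreNLP-italian.properties \
  -depparse.model /path/to/italian-corenlp-parser.txt \
  -file input.txt -outputFormat conllu
```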