https://github.com/tikquuss/nlp_tools
Natural language processing tools (tokenizer, ...) and evaluation metrics (BLEU, ...) for morphologically complex languages such as those of Africa.
- Host: GitHub
- URL: https://github.com/tikquuss/nlp_tools
- Owner: Tikquuss
- License: apache-2.0
- Created: 2020-09-29T16:30:57.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-09-29T18:48:04.000Z (about 5 years ago)
- Last Synced: 2025-01-18T13:41:14.744Z (9 months ago)
- Topics: african-languages, evaluation-metrics, nlp, tokenizer
- Language: Perl
- Homepage:
- Size: 11.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
```bash
pip install -r requirements.txt
```
# BLEU evaluation
```bash
python bleu.py --ref my/ref.txt --hyp my/hyp.txt --max_order 4 --smooth False
```
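```python
import os
import subprocess

ref = "my/ref.txt"
hyp = "my/hyp.txt"

# multi-bleu.perl takes the reference file as an argument and reads the hypotheses from stdin.
command = "multi-bleu.perl %s < %s"
if os.name == "nt":
    # On Windows, invoke the Perl interpreter explicitly.
    command = "perl %s" % command

p = subprocess.Popen(command % (ref, hyp), stdout=subprocess.PIPE, shell=True)
result = p.communicate()[0].decode("utf-8")

# The score is read from between the "BLEU = " prefix (7 characters) and the first comma.
if result.startswith('BLEU'):
    bleu = float(result[7:result.index(',')])
else:
    print('Impossible to parse BLEU score! "%s"' % result)
    bleu = -1
```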
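The fixed slice `result[7:...]` above relies on the output starting with exactly `BLEU = `. A regular expression can tolerate small whitespace differences. The following is a minimal sketch, assuming the script prints a line beginning with `BLEU = <number>,`; the helper name `parse_bleu` is hypothetical and not part of this repository.

```python
import re

# Hypothetical helper, not part of nlp_tools.
# Assumption: the output starts with "BLEU = <number>", e.g. "BLEU = 23.45, ...".
_BLEU_RE = re.compile(r"^BLEU\s*=\s*([0-9]+(?:\.[0-9]+)?)")


def parse_bleu(output: str) -> float:
    """Return the BLEU score found at the start of `output`, or -1.0 if none is found."""
    match = _BLEU_RE.match(output.strip())
    return float(match.group(1)) if match else -1.0
```

With the variables from the snippet above, this could be used as `bleu = parse_bleu(result)`, keeping the same `-1` convention for unparseable output.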