https://github.com/ggordonhall/measurement_tagger
Spacy Measurement Tagger
dependency-parser measurement-tagger nlp python spacy tagger
Spacy Measurement Tagger
- Host: GitHub
- URL: https://github.com/ggordonhall/measurement_tagger
- Owner: ggordonhall
- Created: 2018-10-12T13:01:13.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-12-13T15:40:16.000Z (about 6 years ago)
- Last Synced: 2024-11-07T01:39:04.530Z (2 months ago)
- Topics: dependency-parser, measurement-tagger, nlp, python, spacy, tagger
- Language: Python
- Homepage:
- Size: 84 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Measurement Tagger
[![Build Status](https://travis-ci.org/ggordonhall/measurement_tagger.png)](https://travis-ci.org/ggordonhall/measurement_tagger)
[![codecov](https://codecov.io/gh/ggordonhall/measurement_tagger/branch/master/graph/badge.svg)](https://codecov.io/gh/ggordonhall/measurement_tagger)

## A dependency parse based measurement tagger.

Text to be tagged should be stored in the `text/` directory. The file to tag is specified with `-t`.
Select the type of measurement to tag with the mode flag (`-m`):

| Mode | Measurement |
| --------|:-------------:|
| `d` | Distance |
| `t` | Time |
| `m` | Mass |
| `e` | Energy |
| `v` | Volume |

By default, the tagger tags measurements and then converts them to their standard unit. Unconverted measurements can be returned by running with the `--return_unconverted` flag. The maximum n-gram length to search for measurement units, e.g. `nautical miles`, can be set with the `--max_gram` flag.

Run `pipenv install && python -m spacy download en && python -m nltk.downloader wordnet` to set up, then test by running `main.py -m d -t wiki.txt`.

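
To illustrate what an n-gram limit such as `--max_gram` controls, here is a minimal sketch of longest-first unit matching followed by conversion to a standard unit. It is not the repository's implementation. It matches on plain token adjacency rather than a dependency parse, and the names `TO_METRES` and `tag_distances`, along with the conversion table itself, are hypothetical and chosen only for this example.

```python
import re

# Hypothetical conversion table: unit n-gram -> factor to metres.
# Illustrative only; not the table used by the repository.
TO_METRES = {
    ("metres",): 1.0,
    ("kilometres",): 1000.0,
    ("miles",): 1609.344,
    ("nautical", "miles"): 1852.0,
}

# A token is either a (possibly decimal) number or a lowercase word.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[a-z]+")


def tag_distances(text, max_gram=2):
    """Return (quantity, unit, metres) triples for distances in text."""
    tokens = TOKEN.findall(text.lower())
    results = []
    i = 0
    while i < len(tokens):
        if tokens[i][0].isdigit():
            quantity = float(tokens[i])
            # Try the longest n-gram first so that "nautical miles"
            # is preferred over the shorter unit "miles".
            for n in range(max_gram, 0, -1):
                gram = tuple(tokens[i + 1 : i + 1 + n])
                if len(gram) == n and gram in TO_METRES:
                    metres = quantity * TO_METRES[gram]
                    results.append((quantity, " ".join(gram), metres))
                    i += n  # skip the tokens consumed by the unit
                    break
        i += 1
    return results
```

With `max_gram=2`, the phrase `3 nautical miles` is tagged as a single two-word unit. With `max_gram=1`, the sketch only considers the single word after the number, so that measurement is missed while `2 miles` is still found.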
## TO DO
- [ ] Fix `--parallel` flag
- [x] Improve handling of n-gram measurement units