https://github.com/tokestermw/spacy_grammar
:black_nib: Language Tool style grammar handling with spaCy 2.0
https://github.com/tokestermw/spacy_grammar
grammar nlp spacy spacy-nlp
Last synced: 23 days ago
JSON representation
:black_nib: Language Tool style grammar handling with spaCy 2.0
- Host: GitHub
- URL: https://github.com/tokestermw/spacy_grammar
- Owner: tokestermw
- License: gpl-3.0
- Created: 2017-10-21T21:39:43.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-07-27T19:00:37.000Z (almost 7 years ago)
- Last Synced: 2025-03-27T11:43:42.558Z (about 1 month ago)
- Topics: grammar, nlp, spacy, spacy-nlp
- Language: Python
- Size: 33.2 KB
- Stars: 42
- Watchers: 4
- Forks: 3
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
## spacy_grammar: rule-based grammar detection for spaCy
This packages uses the [spaCy 2.0 alpha](https://alpha.spacy.io/usage/v2)
which provide support for adding custom attributes to `Doc`, `Span`, and
`Token` objects. It also leverages the [Matcher API](https://spacy.io/docs/usage/rule-based-matching)
in spaCy to quickly match on spaCy tokens not dissimilar to regex. It
reads a `grammar.yml` file to load up custom patterns and returns the
results inside `Doc`, `Span`, and `Token`.It is extensible through adding rules to `grammar.yml` (though currently
only the simple string matching is implemented).```
doc = nlp('I can haz cheeseburger.')
doc._.has_grammar_error # True
```A lot of thanks to [spacymoji](https://github.com/ines/spacymoji) and
[languagetool](https://www.languagetool.org) for inspiration.## Install
This package uses Python 3.6.
```
python3.6 -m venv .
source bin/activate
pip install -r requirements.txt
```You will also need to install a spaCy model by:
```
python -m spacy download en_core_web_sm
```## Usage
From the root directory, you can check to see if the package is working.
```
python -m spacy_grammar.grammar
```This code checks to see if someone wrote `as follow` instead of `as follows`.
```
import spacy
nlp = spacy.load('en_core_web_sm')
grammar = Grammar(nlp)
nlp.add_pipe(grammar)
doc = nlp('We can elaborate this distinction as follow.')
print([i._.g_as_follow_as_follows for i in doc])
# [False, False, False, False, False, True, True, False]
```## Adding rules
In `grammar.yml`, you can add rules following this template:
```
CATEGORY_NAME:
RULE_NAME:
description:
patterns:
- PATTERN_1
- PATTERN_2
...
corrections:
- CORRECTION_1
- CORRECTION_2
...
examples:
- EXAMPLE_1
- EXAMPLE_2
...```
A pattern could match on a plain string or a list of token attributes.
```
# match on string
- as follow
# match on list of token attributes
-
- LOWER: follow
- LOWER: follow
```