Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/miweru/vrt_spacy
- Host: GitHub
- URL: https://github.com/miweru/vrt_spacy
- Owner: miweru
- License: gpl-3.0
- Created: 2019-12-19T20:36:14.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2019-12-19T23:41:50.000Z (almost 5 years ago)
- Last Synced: 2024-10-08T01:31:12.170Z (29 days ago)
- Topics: corpora, linguistic-corpora, linguistics, nlp, spacy, vrt, wrapper
- Language: Python
- Size: 20.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# vrt_spacy
Python class for creating vrt-annotated corpora.
Still in a very early testing stage.

Install by typing:
```bash
pip install vrt_spacy
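```

Usage Example:
```python
from vrt import Corpus, S, Text
from vrt_spacy import Annotate

# Corpus, Text and S are context managers that mirror the XML hierarchy.
with Corpus("~", "meinkorpus", 4, "text_name") as c:
    # Annotate raw text automatically with a spaCy model (here a German one).
    annotate = Annotate(c, spacymodel="de_core_news_md")
    annotate("Das hier ist mein Text", text_name="Text1")

    # Alternatively, write a text by hand, one sentence and token at a time.
    with Text(c, text_name="Text2") as t:
        with S(c) as s:
            s.writep("Test", "TAG", "TAG", "Lemma")
```

Features: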
- Represents corpus, text, P (positional) and S (structural) attributes
- Integrates spaCy to generate a VRT representation of texts automatically
- Uses context managers to represent the XML hierarchy
- Reduces text to utf8mb3 (UTF-8 restricted to characters of at most three bytes) and checks formatting compatibility
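
The last two features can be illustrated with a minimal, standalone sketch. It is not the code of `vrt` or `vrt_spacy`: the helpers `to_utf8mb3`, `tag`, `write_token` and `demo` are hypothetical names invented for this example, and the sketch assumes that "reducing to utf8mb3" means dropping characters outside the Basic Multilingual Plane. It shows how nested context managers can emit opening and closing structural tags around tab-separated token lines, the usual layout of a VRT file.

```python
import io
from contextlib import contextmanager
from xml.sax.saxutils import quoteattr


def to_utf8mb3(text: str, replacement: str = "") -> str:
    """Drop (or replace) characters that need four bytes in UTF-8.

    Assumption: code points above U+FFFF are simply removed; the real
    package may handle them differently.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


@contextmanager
def tag(out, name: str, **attrs: str):
    """Write <name ...> on entry and </name> on exit (hypothetical helper)."""
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())
    out.write(f"<{name}{attr_text}>\n")
    try:
        yield out
    finally:
        # The closing tag is written even if the body raises.
        out.write(f"</{name}>\n")


def write_token(out, *columns: str) -> None:
    """Write one token line: positional attributes separated by tabs."""
    out.write("\t".join(to_utf8mb3(col) for col in columns) + "\n")


def demo() -> str:
    """Build a tiny VRT-like document in memory and return it."""
    out = io.StringIO()
    with tag(out, "text", text_name="Text2"):
        with tag(out, "s"):
            write_token(out, "Test", "TAG", "TAG", "Lemma")
            # The emoji lies outside the BMP and is dropped.
            write_token(out, "ok\U0001F600", "TAG", "TAG", "ok")
    return out.getvalue()


if __name__ == "__main__":
    print(demo())
```

Using a context manager per structural level keeps opening and closing tags paired automatically, which is the idea the feature list refers to.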