https://github.com/miweru/vrt_generator

Python class for creating vrt-annotated corpora

corpora linguistic-corpora linguistics vrt wrapper

# vrt_generator
Python class for creating vrt-annotated corpora.
Still in a very early testing stage.

Install by typing:
```bash
pip install vrt_generator
```

Usage Example:
```python
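from vrt import Corpus, S, Text

# Nested context managers mirror the XML hierarchy: corpus > text > sentence.
with Corpus("~", "meinkorpus", 4, "text_name") as c:
    with Text(c, text_name="Text2") as t:
        with S(c) as s:
            # Write one token line with its positional attributes.
            s.writep("Test", "TAG", "TAG", "Lemma")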
```
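
For orientation, here is a rough standalone sketch of how nested context managers can map onto a vrt file: one opening tag on entry, one closing tag on exit, and one tab-separated line per token in between. It does not use vrt_generator and is not its implementation; the helpers `element` and `write_token` are hypothetical names made up for this illustration.

```python
import io
from contextlib import contextmanager
from xml.sax.saxutils import quoteattr


@contextmanager
def element(out, tag, **attrs):
    """Hypothetical helper: write <tag ...> on entry and </tag> on exit."""
    attr_text = "".join(f" {k}={quoteattr(str(v))}" for k, v in attrs.items())
    out.write(f"<{tag}{attr_text}>\n")
    try:
        yield out
    finally:
        # The closing tag is written even if the body raises.
        out.write(f"</{tag}>\n")


def write_token(out, *p_attrs):
    """Hypothetical helper: one token per line, attributes separated by tabs."""
    out.write("\t".join(p_attrs) + "\n")


buf = io.StringIO()
with element(buf, "text", text_name="Text2"):
    with element(buf, "s"):
        write_token(buf, "Test", "TAG", "TAG", "Lemma")

vrt_output = buf.getvalue()
print(vrt_output)
```

The printed result is a `<text>` element containing an `<s>` element with a single token line, which is the general shape of a vrt file.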

Features:
-
- Represents corpus, text, positional (P) and structural (S) attributes
- Integrates spaCy to generate a vrt representation of texts automatically
- Uses context managers to represent the XML hierarchy
- Reduces text to utf8mb3 (characters that fit in at most three UTF-8 bytes) and checks formatting compatibility
- If you want to add texts that are automatically POS-tagged with spaCy, have a look at vrt_spacy
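
To make the utf8mb3 point more concrete, the sketch below shows one possible way to do such a reduction and a simple format check. It is an assumption about what these steps might look like, not the library's actual code: `reduce_to_utf8mb3` and `is_valid_token_line` are hypothetical names, and the specific checks vrt_generator performs are not documented here.

```python
def reduce_to_utf8mb3(text: str, replacement: str = "") -> str:
    """Hypothetical helper: keep only code points up to U+FFFF.

    Code points above U+FFFF need four bytes in UTF-8 and therefore
    do not fit into utf8mb3; they are dropped or replaced.
    """
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


def is_valid_token_line(line: str, n_attrs: int) -> bool:
    """Hypothetical check: exactly n_attrs non-empty, tab-separated fields."""
    fields = line.split("\t")
    return len(fields) == n_attrs and all(field != "" for field in fields)


sample = "ok \U0001F600!"  # contains one four-byte character (an emoji)
cleaned = reduce_to_utf8mb3(sample)
print(cleaned)
print(is_valid_token_line("Test\tTAG\tTAG\tLemma", 4))
```

Dropping the character is only one option; passing a `replacement` string keeps the token length visible at the cost of introducing a placeholder.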