https://github.com/miweru/vrt_generator
Python class for creating vrt-annotated corpora
corpora linguistic-corpora linguistics vrt wrapper
- Host: GitHub
- URL: https://github.com/miweru/vrt_generator
- Owner: miweru
- Created: 2019-07-26T09:24:32.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-12-19T22:58:33.000Z (almost 6 years ago)
- Last Synced: 2025-02-22T12:35:43.711Z (8 months ago)
- Topics: corpora, linguistic-corpora, linguistics, vrt, wrapper
- Language: Python
- Size: 13.7 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# vrt_generator
Python class for creating vrt-annotated corpora.
Still in a very early testing stage.

Install by typing:
```bash
pip install vrt_generator
```

Usage Example:
```python
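from vrt import Corpus, S, Text

# Nested context managers mirror the corpus > text > sentence hierarchy.
with Corpus("~", "meinkorpus", 4, "text_name") as c:
    with Text(c, text_name="Text2") as t:
        with S(c) as s:
            # Write one token together with its annotation columns.
            s.writep("Test", "TAG", "TAG", "Lemma")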
```

Features:
- Represents corpus, text, positional (P) and structural (S) attributes
- Integrates spaCy for automatic generation of a VRT representation of texts
- Uses context managers to represent the XML hierarchy
- Reduces text to utf8mb3 (characters that fit in at most three UTF-8 bytes) and checks formatting compatibility
- If you want to add texts that are automatically POS-tagged with spaCy, you might look at vrt_spacy
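
To make the context-manager and utf8mb3 features above more concrete, here is a minimal, standalone sketch of the general idea. It does not use vrt_generator and is not its implementation: the helper names `to_utf8mb3`, `element` and `write_token` are hypothetical, and dropping characters outside the Basic Multilingual Plane is an assumption (the library may handle them differently).

```python
import io
from contextlib import contextmanager
from xml.sax.saxutils import quoteattr


def to_utf8mb3(text: str) -> str:
    """Drop characters that need four UTF-8 bytes (code points above U+FFFF).

    Assumption: such characters are simply removed.
    """
    return "".join(ch for ch in text if ord(ch) <= 0xFFFF)


@contextmanager
def element(out, tag, **attrs):
    """Write <tag ...> on entry and </tag> on exit, one tag per line."""
    rendered = "".join(
        f" {key}={quoteattr(to_utf8mb3(str(value)))}" for key, value in attrs.items()
    )
    out.write(f"<{tag}{rendered}>\n")
    try:
        yield out
    finally:
        # The closing tag is written even if the body raises.
        out.write(f"</{tag}>\n")


def write_token(out, *columns):
    """Write one token line: annotation columns separated by tabs."""
    out.write("\t".join(to_utf8mb3(col) for col in columns) + "\n")


buffer = io.StringIO()
with element(buffer, "text", name="Text2"):
    with element(buffer, "s"):
        write_token(buffer, "Test", "TAG", "TAG", "Lemma")
        # The emoji lies outside the BMP and is removed by to_utf8mb3.
        write_token(buffer, "ok\U0001F600", "TAG", "TAG", "ok")

vrt_output = buffer.getvalue()
print(vrt_output)
```

Nesting the `with` blocks keeps opening and closing tags balanced automatically, which is the main appeal of using context managers for this kind of output.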