https://github.com/jean-baptiste-camps/wauchier_stylo
Stylometric analysis of Old French légendiers
- Host: GitHub
- URL: https://github.com/jean-baptiste-camps/wauchier_stylo
- Owner: Jean-Baptiste-Camps
- Created: 2019-10-11T12:50:21.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-03-08T09:13:14.000Z (almost 4 years ago)
- Last Synced: 2024-10-12T21:44:05.244Z (2 months ago)
- Language: HTML
- Homepage:
- Size: 85.8 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Wauchier_stylo
Stylometric analysis of Old French _légendiers_.
N.B.: the `data` folder contains data generated with `stylo` from the raw text data.
To generate new data, run the (commented-out) `stylo` commands,
then use `write.csv` to save the results and `read.csv` to load them.

**/!\ WORKFLOW IS POSSIBLY BUGGED. THERE IS A FIX IN THE `newComputs` branch**
## Generate analysis data
To generate the analysis, you need to clone: https://github.com/PonteIneptique/dh-meier-data
### Generate lemma and pos data
Use `generate_lemma_pos.py`:
```shell
python3 generate_lemma_pos.py [PrefixNameOfTheOutput] [Path to the directory containing texts]
```

For example, `python3 generate_lemma_pos.py transkribus ../dh-meier-data/output/transkribus/lemmatized/ocr/` will produce
`./data/transkribus_lemmas.csv` and `./data/transkribus_pos3.csv`.

### Generate the Character 3-Grams
This is done in the first few cells of the Rmd files.
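
For readers unfamiliar with the feature, here is a minimal Python sketch of what counting character 3-grams involves. It is an illustration only, not the code used in this repository, which relies on `stylo` in the Rmd files. The function names, the whitespace normalization and the sample string are assumptions made for the example, and `stylo` may tokenize text differently.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in `text` (hypothetical helper)."""
    # Assumption: collapse runs of whitespace so that line breaks
    # do not produce spurious n-grams.
    normalized = " ".join(text.split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def relative_frequencies(counts: Counter) -> dict:
    """Turn raw counts into relative frequencies (empty dict if no n-grams)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


if __name__ == "__main__":
    # Made-up sample string, not taken from the corpus.
    sample = "li rois li dist"
    for gram, count in char_ngrams(sample).most_common(3):
        print(repr(gram), count)
```

Relative frequencies are used here because texts of different lengths are otherwise hard to compare; whether the Rmd files apply the same normalization should be checked there.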
### Generate the comparison between Gold and output data
ToDo