https://github.com/frankier/fiwn
Temporary fixes to FinnWordNet 2.0
- Host: GitHub
- URL: https://github.com/frankier/fiwn
- Owner: frankier
- Created: 2018-06-14T15:33:15.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T02:22:51.000Z (almost 2 years ago)
- Last Synced: 2024-04-17T12:20:25.623Z (7 months ago)
- Language: C
- Homepage:
- Size: 47.2 MB
- Stars: 0
- Watchers: 4
- Forks: 1
- Open Issues: 4
Metadata Files:
- Readme: README.md
# FinnWordNet
This repository contains some changes/fixes to FinnWordNet.
The `data` directory contains the FiWN data files, and the `WNgrind-3.0-FiWN`
directory contains the FiWN version of WNgrind.

## Mapping/adjusting script
The script `adjust-fiwn-offsets.py` can either create a TSV that maps the
false, English-based synset ids to the true Finnish synset ids, or apply that
mapping to the TSVs in `data`. It needs
[pipenv](https://github.com/pypa/pipenv).

Assuming you put the original data in `data` rather than the already mapped
data included here, you can make a map TSV like so:

    $ pipenv run python adjust-fiwn-offsets.py dump data synset_map.tsv

You can also rewrite the original data with the new offsets. The following is
the command that was run to bring the data in `data` to its current state:

    $ pipenv run python adjust-fiwn-offsets.py fix data

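
To make the idea of applying an id mapping concrete, here is a minimal Python sketch. It is not taken from `adjust-fiwn-offsets.py`. It assumes, without confirmation from the script, that the map TSV has two tab-separated columns (old id, then new id) and no header row. The helper names and the sample ids are hypothetical.

```python
import csv
import io


def load_synset_map(lines):
    """Read a two-column TSV (old id, new id) into a dict.

    The two-column layout is an assumption, not the documented format.
    """
    mapping = {}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        mapping[row[0]] = row[1]
    return mapping


def remap(synset_id, mapping):
    """Return the mapped id, or the input unchanged if it is not in the map."""
    return mapping.get(synset_id, synset_id)


# Made-up sample rows, only to show the shape of the data.
sample = io.StringIO("00001740-n\t00001741-n\n00002137-n\t00002200-n\n")
synset_map = load_synset_map(sample)
print(remap("00001740-n", synset_map))
print(remap("99999999-n", synset_map))
```

Ids missing from the map are passed through unchanged in this sketch. Whether the real script does the same is not stated here.
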
## Fake word count data script
You can create count data based on the counts in the English data like so:

    $ pipenv run python mk-cntlist.py > data/dict/cntlist.rev
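
To check the generated file, a small reader can help. The sketch below assumes the output follows the standard WordNet `cntlist.rev` layout, where each line holds `sense_key sense_number tag_cnt` separated by spaces. It does not reproduce what `mk-cntlist.py` does, and the function name and sample lines are made up for illustration.

```python
from typing import NamedTuple


class SenseCount(NamedTuple):
    sense_key: str
    sense_number: int
    tag_cnt: int


def parse_cntlist_rev(lines):
    """Parse lines of the form 'sense_key sense_number tag_cnt'."""
    entries = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        entries.append(SenseCount(parts[0], int(parts[1]), int(parts[2])))
    return entries


# Made-up sample lines, not real FiWN counts.
sample = ["koira%1:05:00:: 1 12\n", "koira%1:18:00:: 2 3\n", "\n"]
entries = parse_cntlist_rev(sample)
for entry in entries:
    print(entry.sense_key, entry.sense_number, entry.tag_cnt)
```

To read the real file, pass an open file handle for `data/dict/cntlist.rev` in place of `sample`.
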