Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/dpriskorn/lexfolketslexicon


https://github.com/dpriskorn/lexfolketslexicon

Last synced: 20 days ago
JSON representation

Awesome Lists containing this project

README

        

# LexFolketsLexicon
This is a project to analyze and add identifiers for words
from Folkets Lexicon to Wikidata.
A unique UUID-based identifier have been added to facilitate linking with Wikidata.

## API
An API has been added for easy lookup. The goal is to upload it to Toolforge to facilitate
integration of the data into Wikidata via a property since
this might be a good freely licensed source for Swedish and English lexemes to use.

## Versioning
The output jsonl is versioned v1, v2, etc.
The latest number is always consistent with the model in the code,
old versions are kept for backwards compatibility.
For each new version, completely random ids are generated.

## Statistics
Number of words: 39407
Number of words with sound file: 21947
Number of words missing a lexical category: 7441
Number of words with a lexical category: 31966
Number of words with a lexical category and sound file: 21280
Number of idioms: 1944
Number of examples: 12574

## License
The source code is licensed under GPLv3+
The data directory files are CC-BY-SA 2.5

## Source
Original source is http://folkets-lexikon.csc.kth.se/folkets/
The data was downloaded during may 2023.
No updates to the dump seem to happen over time since the project is not maintained anymore.

## Development
Run the api:
`$ uvicorn api:app --reload`

Build pack:

Make sure pack-cli-bin and docker is installed from AUR:
```
$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 myimage
$ docker run -e PORT=8000 -p 8000:8000 --rm --entrypoint web myimage
# navigate to http://127.0.0.1:8000 to check that it works
```