https://github.com/bennokr/minimel
Minimalist Entity Linking
entity-linking nlp
- Host: GitHub
- URL: https://github.com/bennokr/minimel
- Owner: bennokr
- Created: 2022-07-02T18:31:31.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2025-09-24T14:38:42.000Z (9 months ago)
- Last Synced: 2025-09-24T16:36:44.107Z (9 months ago)
- Topics: entity-linking, nlp
- Language: Jupyter Notebook
- Homepage: https://minimel.readthedocs.io/
- Size: 2.1 MB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 4
Metadata Files:
- Readme: README.md
README
# MinimEL: Minimalist Entity Linking
The `minimel` package provides a framework to create and evaluate small Entity Linking models.
> **Warning**
> This package is still under construction. A release is planned for the summer of 2024.
## App
To run the app, run `cd app` and then `flask run`.
## Evaluation Datasets
- [VoxEL: A Benchmark Dataset for Multilingual Entity Linking](https://figshare.com/articles/dataset/VoxEL/6539675)
- [Entity Linking in 100 Languages](https://github.com/google-research/google-research/tree/master/dense_representations_for_entity_retrieval/mel)
- [Tsai & Roth 2016](https://cogcomp.seas.upenn.edu/page/resource_view/102)
## Ideas
- Per surface form, ignore candidate entities that are an `instanceOf` the top entity
- NER features
- Global binary classifier over (entity features, sentence features) tuples
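The first idea above could be sketched roughly as follows. This is only one possible reading of the note, not code from the `minimel` package: the function name `filter_candidates`, the `counts` and `instance_of` structures, and the entity IDs are all hypothetical, and "top entity" is assumed to mean the candidate most frequently linked from the surface form.

```python
from typing import Dict, Set


def filter_candidates(
    counts: Dict[str, int],
    instance_of: Dict[str, Set[str]],
) -> Dict[str, int]:
    """Drop candidates that are an instance of the top candidate.

    counts:      hypothetical link counts per candidate entity for ONE surface form
    instance_of: hypothetical map from an entity to the classes it is an instance of
    """
    if not counts:
        return {}
    # Assumption: the "top entity" is the most frequently linked candidate.
    top = max(counts, key=counts.get)
    return {
        entity: n
        for entity, n in counts.items()
        # Always keep the top entity; drop anything that is an instance of it.
        if entity == top or top not in instance_of.get(entity, set())
    }


# Made-up example: a surface form that usually links to a general concept
# and occasionally to one specific instance of that concept.
counts = {"E_concept": 90, "E_specific": 7, "E_unrelated": 3}
instance_of = {"E_specific": {"E_concept"}, "E_unrelated": {"E_person"}}

filtered = filter_candidates(counts, instance_of)
print(filtered)
```

In this sketch the instance of the top entity is removed while unrelated candidates stay, so the remaining candidates are the ones a model would actually need to tell apart.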
## Tokenization, Stemming & Lemmatization
- Multi: https://github.com/mingruimingrui/ICU-tokenizer
- Multi: https://pypi.org/project/snowballstemmer/
- Japanese: https://github.com/SamuraiT/tinysegmenter
- Persian: https://github.com/htaghizadeh/PersianStemmer-Python
- Korean: https://pypi.org/project/soylemma/