{"id":27426473,"url":"https://github.com/mrpeerat/mrefined","last_synced_at":"2025-06-19T09:39:55.211Z","repository":{"id":214977859,"uuid":"736120245","full_name":"mrpeerat/mReFinED","owner":"mrpeerat","description":"mReFinED: An Efficient End-to-End Multilingual Entity Linking System","archived":false,"fork":false,"pushed_at":"2024-01-03T14:36:28.000Z","size":230,"stargazers_count":4,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-14T12:55:10.787Z","etag":null,"topics":["deep-learning","entity-linking","machine-learning","multilingual","multilingual-entity-linking","nlp"],"latest_commit_sha":null,"homepage":"https://aclanthology.org/2023.findings-emnlp.1007/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mrpeerat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-12-27T03:30:48.000Z","updated_at":"2025-01-11T00:59:52.000Z","dependencies_parsed_at":null,"dependency_job_id":"598d4bdd-8481-4635-825a-4fa9a8dfadbf","html_url":"https://github.com/mrpeerat/mReFinED","commit_stats":{"total_commits":19,"total_committers":1,"mean_commits":19.0,"dds":0.0,"last_synced_commit":"4b592f48d1b73015674b75da2f6f6c227a1d477a"},"previous_names":["mrpeerat/mrefined"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mrpeerat/mReFinED","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrpeerat%2FmReFinED","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrpeerat%2FmReFinED/tags","releases_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrpeerat%2FmReFinED/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrpeerat%2FmReFinED/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mrpeerat","download_url":"https://codeload.github.com/mrpeerat/mReFinED/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mrpeerat%2FmReFinED/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260726301,"owners_count":23053111,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","entity-linking","machine-learning","multilingual","multilingual-entity-linking","nlp"],"created_at":"2025-04-14T12:40:09.828Z","updated_at":"2025-06-19T09:39:50.189Z","avatar_url":"https://github.com/mrpeerat.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Overview\nWe propose mReFinED, the first end-to-end MEL model. mReFinED supports 9 languages: AR, EN, ES, DE, FA, JA, TA, and TR. 
Our experimental results in the research paper show that mReFinED outperforms the best existing work on the end-to-end MEL task while being 44 times faster than the existing state of the art (mGENRE).\n\n# mReFinED's Paper\nThe mReFinED model architecture is described in the paper below (https://aclanthology.org/2023.findings-emnlp.1007):\n```bibtex\n@inproceedings{limkonchotiwat-etal-2023-mrefined,\n    title = \"m{R}e{F}in{ED}: An Efficient End-to-End Multilingual Entity Linking System\",\n    author = \"Limkonchotiwat, Peerat  and\n      Cheng, Weiwei  and\n      Christodoulopoulos, Christos  and\n      Saffari, Amir  and\n      Lehmann, Jens\",\n    booktitle = \"Findings of the Association for Computational Linguistics: EMNLP 2023\",\n    month = dec,\n    year = \"2023\",\n    address = \"Singapore\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2023.findings-emnlp.1007\",\n    doi = \"10.18653/v1/2023.findings-emnlp.1007\",\n    pages = \"15080--15089\",\n}\n```\n\n## mReFinED\n- This is a replica of mReFinED from [Amazon's mReFinED](https://github.com/amazon-science/ReFinED/tree/mrefined).\n- We improved the training and inference code to make results easier to reproduce.\n- We also provide the mReFinED model and training data :) \n\n## Hardware Requirements\n- mReFinED has low hardware requirements. A GPU is recommended for fast inference, but it is not strictly required.\n- Creating the training data takes 15 days on CPU only. 
However, the process can be sped up to ~2 days using GPUs.\n- Training takes ~10 days on 8 V100 GPUs.\n- Inference uses only a single V100 GPU.\n\n# Model, Data, and Code\n\n## Materials\n- **Model**: XXXXXXX\n- **Training data**: XXXXXXXX\n\n## Example Script\n- mReFinED: Creating training data\n```\ncd mReFinED/src/\nexport PYTHONPATH=$PYTHONPATH:src\npython refined/offline_data_generation/preprocess_all_multilingual_combine.py\n```\n- mReFinED: Training\n```\ncd mReFinED/src/\nexport PYTHONPATH=$PYTHONPATH:src\nbash refined/training/train/multilingual_train.sh\n```\n- Mention detection for unlabeled entities in Wikipedia. This step can be skipped by using [WikiNN](https://huggingface.co/Babelscape/wikineural-multilingual-ner) instead.\n```\ncd mReFinED/src/refined/training/train\npython multilingual_md_train_xtreme.py\npython md_on_wiki.py\npython multilingual_md_train_xtreme_wikipedia.py\n```\n- mReFinED: Inference\n```python\nprint('hi')\n```\n- mReFinED on Mewsli-9\n```python\nprint('hi')\n```\n\n## Security\n\nSee [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.\n\n## License\n\nThis library is licensed under the CC-BY-NC 4.0 License.\n\n## Contact us\nIf you have questions, please open a GitHub issue instead of emailing us, as some of the listed email addresses are no longer active.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrpeerat%2Fmrefined","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmrpeerat%2Fmrefined","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmrpeerat%2Fmrefined/lists"}