{"id":13754298,"url":"https://github.com/seukgcode/MELBench","last_synced_at":"2025-05-09T22:31:47.507Z","repository":{"id":104924023,"uuid":"382542174","full_name":"seukgcode/MELBench","owner":"seukgcode","description":"Multimodal entity linking (MEL) aims to utilize multimodal information to map mentions to corresponding entities defined in knowledge bases. We release three MEL datasets: Weibo-MEL, Wikidata-MEL and Richpedia-MEL, containing 25,602, 18,880 and 17,806 samples from social media, encyclopedia and multimodal knowledge graphs respectively. A MEL dataset construction approach is proposed, including five stages: multimodal information extraction, mention extraction, entity extraction, triple construction and dataset construction. Experiment results demonstrate the usability of the datasets and the distinguishability between baseline models.","archived":false,"fork":false,"pushed_at":"2021-07-31T14:46:23.000Z","size":9219,"stargazers_count":80,"open_issues_count":4,"forks_count":4,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-11-16T07:33:18.768Z","etag":null,"topics":["entity-linking","knowledge-graph","multimodal"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seukgcode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-03T06:23:27.000Z","updated_at":"2024-10-31T23:00:45.000Z","dependencies_parsed_at":"2023-11-30T11:00:24.087Z","dependency_job_id":null,"html_url":"https://github.com/seukgcode/MELBench","commit_stats":null,"previous_names
":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seukgcode%2FMELBench","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seukgcode%2FMELBench/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seukgcode%2FMELBench/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seukgcode%2FMELBench/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seukgcode","download_url":"https://codeload.github.com/seukgcode/MELBench/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253335751,"owners_count":21892727,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entity-linking","knowledge-graph","multimodal"],"created_at":"2024-08-03T09:01:53.601Z","updated_at":"2025-05-09T22:31:42.466Z","avatar_url":"https://github.com/seukgcode.png","language":null,"funding_links":[],"categories":["知识图谱"],"sub_categories":["其他_文本生成、文本对话"],"readme":"# Multimodal Entity Linking Datasets Benchmark\n\n## 1. Abstract\n\nMultimodal entity linking (MEL) aims to utilize multimodal information to map mentions to corresponding entities defined in knowledge bases. We release three MEL datasets: Weibo-MEL, Wikidata-MEL and Richpedia-MEL, containing 25,602, 18,880 and 17,806 samples from social media, encyclopedia and multimodal knowledge graphs respectively. 
A MEL dataset construction approach is proposed, including five stages: multimodal information extraction, mention extraction, entity extraction, triple construction and dataset construction. Experiment results demonstrate the usability of the datasets and the distinguishability between baseline models.\n\n## 2. Introduction\n\nTo construct large-scale MEL datasets, we propose a dataset construction approach with five stages. In **Multimodal Information Extraction**, we select multimodal data sources and extract textual and visual information. In **Mention Extraction**, we extract mentions from the textual information and keep those for which a corresponding entity may exist. In **Entity Extraction**, we query the knowledge bases with the filtered mentions, gather the entity lists, and save the correct entities. In **Triple Construction**, we merge the corresponding mentions and entities into mention-entity (M-E) pairs and combine them with the textual and visual information into triples. We then keep the correct triples as the samples of the MEL dataset. Finally, in the **Dataset Construction** stage, we partition the dataset into a training set (70%), a validation set (10%) and a testing set (20%). 
An overview of the approach is shown in the figure below.\n\n  ![image-20210704105704949](https://markdown-bluestragglers.oss-cn-beijing.aliyuncs.com/image-20210704105704949.png)\n\n## 3. Visual Information and Knowledge Graph Resources\n\nBecause the visual information and knowledge graph resources are large, we have deposited them on Baidu Cloud Disk.\n\nVisual information download addresses:\n* [Weibo-MEL Visual Information](https://pan.baidu.com/s/1VTzzKXpORziookJiHKwWKw)\n* [Wikidata-MEL Visual Information](https://pan.baidu.com/s/1FbhgMZ-w2DdAPLgCBDvKtQ)\n* [Richpedia-MEL Visual Information](https://pan.baidu.com/s/1lt-SmWUX5GAmLRNWggDkXQ?from=init#list/path=%2Fsharelink653312845-459959024382112%2Fimage\u0026parentPath=%2Fsharelink653312845-459959024382112)\n\nKnowledge graph download addresses:\n* [Wikidata Knowledge Graph for Weibo-MEL and Wikidata-MEL](https://dumps.wikimedia.org/wikidatawiki/entities/latest-all.json.bz2)\n* [Richpedia Knowledge Graph for Richpedia-MEL](https://pan.baidu.com/s/1FTEwSV6CystQHT_wgdjYEA)\n\nThe extraction code for the Baidu Cloud Disk links is 2021.\n\n## 4. Samples of the MEL datasets\n\n### 4.1 Weibo-MEL dataset\n\n**Visual information:**\n\n* ![image-20210703210612299](https://markdown-bluestragglers.oss-cn-beijing.aliyuncs.com/image-20210703210612299.png)\n\n**Textual information:**\n\n* 【感谢30年的坚守！#辽宁舰首位一级军士长退役# 】近日，辽宁舰为首位一级军士长刘德波举行了隆重的退役仪式。刘德波，1990年12月入伍，服役期间，荣获全军士官优秀人才奖二等奖1次，荣立二等功1次，三等功4次。入伍30年，他坚守在不见阳光的深舱，始终与高温和热浪相伴，穿梭于管路和设备之间，把最美好的青春年华都献给了心爱的战舰。致敬！（北海舰队）#我为中国军人点赞# [组图共9张] 原图\n\n**Mention-entity pairs:**\n\n* \"刘德波\": \"刘德波\"\n* \"中国\": \"中国（世界四大文明古国之一）\"\n* \"辽宁舰\": \"中国人民解放军海军辽宁舰\"\n\n### 4.2 Wikidata-MEL dataset\n\n**Visual information:**\n\n* ![image-20210703212658556](https://markdown-bluestragglers.oss-cn-beijing.aliyuncs.com/image-20210703212658556.png)\n\n**Textual information:**\n\n* Seattle Mayor Charles L. Smith (left) with Will Rogers, circa 1935.\n\n**Mention-entity pairs:**\n\n* \"Charles L. Smith\": \"Charles L. 
Smith (Seattle politician)\"\n* \"Will Rogers\": \"Will Rogers\"\n\n### 4.3 Richpedia-MEL dataset\n\n**Visual information:**\n\n* ![image-20210703212710693](https://markdown-bluestragglers.oss-cn-beijing.aliyuncs.com/image-20210703212710693.png)\n\n**Textual information:**\n\n* Washington resigned his commission after the Treaty of Paris in 1783.\n\n**Mention-entity pairs:**\n\n* \"Washington\": \"George Washington\"\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseukgcode%2FMELBench","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseukgcode%2FMELBench","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseukgcode%2FMELBench/lists"}