{"id":18768610,"url":"https://github.com/poteminr/multiconer","last_synced_at":"2026-05-01T17:35:53.870Z","repository":{"id":104649314,"uuid":"565924737","full_name":"poteminr/MultiCoNER","owner":"poteminr","description":"Contrastive learning for multilingual complex named entity recognition. Bert + CRF model.","archived":false,"fork":false,"pushed_at":"2023-03-01T13:29:46.000Z","size":39,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-20T23:13:03.857Z","etag":null,"topics":["bert","conll","contrastive-learning","crf","named-entity-recognition","natural-language-processing","ner","pytorch","wandb"],"latest_commit_sha":null,"homepage":"https://multiconer.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/poteminr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-11-14T16:00:26.000Z","updated_at":"2025-03-26T12:13:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"f46c323c-83da-4846-85d7-186a3e55fd75","html_url":"https://github.com/poteminr/MultiCoNER","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/poteminr/MultiCoNER","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poteminr%2FMultiCoNER","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poteminr%2FMultiCoNER/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poteminr%2FMultiCoNER/releases","manifests_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poteminr%2FMultiCoNER/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/poteminr","download_url":"https://codeload.github.com/poteminr/MultiCoNER/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poteminr%2FMultiCoNER/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32507091,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","conll","contrastive-learning","crf","named-entity-recognition","natural-language-processing","ner","pytorch","wandb"],"created_at":"2024-11-07T19:13:16.729Z","updated_at":"2026-05-01T17:35:53.861Z","avatar_url":"https://github.com/poteminr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MultiCoNER\n\u003eComplex named entities (NE), like the titles of creative works, are not simple nouns and pose challenges for NER systems (Ashwini and Choi, 2014). 
They can take the form of any linguistic constituent, like an imperative clause (“Dial M for Murder”), and do not look like traditional NEs (Persons, Locations, etc.).\n\nThis repository contains a solution for *[SemEval 2023 Task 2: MultiCoNER II\nMultilingual Complex Named Entity Recognition](https://multiconer.github.io/)* and **will contain additional research on Multilingual Named Entity Recognition approaches**.\n\n## Dataset\n\n\nThe tagset of MultiCoNER is a fine-grained tagset.\n\nThe fine-to-coarse mapping of the tags is as follows:\n\n    Location (LOC) : Facility, OtherLOC, HumanSettlement, Station\n    Creative Work (CW) : VisualWork, MusicalWork, WrittenWork, ArtWork, Software\n    Group (GRP) : MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG\n    Person (PER) : Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER\n    Product (PROD) : Clothing, Vehicle, Food, Drink, OtherPROD\n    Medical (MED) : Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease\n\n**Example**\n\u003eEnglish: [wes anderson | Artist]'s film [the grand budapest hotel | VisualWork] opened the festival .\n\n\u003eUkrainian: назва альбому походить з роману « [кінець дитинства | WrittenWork] » англійського письменника [артура кларка | Artist] .\n\n## Approach\nA Transformer model is fine-tuned in two stages.\n### Contrastive learning \nThe first stage is contrastive learning, which aims to change the distances between the word/sub-word embeddings produced by the Transformer model. \nFor example, embeddings of named entities of different types are pushed far apart, while embeddings of entities of the same type are pulled close together. 
\n\nThis stage is based on ideas from [**Contrastive fine-tuning to improve generalization in deep NER**](https://www.dialog-21.ru/media/5751/bondarenkoi113.pdf) (see Section 3.1, Contrastive fine-tuning).\n\n\u003eThe SiameseDataset class is in *utils/dataset.py* and the ContrastiveTrainer class is in *trainer.py*.\n\n### Fine-tuned BERT + Conditional Random Field (CoBertCRF)\nThe second stage trains the BERT model fine-tuned in the first stage, together with a CRF layer, for the token classification task (NER). \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoteminr%2Fmulticoner","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoteminr%2Fmulticoner","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoteminr%2Fmulticoner/lists"}