{"id":28795732,"url":"https://github.com/ukplab/semeval2017-scienceie","last_synced_at":"2025-10-11T03:06:43.842Z","repository":{"id":66147426,"uuid":"83324655","full_name":"UKPLab/semeval2017-scienceie","owner":"UKPLab","description":"Code for keyphrase classification systems submitted to the SemEval 2017 shared task ScienceIE.","archived":false,"fork":false,"pushed_at":"2018-06-12T11:41:26.000Z","size":95,"stargazers_count":36,"open_issues_count":1,"forks_count":10,"subscribers_count":32,"default_branch":"master","last_synced_at":"2025-06-18T03:10:03.967Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/UKPLab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-02-27T15:19:53.000Z","updated_at":"2025-01-21T04:00:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"edc00d35-a587-45ac-9013-1406b3048625","html_url":"https://github.com/UKPLab/semeval2017-scienceie","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/UKPLab/semeval2017-scienceie","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UKPLab%2Fsemeval2017-scienceie","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UKPLab%2Fsemeval2017-scienceie/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UKPLab%2Fsemeval2017-scienceie/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UKPLab%2Fsemeval2017-scienceie/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/UKPLab","download_url":"https://codeload.github.com/UKPLab/semeval2017-scienceie/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UKPLab%2Fsemeval2017-scienceie/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279006060,"owners_count":26084026,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-18T03:10:04.136Z","updated_at":"2025-10-11T03:06:43.836Z","avatar_url":"https://github.com/UKPLab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION\n## SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications.\n\n\nThis repository contains the code needed to reproduce our results for the shared task [ScienceIE] [science-ie] reported in Eger et al., *[EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION](https://www.aclweb.org/anthology/S/S17/S17-2163.pdf)*. 
\n\nPlease cite the paper as:\n\n```\n@InProceedings{semeval2017-eger-eelection,\n  author    = {Eger, Steffen and Do Dinh, Erik-Lân and Kutsnezov, Ilia and Kiaeeha, Masoud and Gurevych, Iryna},\n  title     = {{EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION}},\n  booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017)},\n  month     = {August},\n  year      = {2017},\n  address   = {Vancouver, Canada},\n  publisher = {Association for Computational Linguistics},\n  pages     = {942--946},\n  url       = {https://github.com/UKPLab/semeval2017-scienceie}\n}\n```\n\n\u003e **Abstract:** This paper describes our approach to the SemEval 2017 Task 10: \"Extracting Keyphrases and Relations from Scientific Publications\", specifically to Subtask (B): \"Classification of identified keyphrases\".\n\u003e We explored three different deep learning approaches: a character-level convolutional neural network (CNN), a stacked learner with an MLP meta-classifier, and an attention based Bi-LSTM. From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data. Our approach ranks 2nd (score of 1st placed system: 0.64) out of four according to this official score. \n\u003e However, we erroneously trained 2 out of 3 neural nets (the stacker and the CNN) on only roughly 15% of the full data, namely, the original development set. 
When trained on the full data (training+development), our ensemble has a micro-F1-score of 0.69.\n\nContact persons: \n  * Steffen Eger, eger@ukp.informatik.tu-darmstadt.de\n  * Erik-Lân Do Dinh, dodinh@ukp.informatik.tu-darmstadt.de\n  * Ilia Kutsnezov, kutsnezov@ukp.informatik.tu-darmstadt.de\n  * Masoud Kiaeeha, kiaeeha@ukp.informatik.tu-darmstadt.de\n\nhttps://www.ukp.tu-darmstadt.de/\n\nhttps://www.tu-darmstadt.de/\n\nDon't hesitate to contact us if something is broken (and it shouldn't be) or if you have further questions.\n\n\u003e This repository contains experimental software and \nis published for the sole purpose of giving additional \nbackground details on the respective publication. \n\n## Project structure\n\n* `code/`\n   * `crawl/` -- this folder contains scripts to crawl additional Elsevier articles\n   * `skip-thoughts/` -- document classifier, incorporating code from https://bitbucket.org/TomKenter/siamese-cbow/src\n* `data/` -- the data can be obtained from the shared task website: https://scienceie.github.io/resources.html\n* `scripts_submission/` -- shell scripts for running the individual systems\n* `scripts/` -- evaluation scripts provided by the task organizers\n* `requirements.txt` -- a text file with the names of the required Python modules\n\n## Requirements\n\n* 64-bit Linux versions (not tested on other platforms)\n* Python 2.7\n* Python modules in the `requirements.txt` file\n* [keras] with [tensorflow] or [theano]\n* Suitable word embeddings in **text format** (see below)\n\n## Running the experiments\n\nTo run the experiments described in our paper you have to acquire the following resources.\n\nPut the following embeddings into `data/embeddings`:\n* GloVe word embeddings: [glove] (glove.6B.zip and glove.42B.300d.zip)\n* Komninos word embeddings: [komninos] (wiki_extvec.gz)\n* Levy word embeddings: [levy] (Bag of Words (k = 2) [words])\n\nPut the training, dev and test data into `data/train`, `data/dev` and `data/test`, respectively. 
For running the experiment scripts below, also create `data/combined`, and copy the `train` and `dev` data into it.\n* Training and test data: [science-ie-data]\n\nFurther, the keras version has a bug regarding unicode, which has to be fixed as e.g. described in [keras-fix].\n\nThe scripts to start the experiments can be found in `scripts_submission`.\n\n   [keras]: \u003chttps://keras.io/\u003e\n   [tensorflow]: \u003chttps://www.tensorflow.org/\u003e\n   [theano]: \u003chttps://github.com/Theano/Theano\u003e\n   [glove]: \u003chttp://nlp.stanford.edu/projects/glove\u003e\n   [komninos]: \u003chttps://www.cs.york.ac.uk/nlp/extvec/\u003e\n   [levy]: \u003chttps://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/\u003e\n   [science-ie]: \u003chttps://scienceie.github.io/\u003e\n   [science-ie-data]: \u003chttps://scienceie.github.io/resources.html\u003e\n   [keras-fix]: \u003chttps://github.com/fchollet/keras/issues/1072#issuecomment-241682313\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fukplab%2Fsemeval2017-scienceie","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fukplab%2Fsemeval2017-scienceie","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fukplab%2Fsemeval2017-scienceie/lists"}