{"id":13566882,"url":"https://github.com/LIAAD/yake","last_synced_at":"2025-04-04T00:32:23.097Z","repository":{"id":34259420,"uuid":"157896616","full_name":"LIAAD/yake","owner":"LIAAD","description":"Single-document unsupervised keyword extraction","archived":false,"fork":false,"pushed_at":"2024-01-05T18:42:36.000Z","size":861,"stargazers_count":1642,"open_issues_count":33,"forks_count":227,"subscribers_count":30,"default_branch":"master","last_synced_at":"2024-10-29T15:35:04.921Z","etag":null,"topics":["corpus-independent","domain-and-language-independent","keyword-extraction","single-document","unsupervised-approach"],"latest_commit_sha":null,"homepage":"https://liaad.github.io/yake","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LIAAD.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst"}},"created_at":"2018-11-16T16:55:43.000Z","updated_at":"2024-10-29T06:22:39.000Z","dependencies_parsed_at":"2022-07-20T03:03:20.252Z","dependency_job_id":"b34024b2-a00b-43cf-9d46-cacfb091a2d9","html_url":"https://github.com/LIAAD/yake","commit_stats":{"total_commits":129,"total_committers":17,"mean_commits":7.588235294117647,"dds":0.6666666666666667,"last_synced_commit":"374fc1c1c19eb080d5b6115cbb8d4a4324392e54"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LIAAD%2Fyake","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LIAAD%2Fyake/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LIAAD%2Fyake/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LIAAD%2Fyake/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LIAAD","download_url":"https://codeload.github.com/LIAAD/yake/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247061883,"owners_count":20877176,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["corpus-independent","domain-and-language-independent","keyword-extraction","single-document","unsupervised-approach"],"created_at":"2024-08-01T13:02:18.789Z","updated_at":"2025-04-04T00:32:23.069Z","avatar_url":"https://github.com/LIAAD.png","language":"Python","funding_links":[],"categories":["Python","Algorithms"],"sub_categories":["Other Algorithms"],"readme":"# Yet Another Keyword Extractor (Yake)\n\nUnsupervised Approach for Automatic Keyword Extraction using Text Features.\n\nYAKE! is a light-weight unsupervised automatic keyword extraction method which rests on text statistical features extracted from single documents to select the most important keywords of a text. Our system does not need to be trained on a particular set of documents, neither it depends on dictionaries, external-corpus, size of the text, language or domain. To demonstrate the merits and the significance of our proposal, we compare it against ten state-of-the-art unsupervised approaches (TF.IDF, KP-Miner, RAKE, TextRank, SingleRank, ExpandRank, TopicRank, TopicalPageRank, PositionRank and MultipartiteRank), and one supervised method (KEA). 
Experimental results on twenty datasets (see the Benchmark section below) show that our methods significantly outperform state-of-the-art methods across collections of different sizes, languages, and domains. In addition to the Python package described here, we also make available a \u003ca href=\"http://yake.inesctec.pt\" target=\"_blank\"\u003edemo\u003c/a\u003e, an \u003ca href=\"http://yake.inesctec.pt/apidocs/#!/available_methods/post_yake_v2_extract_keywords\" target=\"_blank\"\u003eAPI\u003c/a\u003e and a \u003ca href=\"https://play.google.com/store/apps/details?id=com.yake.yake\" target=\"_blank\"\u003emobile app\u003c/a\u003e.\n\n## Main Features\n\n* Unsupervised approach\n* Corpus-Independent\n* Domain and Language Independent\n* Single-Document\n\n## Benchmark\n\nFor benchmark results, check out our paper published in the Information Sciences journal (see the References section).\n\n## Rationale\n\nExtracting keywords from texts has become a challenge for individuals and organizations as information grows in complexity and size. The need to automate this task so that texts can be processed in a timely and adequate manner has led to the emergence of automatic keyword extraction tools. Despite the advances, there is a clear lack of multilingual online tools to automatically extract keywords from single documents. YAKE! is a novel feature-based system for multilingual keyword extraction, which supports texts of different sizes, domains, and languages. Unlike other approaches, YAKE! does not rely on dictionaries or thesauri, nor is it trained against any corpora. Instead, it follows an unsupervised approach that builds upon features extracted from the text, making it applicable to documents written in different languages without the need for further knowledge. 
This can be beneficial for a large number of tasks and a plethora of situations where access to training corpora is either limited or restricted.\n\n## Where can I find YAKE!?\nYAKE! is available [online](http://yake.inesctec.pt), on [Google Play](https://play.google.com/store/apps/details?id=com.yake.yake), as an [open source Python package](https://github.com/LIAAD/yake) and as an [API](http://yake.inesctec.pt/apidocs/#/available_methods/post_yake_v2_extract_keywords).\n\n## Installing YAKE!\n\nThere are three installation alternatives.\n\n- To run YAKE! on the command line (say, to integrate it in a script) without an HTTP server on top, you can use our [simple YAKE! Docker image](#cli-image). This container will allow you to run keyword extraction as a command, and then exit.\n- To run YAKE! as an HTTP server featuring a RESTful API (say, to integrate it in a web application or host your own YAKE!), you can use our [RESTful API server image](#rest-api-image). This container/server *will run in the background*.\n- To install YAKE! straight \"on the metal\" or integrate it in your Python app, you can [install it and its dependencies](#standalone-installation).\n\n\u003ca name=\"cli-image\"\u003e\u003c/a\u003e\n### Option 1. YAKE as a CLI utility inside a Docker container\n\nFirst, install Docker. Ubuntu users, please see our [script below](#installing-docker) for a complete installation script.\n\nThen, run:\n\n```bash\ndocker run liaad/yake:latest -ti \"Caffeine is a central nervous system (CNS) stimulant of the methylxanthine class.[10] It is the world's most widely consumed psychoactive drug. Unlike many other psychoactive substances, it is legal and unregulated in nearly all parts of the world. There are several known mechanisms of action to explain the effects of caffeine. The most prominent is that it reversibly blocks the action of adenosine on its receptor and consequently prevents the onset of drowsiness induced by adenosine. 
Caffeine also stimulates certain portions of the autonomic nervous system.\"\n```\n*Example text from Wikipedia*\n\n\u003ca name=\"rest-api-image\"\u003e\u003c/a\u003e\n### Option 2. REST API Server in a Docker container\n\nThis install will provide you a mirror of the original REST API of YAKE! available [here](https://boiling-castle-88317.herokuapp.com).\n\n```bash\ndocker run -p 5000:5000 -d liaad/yake-server:latest\n```\n\nAfter it starts up, the container will run in the background, at http://127.0.0.1:5000. To access the YAKE! API documentation, go to http://127.0.0.1:5000/apidocs/.\n\nYou can test the RESTful API using `curl`:\n\n```bash\ncurl -X POST \"http://localhost:5000/yake/\" -H \"accept: application/json\" -H \"Content-Type: application/json\" \\\n-d @- \u003c\u003c'EOF'\n{\n  \"language\": \"en\",\n  \"max_ngram_size\": 3,\n  \"number_of_keywords\": 10,\n  \"text\": \"Sources tell us that Google is acquiring Kaggle, a platform that hosts data science and machine learning competitions. Details about the transaction remain somewhat vague , but given that Google is hosting its Cloud Next conference in San Francisco this week, the official announcement could come as early as tomorrow. Reached by phone, Kaggle co-founder CEO Anthony Goldbloom declined to deny that the acquisition is happening. Google itself declined 'to comment on rumors'. Kaggle, which has about half a million data scientists on its platform, was founded by Goldbloom and Ben Hamner in 2010. The service got an early start and even though it has a few competitors like DrivenData, TopCoder and HackerRank, it has managed to stay well ahead of them by focusing on its specific niche. The service is basically the de facto home for running data science and machine learning competitions. 
With Kaggle, Google is buying one of the largest and most active communities for data scientists ...\"\n}\nEOF\n```\n*Example text from Wikipedia*\n\n\u003ca name=\"standalone-installation\"\u003e\u003c/a\u003e\n### Option 3. Standalone Installation (for development or integration)\n\n#### Requirements\n\nPython3\n\n#### Installation\n\nTo install Yake using pip:\n\n``` bash\npip install git+https://github.com/LIAAD/yake\n```\n\nTo upgrade using pip:\n\n``` bash\npip install git+https://github.com/LIAAD/yake --upgrade\n```\n#### Usage (Command line)\n\nHow to use it from your favorite command line:\n``` bash\nUsage: yake [OPTIONS]\n\nOptions:\n\t-ti, --text_input TEXT          Input text, SURROUNDED by single quotes(')\n\t-i, --input_file TEXT           Input file\n\t-l, --language TEXT             Language\n\t-n, --ngram-size INTEGER        Max size of the ngram.\n\t-df, --dedup-func [leve|jaro|seqm]\n\t\t\t\t\t\t\t\t\tDeduplication function.\n\t-dl, --dedup-lim FLOAT          Deduplication limiar.\n\t-ws, --window-size INTEGER      Window size.\n\t-t, --top INTEGER               Number of keyphrases to extract\n\t-v, --verbose\t\t\tGets detailed information (such as the score)\n\t--help                          Show this message and exit.\n``` \n#### Usage (Python)\n\nHow to use it in Python:\n\n``` python\nimport yake\n\ntext = \"Sources tell us that Google is acquiring Kaggle, a platform that hosts data science and machine learning \"\\\n\"competitions. Details about the transaction remain somewhat vague, but given that Google is hosting its Cloud \"\\\n\"Next conference in San Francisco this week, the official announcement could come as early as tomorrow. \"\\\n\"Reached by phone, Kaggle co-founder CEO Anthony Goldbloom declined to deny that the acquisition is happening. \"\\\n\"Google itself declined 'to comment on rumors'. Kaggle, which has about half a million data scientists on its platform, \"\\\n\"was founded by Goldbloom  and Ben Hamner in 2010. 
\"\\\n\"The service got an early start and even though it has a few competitors like DrivenData, TopCoder and HackerRank, \"\\\n\"it has managed to stay well ahead of them by focusing on its specific niche. \"\\\n\"The service is basically the de facto home for running data science and machine learning competitions. \"\\\n\"With Kaggle, Google is buying one of the largest and most active communities for data scientists - and with that, \"\\\n\"it will get increased mindshare in this community, too (though it already has plenty of that thanks to Tensorflow \"\\\n\"and other projects). Kaggle has a bit of a history with Google, too, but that's pretty recent. Earlier this month, \"\\\n\"Google and Kaggle teamed up to host a $100,000 machine learning competition around classifying YouTube videos. \"\\\n\"That competition had some deep integrations with the Google Cloud Platform, too. Our understanding is that Google \"\\\n\"will keep the service running - likely under its current name. While the acquisition is probably more about \"\\\n\"Kaggle's community than technology, Kaggle did build some interesting tools for hosting its competition \"\\\n\"and 'kernels', too. On Kaggle, kernels are basically the source code for analyzing data sets and developers can \"\\\n\"share this code on the platform (the company previously called them 'scripts'). \"\\\n\"Like similar competition-centric sites, Kaggle also runs a job board, too. It's unclear what Google will do with \"\\\n\"that part of the service. According to Crunchbase, Kaggle raised $12.5 million (though PitchBook says it's $12.75) \"\\\n\"since its   launch in 2010. 
Investors in Kaggle include Index Ventures, SV Angel, Max Levchin, Naval Ravikant, \"\\\n\"Google chief economist Hal Varian, Khosla Ventures and Yuri Milner \"\n```\n\n#### Assuming default parameters\n```python\nkw_extractor = yake.KeywordExtractor()\nkeywords = kw_extractor.extract_keywords(text)\n\nfor kw in keywords:\n    print(kw)\n```\n\n#### Specifying parameters\n```python\nlanguage = \"en\"\nmax_ngram_size = 3\ndeduplication_threshold = 0.9\ndeduplication_algo = 'seqm'\nwindowSize = 1\nnumOfKeywords = 20\n\ncustom_kw_extractor = yake.KeywordExtractor(lan=language, n=max_ngram_size, dedupLim=deduplication_threshold, dedupFunc=deduplication_algo, windowsSize=windowSize, top=numOfKeywords, features=None)\nkeywords = custom_kw_extractor.extract_keywords(text)\n\nfor kw in keywords:\n    print(kw)\n```\n\n#### Output\nThe lower the score, the more relevant the keyword is.\n``` bash\n('google', 0.026580863364597897)\n('kaggle', 0.0289005976239829)\n('ceo anthony goldbloom', 0.029946071606210194)\n('san francisco', 0.048810837074825336)\n('anthony goldbloom declined', 0.06176910090701819)\n('google cloud platform', 0.06261974476422487)\n('co-founder ceo anthony', 0.07357749587020043)\n('acquiring kaggle', 0.08723571551039863)\n('ceo anthony', 0.08915156857226395)\n('anthony goldbloom', 0.09123482372372106)\n('machine learning', 0.09147989238151344)\n('kaggle co-founder ceo', 0.093805063905847)\n('data', 0.097574333771058)\n('google cloud', 0.10260128641464673)\n('machine learning competitions', 0.10773000650607861)\n('francisco this week', 0.11519915079240485)\n('platform', 0.1183512305596321)\n('conference in san', 0.12392066376108138)\n('service', 0.12546743261462942)\n('goldbloom', 0.14611408778815776)\n```\n\n### Highlighting Feature\nThe highlighting feature tags every keyword in the text with the default tag `\u003ckw\u003e`.\n\n``` python\n\nfrom yake.highlight import TextHighlighter\n\nth = TextHighlighter(max_ngram_size = 3)\nth.highlight(text, 
keywords)\n\n```\n#### Output\nBy default, keywords will be highlighted using the tag 'kw'.\n``` \nSources tell us that \u003ckw\u003egoogle\u003c/kw\u003e is \u003ckw\u003eacquiring kaggle\u003c/kw\u003e, a platform that \u003ckw\u003ehosts data science\u003c/kw\u003e and \u003ckw\u003emachine learning\u003c/kw\u003e competitions. Details about the transaction remain somewhat vague , but given that \u003ckw\u003egoogle\u003c/kw\u003e is hosting its Cloud Next conference in \u003ckw\u003esan francisco\u003c/kw\u003e this week, the official announcement could come as early as tomorrow.  Reached by phone, Kaggle co-founder \u003ckw\u003eceo anthony goldbloom\u003c/kw\u003e declined to deny that the acquisition is happening. \u003ckw\u003egoogle\u003c/kw\u003e itself declined 'to comment on rumors'.\n.....\n.....\n```\n\n\n### Custom Highlighting Feature\nBesides tagging a text with the default tag, users can also specify their own custom highlight. In the following example, the tag `\u003cspan class='my_class' \u003e` makes use of a hypothetical class `my_class`, whose purpose would be to highlight the relevant keywords in white.\n\n#### Output\n```python\n\nfrom yake.highlight import TextHighlighter\nth = TextHighlighter(max_ngram_size = 3, highlight_pre = \"\u003cspan class='my_class' \u003e\", highlight_post= \"\u003c/span\u003e\")\nth.highlight(text, keywords)\n```\n\n```\nSources tell us that \u003cspan class='my_class' \u003egoogle\u003c/span\u003e is \u003cspan class='my_class' \u003eacquiring kaggle\u003c/span\u003e, a platform that \u003cspan class='my_class' \u003ehosts data science\u003c/span\u003e and \u003cspan class='my_class' \u003emachine learning\u003c/span\u003e competitions. 
Details about the transaction remain somewhat vague , but given that \u003cspan class='my_class' \u003egoogle\u003c/span\u003e is hosting its Cloud Next conference in \u003cspan class='my_class' \u003esan francisco\u003c/span\u003e this week, the official announcement could come as early as tomorrow.  Reached by phone, Kaggle co-founder \u003cspan class='my_class' \u003eceo anthony goldbloom\u003c/span\u003e declined to deny that the acquisition is happening. \u003cspan class='my_class' \u003egoogle\u003c/span\u003e itself declined 'to comment on rumors'.\n.....\n.....\n```\n\n### Languages other than English\nWhile English (`en`) is the default language, users can use YAKE! to extract keywords from any language by specifying the corresponding language code. The example below shows how to extract keywords from a Portuguese text.\n``` python\ntext = '''\n\"Conta-me Histórias.\" Xutos inspiram projeto premiado. A plataforma \"Conta-me Histórias\" foi distinguida com o Prémio Arquivo.pt, atribuído a trabalhos inovadores de investigação ou aplicação de recursos preservados da Web, através dos serviços de pesquisa e acesso disponibilizados publicamente pelo Arquivo.pt . Nesta plataforma em desenvolvimento, o utilizador pode pesquisar sobre qualquer tema e ainda executar alguns exemplos predefinidos. Como forma de garantir a pluralidade e diversidade de fontes de informação, esta são utilizadas 24 fontes de notícias eletrónicas, incluindo a TSF. 
Uma versão experimental (beta) do \"Conta-me Histórias\" está disponível aqui.\nA plataforma foi desenvolvida por Ricardo Campos investigador do LIAAD do INESC TEC e docente do Instituto Politécnico de Tomar, Arian Pasquali e Vitor Mangaravite, também investigadores do LIAAD do INESC TEC, Alípio Jorge, coordenador do LIAAD do INESC TEC e docente na Faculdade de Ciências da Universidade do Porto, e Adam Jatwot docente da Universidade de Kyoto.\n'''\n\ncustom_kw_extractor = yake.KeywordExtractor(lan=\"pt\")\nkeywords = custom_kw_extractor.extract_keywords(text)\n\nfor kw in keywords:\n    print(kw)\n```\n\n#### Output\n``` bash\n('conta-me histórias', 0.006225012963810038)\n('liaad do inesc', 0.01899063587015275)\n('inesc tec', 0.01995432290332246)\n('conta-me', 0.04513273690417472)\n('histórias', 0.04513273690417472)\n('prémio arquivo.pt', 0.05749361520927859)\n('liaad', 0.07738867367929901)\n('inesc', 0.07738867367929901)\n('tec', 0.08109398065524037)\n('xutos inspiram projeto', 0.08720742489353424)\n('inspiram projeto premiado', 0.08720742489353424)\n('adam jatwot docente', 0.09407053486771558)\n('arquivo.pt', 0.10261392141666957)\n('alípio jorge', 0.12190479662535166)\n('ciências da universidade', 0.12368384021490342)\n('ricardo campos investigador', 0.12789997272332762)\n('politécnico de tomar', 0.13323587141127738)\n('arian pasquali', 0.13323587141127738)\n('vitor mangaravite', 0.13323587141127738)\n('preservados da web', 0.13596322680882506)\n```\n\n## Related projects\n\n### YAKE! Mobile APP\nYAKE! is now available on [Google Play](https://play.google.com/store/apps/details?id=com.yake.yake)\n\n### `pke` - python keyphrase extraction\n\nhttps://github.com/boudinfl/pke - `pke` is an **open source** python-based **keyphrase extraction** toolkit. It\nprovides an end-to-end keyphrase extraction pipeline in which each component can\nbe easily modified or extended to develop new models. 
`pke` also allows for\neasy benchmarking of state-of-the-art keyphrase extraction models, and\nships with supervised models trained on the SemEval-2010 dataset (http://aclweb.org/anthology/S10-1004).\n\nCredits to https://github.com/boudinfl\n\n### `SparkNLP` - State of the Art Natural Language Processing framework\nhttps://github.com/JohnSnowLabs/spark-nlp - `SparkNLP` from [John Snow Labs](https://www.johnsnowlabs.com/) is an open source framework with full Python, Scala, and Java support. Check [their documentation](https://nlp.johnsnowlabs.com/docs/en/annotators#yakekeywordextraction), [demo](https://demo.johnsnowlabs.com/public/KEYPHRASE_EXTRACTION/) and [Google Colab notebook](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/KEYPHRASE_EXTRACTION.ipynb). A video on how to use Spark NLP with YAKE can also be found here: https://events.johnsnowlabs.com/john-snow-labs-nlu-become-a-data-science-superhero-with-one-line-of-python-code\n\n### `General Index` by Archive.org\nhttps://archive.org/details/GeneralIndex - A catalogue of 19 billion YAKE keywords extracted from 107 million papers. An article about the General Index project can also be found in [Nature](https://www.nature.com/articles/d41586-021-02895-8).\n\n### `textacy` - NLP, before and after spaCy\n\nhttps://github.com/chartbeat-labs/textacy - `textacy` is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. Among other features, it supports keyword extraction using YAKE.\n\nCredits to https://github.com/chartbeat-labs\n\n\u003ca name=\"installing-docker\"\u003e\u003c/a\u003e\n\n### `Annif` - Tool for automated subject indexing and classification\nhttps://github.com/NatLibFi/Annif/ - `Annif` is a multi-algorithm automated subject indexing tool for libraries, archives and museums. 
This repository is used for developing a production version of the system, based on ideas from the initial prototype. Official website http://annif.org/.\n\n### `Portulan Clarin` - Services and data for researchers, innovators, students and language professionals\nhttps://portulanclarin.net/workbench/liaad-yake/ - `Portulan Clarin` is a Research Infrastructure for the Science and Technology of Language, belonging to the Portuguese National Roadmap of  Research Infrastructures of Strategic Relevance, and part of the international research infrastructure CLARIN ERIC. It includes a demo of YAKE! among many other language technologies. Official website https://portulanclarin.net/.\n\n## How to install Docker\n\nHere is the \"just copy and paste\" installations script for Docker in Ubuntu. Enjoy.\n\n```bash\n# Install dependencies\nsudo apt-get update\nsudo apt-get install \\\n    apt-transport-https \\\n    ca-certificates \\\n    curl \\\n    software-properties-common\n\n# Add Docker repo\ncurl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -\nsudo apt-key fingerprint 0EBFCD88\nsudo add-apt-repository \\\n   \"deb [arch=amd64] https://download.docker.com/linux/ubuntu \\\n   $(lsb_release -cs) \\\n   stable\"\nsudo apt-get update\n\n# Install Docker\nsudo apt-get install -y docker-ce\n\n# Start Docker Daemon\nsudo service docker start\n\n# Add yourself to the Docker user group, otherwise docker will complain that\n# it does not know if the Docker Daemon is running\nsudo usermod -aG docker ${USER}\n\n# Install docker-compose\nsudo curl -L \"https://github.com/docker/compose/releases/download/1.23.1/docker-compose-$(uname -s)-$(uname -m)\" -o /usr/local/bin/docker-compose\nsudo chmod +x /usr/local/bin/docker-compose\nsource ~/.bashrc\ndocker-compose --version\necho \"Done!\"\n```\n\nCredits to https://github.com/silvae86 for the Docker scripts.\n\n## References\nPlease cite the following works when using YAKE\n\n\u003cb\u003eIn-depth journal paper 
at Information Sciences Journal\u003c/b\u003e\n\nCampos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). YAKE! Keyword Extraction from Single Documents using Multiple Local Features. In Information Sciences Journal. Elsevier, Vol 509, pp 257-289. [pdf](https://doi.org/10.1016/j.ins.2019.09.013)\n\n\u003cb\u003eECIR'18 Best Short Paper\u003c/b\u003e\n\nCampos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). A Text Feature Based Automatic Keyword Extraction Method for Single Documents. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 684 - 691. [pdf](https://link.springer.com/chapter/10.1007/978-3-319-76941-7_63)\n\nCampos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). YAKE! Collection-independent Automatic Keyword Extractor. In: Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 806 - 810. [pdf](https://link.springer.com/chapter/10.1007/978-3-319-76941-7_80)\n\n## Awards\n[ECIR'18](http://ecir2018.org) Best Short Paper\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLIAAD%2Fyake","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FLIAAD%2Fyake","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FLIAAD%2Fyake/lists"}