{"id":13467973,"url":"https://github.com/DerwenAI/pytextrank","last_synced_at":"2025-03-26T03:31:18.246Z","repository":{"id":37396588,"uuid":"69814684","full_name":"DerwenAI/pytextrank","owner":"DerwenAI","description":"Python implementation of TextRank algorithms (\"textgraphs\") for phrase extraction","archived":false,"fork":false,"pushed_at":"2024-07-16T08:39:07.000Z","size":1689,"stargazers_count":2172,"open_issues_count":17,"forks_count":333,"subscribers_count":64,"default_branch":"main","last_synced_at":"2025-03-26T00:12:48.134Z","etag":null,"topics":["graph-algorithms","machine-learning","natural-language","natural-language-processing","nlp","python","spacy","spacy-extension","summarization","textgraphs","textrank"],"latest_commit_sha":null,"homepage":"https://derwen.ai/docs/ptr/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DerwenAI.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"code_of_conduct.md","threat_model":null,"audit":null,"citation":"CITATION","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"ceteri"}},"created_at":"2016-10-02T18:39:12.000Z","updated_at":"2025-03-21T23:51:42.000Z","dependencies_parsed_at":"2022-08-10T17:41:55.446Z","dependency_job_id":"b44d6852-572f-4c5f-8fb4-fb4ce26742fa","html_url":"https://github.com/DerwenAI/pytextrank","commit_stats":{"total_commits":311,"total_committers":20,"mean_commits":15.55,"dds":0.3022508038585209,"last_synced_commit":"7cc079b1856e59cc3e4b53268a01b5e8893ca1ae"},"previous_names":["ceteri/pytextrank"],"tags_count":22,"template":false,"template_full_name":null,"repository_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/DerwenAI%2Fpytextrank","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DerwenAI%2Fpytextrank/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DerwenAI%2Fpytextrank/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DerwenAI%2Fpytextrank/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DerwenAI","download_url":"https://codeload.github.com/DerwenAI/pytextrank/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245584748,"owners_count":20639621,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["graph-algorithms","machine-learning","natural-language","natural-language-processing","nlp","python","spacy","spacy-extension","summarization","textgraphs","textrank"],"created_at":"2024-07-31T15:01:03.484Z","updated_at":"2025-03-26T03:31:18.210Z","avatar_url":"https://github.com/DerwenAI.png","language":"Python","readme":"# PyTextRank\n\n[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4637885.svg)](https://doi.org/10.5281/zenodo.4637885)\n![Licence](https://img.shields.io/github/license/DerwenAI/pytextrank)\n![Repo size](https://img.shields.io/github/repo-size/DerwenAI/pytextrank)\n![GitHub commit activity](https://img.shields.io/github/commit-activity/w/DerwenAI/pytextrank?style=plastic)\n[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n[![security: 
bandit](https://img.shields.io/badge/security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)\n![CI](https://github.com/DerwenAI/pytextrank/workflows/CI/badge.svg)\n![downloads](https://img.shields.io/pypi/dm/pytextrank)\n![sponsor](https://img.shields.io/github/sponsors/ceteri)\n\n**PyTextRank** is a Python implementation of *TextRank* as a\n[spaCy pipeline extension](https://spacy.io/universe/project/spacy-pytextrank),\nfor graph-based natural language work -- and related knowledge graph practices.\nThis includes the family of \n[*textgraph*](https://derwen.ai/docs/ptr/glossary/#textgraphs) algorithms:\n\n  - *TextRank* by [[mihalcea04textrank]](https://derwen.ai/docs/ptr/biblio/#mihalcea04textrank)\n  - *PositionRank* by [[florescuc17]](https://derwen.ai/docs/ptr/biblio/#florescuc17)\n  - *Biased TextRank* by [[kazemi-etal-2020-biased]](https://derwen.ai/docs/ptr/biblio/#kazemi-etal-2020-biased)\n  - *TopicRank* by [[bougouin-etal-2013-topicrank]](https://derwen.ai/docs/ptr/biblio/#bougouin-etal-2013-topicrank)\n\nPopular use cases for this library include:\n\n  - *phrase extraction*: get the top-ranked phrases from a text document\n  - low-cost *extractive summarization* of a text document\n  - help infer concepts from unstructured text into more structured representation\n\nSee our full documentation at: \u003chttps://derwen.ai/docs/ptr/\u003e\n\n\n## Getting Started\n\nSee the [\"Getting Started\"](https://derwen.ai/docs/ptr/start/)\nsection of the online documentation.\n\nTo install from [PyPi](https://pypi.python.org/pypi/pytextrank):\n```\npython3 -m pip install pytextrank\npython3 -m spacy download en_core_web_sm\n```\n\nIf you work directly from this Git repo, be sure to install the\ndependencies as well:\n```\npython3 -m pip install -r requirements.txt\n```\n\nAlternatively, to install dependencies using `conda`:\n```\nconda env create -f environment.yml\nconda activate pytextrank\n```\n\nThen to use the library with a simple use 
case:\n```python\nimport spacy\nimport pytextrank\n\n# example text\ntext = \"Compatibility of systems of linear constraints over the set of natural numbers. Criteria of compatibility of a system of linear Diophantine equations, strict inequations, and nonstrict inequations are considered. Upper bounds for components of a minimal set of solutions and algorithms of construction of minimal generating sets of solutions for all types of systems are given. These criteria and the corresponding algorithms for constructing a minimal supporting set of solutions can be used in solving all the considered types systems and systems of mixed types.\"\n\n# load a spaCy model, depending on language, scale, etc.\nnlp = spacy.load(\"en_core_web_sm\")\n\n# add PyTextRank to the spaCy pipeline\nnlp.add_pipe(\"textrank\")\ndoc = nlp(text)\n\n# examine the top-ranked phrases in the document\nfor phrase in doc._.phrases:\n    print(phrase.text)\n    print(phrase.rank, phrase.count)\n    print(phrase.chunks)\n```\n\nSee the **tutorial notebooks** in the `examples` subdirectory for\nsample code and patterns to use in integrating **PyTextRank** with\nrelated libraries in Python:\n\u003chttps://derwen.ai/docs/ptr/tutorial/\u003e\n\n\n\u003cdetails\u003e\n  \u003csummary\u003eContributing Code\u003c/summary\u003e\n\nWe welcome people getting involved as contributors to this open source\nproject!\n\nFor detailed instructions please see:\n[CONTRIBUTING.md](https://github.com/DerwenAI/pytextrank/blob/main/CONTRIBUTING.md)\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eBuild Instructions\u003c/summary\u003e\n\n\u003cstrong\u003e\nNote: unless you are contributing code and updates,\nin most use cases you won't need to build this package locally.\n\u003c/strong\u003e\n\nInstead, simply install from\n[PyPi](https://pypi.python.org/pypi/pytextrank)\nor use [Conda](https://docs.conda.io/).\n\nTo set up the build environment locally, see the \n[\"Build 
Instructions\"](https://derwen.ai/docs/ptr/build/)\nsection of the online documentation.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eSemantic Versioning\u003c/summary\u003e\n\nGenerally speaking the major release number of \u003cstrong\u003ePyTextRank\u003c/strong\u003e \nwill track with the major release number of the associated \u003ccode\u003espaCy\u003c/code\u003e\nversion.\n\nSee:\n[CHANGELOG.md](https://github.com/DerwenAI/pytextrank/blob/main/CHANGELOG.md)\n\u003c/details\u003e\n\n\u003cimg\n alt=\"thanks noam!\"\n src=\"https://raw.githubusercontent.com/DerwenAI/pytextrank/main/docs/assets/noam.jpg\"\n width=\"231\"\n/\u003e\n\n\n## License and Copyright\n\nSource code for **PyTextRank** plus its logo, documentation, and examples\nhave an [MIT license](https://spdx.org/licenses/MIT.html) which is\nsuccinct and simplifies use in commercial applications.\n\nAll materials herein are Copyright \u0026copy; 2016-2024 Derwen, Inc.\n\n\n## Attribution\n\nPlease use the following BibTeX entry for citing **PyTextRank** if you \nuse it in your research or software:\n```bibtex\n@software{PyTextRank,\n  author = {Paco Nathan},\n  title = {{PyTextRank, a Python implementation of TextRank for phrase extraction and summarization of text documents}},\n  year = 2016,\n  publisher = {Derwen},\n  doi = {10.5281/zenodo.4637885},\n  url = {https://github.com/DerwenAI/pytextrank}\n}\n```\n\nCitations are helpful for the continued development and maintenance of\nthis library.\nFor example, see our citations listed on\n[Google Scholar](https://scholar.google.com/scholar?q=related:5tl6J4xZlCIJ:scholar.google.com/\u0026scioq=\u0026hl=en\u0026as_sdt=0,5).\n\n\n## Kudos\n\nMany thanks to our open source [sponsors](https://github.com/sponsors/ceteri);\nand to our 
contributors:\n[@ceteri](https://github.com/ceteri),\n[@louisguitton](https://github.com/louisguitton),\n[@Ankush-Chander](https://github.com/Ankush-Chander),\n[@tomaarsen](https://github.com/tomaarsen),\n[@CaptXiong](https://github.com/CaptXiong),\n[@Lord-V15](https://github.com/Lord-V15),\n[@anna-droid-beep](https://github.com/anna-droid-beep),\n[@dvsrepo](https://github.com/dvsrepo),\n[@clabornd](https://github.com/clabornd),\n[@dayalstrub-cma](https://github.com/dayalstrub-cma),\n[@kavorite](https://github.com/kavorite),\n[@0dB](https://github.com/0dB),\n[@htmartin](https://github.com/htmartin),\n[@williamsmj](https://github.com/williamsmj/),\n[@mattkohl](https://github.com/mattkohl),\n[@vanita5](https://github.com/vanita5),\n[@HarshGrandeur](https://github.com/HarshGrandeur),\n[@mnowotka](https://github.com/mnowotka),\n[@kjam](https://github.com/kjam),\n[@SaiThejeshwar](https://github.com/SaiThejeshwar),\n[@laxatives](https://github.com/laxatives),\n[@dimmu](https://github.com/dimmu), \n[@JasonZhangzy1757](https://github.com/JasonZhangzy1757), \n[@jake-aft](https://github.com/jake-aft),\n[@junchen1992](https://github.com/junchen1992),\n[@shyamcody](https://github.com/shyamcody),\n[@chikubee](https://github.com/chikubee);\nalso to [@mihalcea](https://github.com/mihalcea) who leads outstanding NLP research work,\nencouragement from the wonderful folks at Explosion who develop [spaCy](https://github.com/explosion/spaCy),\nplus general support from [Derwen, Inc.](https://derwen.ai/)\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=derwenai/pytextrank\u0026type=Date)](https://star-history.com/#derwenai/pytextrank\u0026Date)\n","funding_links":["https://github.com/sponsors/ceteri"],"categories":["Python","Natural Language Processing","文本数据和NLP"],"sub_categories":["General Purpose 
NLP"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDerwenAI%2Fpytextrank","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FDerwenAI%2Fpytextrank","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FDerwenAI%2Fpytextrank/lists"}