
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction


# PyTextRank


**PyTextRank** is a Python implementation of *TextRank* as a
spaCy pipeline extension,
for graph-based natural language work -- and related knowledge graph practices.
This includes the family of
*textgraph* algorithms:

- *TextRank* by [mihalcea04textrank]
- *PositionRank* by [florescuc17]
- *Biased TextRank* by [kazemi-etal-2020-biased]
- *TopicRank* by [bougouin-etal-2013-topicrank]
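
As a rough illustration of the idea these algorithms share, the sketch below links words that occur near each other into a graph and ranks the nodes with a PageRank-style iteration. It is a simplified, standard-library-only sketch and not PyTextRank's implementation: the helper name `rank_words`, the window size, the damping factor, and the whitespace tokenization are all assumptions made for illustration (the library itself works over spaCy tokens and noun chunks).

```python
from collections import defaultdict


def rank_words(tokens, window=2, damping=0.85, iterations=50):
    """Rank tokens with a PageRank-style iteration over a co-occurrence graph.

    Hypothetical helper for illustration only; not part of PyTextRank.
    """
    # Build an undirected graph: link tokens that co-occur within `window` positions.
    neighbors = defaultdict(set)
    for i, tok in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != tok:
                neighbors[tok].add(other)
                neighbors[other].add(tok)

    nodes = list(neighbors)
    if not nodes:
        return {}

    # Start from a uniform score, then repeatedly let each node pass its
    # score evenly to its neighbors, mixed with a small uniform term.
    ranks = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        ranks = {
            node: (1.0 - damping) / len(nodes)
            + damping * sum(ranks[nb] / len(neighbors[nb]) for nb in neighbors[node])
            for node in nodes
        }
    return ranks


# Toy input, already reduced to content words (an assumption for brevity).
tokens = "linear constraints systems linear equations systems solutions".split()
ranks = rank_words(tokens)
top = sorted(ranks, key=ranks.get, reverse=True)
```

Words that co-occur with many other words accumulate a higher score, which is why well-connected terms rise to the top of the ranking.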

Popular use cases for this library include:

- *phrase extraction*: get the top-ranked phrases from a text document
- low-cost *extractive summarization* of a text document
- help infer concepts from unstructured text and map them into a more structured representation
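
To make the extractive summarization use case more concrete, here is a minimal sketch that scores each sentence by the ranked phrases it mentions and keeps the best ones in document order. This is a simplified stand-in, not the method the library uses: the helper name `summarize`, the substring matching, and the rank values are all hypothetical and chosen only for illustration.

```python
def summarize(sentences, phrase_ranks, limit=2):
    """Pick the `limit` highest-scoring sentences, kept in document order.

    Hypothetical helper for illustration; `phrase_ranks` maps phrase text
    to a rank score such as one produced by a TextRank-style algorithm.
    """
    scored = []
    for index, sentence in enumerate(sentences):
        lowered = sentence.lower()
        # A sentence scores the sum of the ranks of the phrases it mentions.
        score = sum(
            rank for phrase, rank in phrase_ranks.items() if phrase.lower() in lowered
        )
        scored.append((score, index))

    # Take the best-scoring sentences, then restore their original order.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:limit]
    return [sentences[index] for _, index in sorted(best, key=lambda pair: pair[1])]


sentences = [
    "Criteria of compatibility of a system of linear Diophantine equations are considered.",
    "The weather was pleasant.",
    "Upper bounds for components of a minimal set of solutions are given.",
]
# Illustrative rank values only; real values would come from a ranking step.
phrase_ranks = {"linear Diophantine equations": 0.18, "minimal set": 0.12, "solutions": 0.09}
summary = summarize(sentences, phrase_ranks, limit=2)
```

Because the selection reuses phrase ranks that were already computed, the summary costs little beyond the ranking itself, which is the sense in which it is low-cost.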

See our full documentation at:

## Getting Started

See the "Getting Started"
section of the online documentation.

To install from PyPi:

```bash
python3 -m pip install pytextrank
python3 -m spacy download en_core_web_sm
```

If you work directly from this Git repo, be sure to install the
dependencies as well:

```bash
python3 -m pip install -r requirements.txt
```

Alternatively, to install dependencies using `conda`:

```bash
conda env create -f environment.yml
conda activate pytextrank
```

Then to use the library with a simple use case:

```python
import spacy
import pytextrank  # importing registers the "textrank" pipeline component

# example text
text = "Compatibility of systems of linear constraints over the set of natural numbers. Criteria of compatibility of a system of linear Diophantine equations, strict inequations, and nonstrict inequations are considered. Upper bounds for components of a minimal set of solutions and algorithms of construction of minimal generating sets of solutions for all types of systems are given. These criteria and the corresponding algorithms for constructing a minimal supporting set of solutions can be used in solving all the considered types systems and systems of mixed types."

# load a spaCy model, depending on language, scale, etc.
nlp = spacy.load("en_core_web_sm")

# add PyTextRank to the spaCy pipeline
nlp.add_pipe("textrank")
doc = nlp(text)

# examine the top-ranked phrases in the document
for phrase in doc._.phrases:
    print(phrase.text)
    print(phrase.rank, phrase.count)
```

See the **tutorial notebooks** in the `examples` subdirectory for
sample code and patterns to use in integrating **PyTextRank** with
related libraries in Python:

## Contributing Code

We welcome people getting involved as contributors to this open source
project.

For detailed instructions please see:

## Build Instructions

Note: unless you are contributing code and updates,
in most use cases you won't need to build this package locally.

Instead, simply install from PyPi
or use Conda.

To set up the build environment locally, see the
"Build Instructions"
section of the online documentation.

## Semantic Versioning

Generally speaking, the major release number of PyTextRank
will track with the major release number of the associated spaCy
version.
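
As a small sketch of what this convention implies, the helper below compares the major release numbers of two dotted version strings. The name `same_major` and the sample version strings are hypothetical, chosen only for illustration; in practice the strings could be read with the standard library's `importlib.metadata.version("pytextrank")` and `importlib.metadata.version("spacy")`.

```python
def same_major(version_a, version_b):
    """Return True when two dotted version strings share a major release number.

    Hypothetical helper for illustration; not part of PyTextRank or spaCy.
    """
    return version_a.split(".")[0] == version_b.split(".")[0]


# Example version strings, chosen only for illustration.
compatible = same_major("3.3.0", "3.7.2")
mismatched = same_major("3.3.0", "2.3.5")
```

Since the convention holds only "generally speaking", a mismatch is better treated as a prompt to check the release notes than as a hard error.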



## License and Copyright

Source code for **PyTextRank** plus its logo, documentation, and examples
have an MIT license, which is
succinct and simplifies use in commercial applications.

All materials herein are Copyright © 2016-2024 Derwen, Inc.

## Attribution

Please use the following BibTeX entry for citing **PyTextRank** if you
use it in your research or software:

```bibtex
@software{PyTextRank,
  author = {Paco Nathan},
  title = {{PyTextRank, a Python implementation of TextRank for phrase extraction and summarization of text documents}},
  year = 2016,
  publisher = {Derwen},
  doi = {10.5281/zenodo.4637885},
  url = {}
}
```

Citations are helpful for the continued development and maintenance of
this library.
For example, see our citations listed on
Google Scholar.

## Kudos

Many thanks to our open source sponsors
and to our contributors;
also to @mihalcea, who leads outstanding NLP research work,
for encouragement from the wonderful folks at Explosion who develop spaCy,
plus general support from Derwen, Inc.
