
# Prosegrinder

[![Latest PyPI version](https://img.shields.io/pypi/v/prosegrinder.svg)](https://pypi.python.org/pypi/prosegrinder)
[![Python Poetry CI](https://github.com/prosegrinder/python-prosegrinder/actions/workflows/python-ci.yml/badge.svg)](https://github.com/prosegrinder/python-prosegrinder/actions/workflows/python-ci.yml)

A relatively fast, functional prose text counter with readability scoring.

## Installation

`prosegrinder` is available on PyPI. Simply install it with `pip`:

```bash
pip install prosegrinder
```

## Usage

The main use is via the `prosegrinder.Prose` object.

```python
>>> from prosegrinder import Prose
>>> p = Prose("Some lengthy text that's actual prose, like a novel or article.")
```

The `Prose` object parses the text and computes basic statistics,
including word count, sentence count, paragraph count, syllable count, point of
view, dialogue, narrative, and a set of readability scores. All objects and
attributes should be treated as immutable.

I know this isn't great documentation, but it should be enough to get you going.

### Command Line Interface

Prosegrinder now includes a simple CLI for analyzing text in one or more files:

```bash
$ prosegrinder --help
Usage: prosegrinder [OPTIONS] FILES...

  Setup the command line interface

Options:
  -i, --indent INTEGER  Python pretty-print json indent level.
  -s, --save FILENAME   File to save output to.
  --help                Show this message and exit.
```

The CLI reports basic statistics for the text in each file or set of files,
along with the filename and the SHA-256 hash of the text analyzed. Output is
JSON to facilitate use in automation:

```json
[
  {
    "filename": "shortstory.txt",
    "statistics": {
      "sha256": "5b756dea7c7f0088ff3692e402466af7f4fc493fa357c1ae959fa4493943fc03",
      "word_character_count": 7008,
      "phone_count": 5747,
      "syllable_count": 2287,
      "word_count": 1528,
      "sentence_count": 90,
      "paragraph_count": 77,
      "complex_word_count": 202,
      "long_word_count": 275,
      "pov_word_count": 113,
      "first_person_word_count": 8,
      "second_person_word_count": 74,
      "third_person_word_count": 31,
      "pov": "first",
      "readability_scores": {
        "automated_readability_index": 0.281,
        "coleman_liau_index": 9.425,
        "flesch_kincaid_grade_level": 8.693,
        "flesch_reading_ease": 62.979,
        "gunning_fog_index": 12.079,
        "linsear_write": 10.733,
        "lix": 34.975,
        "rix": 3.056,
        "smog": 11.688
      }
    }
  },
  {
    "filename": "copyright.txt",
    "statistics": {
      "sha256": "553bfd087a2736e4bbe2f312e3d3a5b763fb57caa54e3626da03b0fd3f42e017",
      "word_character_count": 222,
      "phone_count": 169,
      "syllable_count": 78,
      "word_count": 46,
      "sentence_count": 7,
      "paragraph_count": 16,
      "complex_word_count": 10,
      "long_word_count": 12,
      "pov_word_count": 1,
      "first_person_word_count": 1,
      "second_person_word_count": 0,
      "third_person_word_count": 0,
      "pov": "first",
      "readability_scores": {
        "automated_readability_index": 1.404,
        "coleman_liau_index": 8.073,
        "flesch_kincaid_grade_level": 6.982,
        "flesch_reading_ease": 56.713,
        "gunning_fog_index": 11.324,
        "linsear_write": 3.714,
        "lix": 32.658,
        "rix": 1.714,
        "smog": 9.957
      }
    }
  }
]
```
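
Since the report is plain JSON, it can be consumed from a script. The following is a minimal sketch, not part of prosegrinder itself: `run_prosegrinder` and `files_above_grade` are hypothetical helper names, and the sketch assumes the `prosegrinder` command is on `PATH` and prints the report to standard output when `--save` is not given. Only the `--indent` option and the JSON keys shown above are taken from this README.

```python
import json
import subprocess


def run_prosegrinder(files, indent=2):
    """Run the CLI on the given files and return the parsed JSON report.

    Assumption: `prosegrinder` is on PATH and writes the report to stdout
    when --save is not used.
    """
    cmd = ["prosegrinder", "--indent", str(indent), *files]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def files_above_grade(report, max_grade):
    """Return (filename, grade) pairs whose Flesch-Kincaid grade exceeds max_grade."""
    flagged = []
    for entry in report:
        scores = entry["statistics"]["readability_scores"]
        grade = scores["flesch_kincaid_grade_level"]
        if grade > max_grade:
            flagged.append((entry["filename"], grade))
    return flagged


# A trimmed copy of the sample report, so the sketch runs without the CLI.
SAMPLE_REPORT = json.loads("""
[
  {"filename": "shortstory.txt",
   "statistics": {"word_count": 1528,
                  "readability_scores": {"flesch_kincaid_grade_level": 8.693}}},
  {"filename": "copyright.txt",
   "statistics": {"word_count": 46,
                  "readability_scores": {"flesch_kincaid_grade_level": 6.982}}}
]
""")

print(files_above_grade(SAMPLE_REPORT, max_grade=8.0))
# [('shortstory.txt', 8.693)]
```

With the CLI installed, `files_above_grade(run_prosegrinder(["shortstory.txt"]), 8.0)` would apply the same check to a live report.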

### Readability scores

The following scores are calculated automatically:

- Automated Readability Index
- Coleman-Liau Index
- Flesch-Kincaid Grade Level
- Flesch Reading Ease
- Gunning Fog Index
- Linsear Write
- LIX
- RIX
- SMOG
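
To show how a few of these scores relate to the raw counts, here is a small sketch using the commonly published formulas for Flesch Reading Ease, Flesch-Kincaid Grade Level, LIX, and RIX. The function names are illustrative and this is not necessarily how prosegrinder implements them, although the formulas do reproduce the `shortstory.txt` values in the sample CLI output.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Higher is easier; standard constants from the published formula."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade_level(words, sentences, syllables):
    """Approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def lix(words, sentences, long_words):
    """Average sentence length plus the percentage of long words."""
    return words / sentences + 100 * long_words / words


def rix(sentences, long_words):
    """Long words per sentence."""
    return long_words / sentences


# Counts taken from the shortstory.txt entry in the sample CLI output.
words, sentences, syllables, long_words = 1528, 90, 2287, 275

print(round(flesch_reading_ease(words, sentences, syllables), 3))         # 62.979
print(round(flesch_kincaid_grade_level(words, sentences, syllables), 3))  # 8.693
print(round(lix(words, sentences, long_words), 3))                        # 34.975
print(round(rix(sentences, long_words), 3))                               # 3.056
```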