https://github.com/rocky/p5-lead-sentence

Find leading sentence in some text

Some Perl code I wrote when I worked at the Associated Press to find the leading sentence for some text (news stories).

It uses heuristics for sentence ends:

* A space, or a quote followed by a space, comes after the dot
* No space comes before the punctuation mark
* The word with the punctuation mark isn't capitalized (a capitalized one could be an abbreviation)
* The word after the space is capitalized (the next word starts a sentence)
* The word with the punctuation mark isn't a known abbreviation
* The following capitalized word is a known sentence-beginning word, e.g. "The"
* The sentence length is at least some minimum number of characters
* The sentence length is no more than some maximum number of characters
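
The checks listed above can be sketched as one feature test per candidate punctuation mark. This is a minimal illustration in Python, not the original Perl code: the function name `candidate_features`, the two word lists, and the `min_len`/`max_len` defaults are all assumptions made for the example, since the README does not give the actual lists or limits.

```python
import re

# Hypothetical word lists; the project's real lists are not shown in the README.
KNOWN_ABBREVIATIONS = {"mr.", "mrs.", "dr.", "gov.", "sen."}
SENTENCE_STARTERS = {"The", "A", "An", "He", "She", "It", "In"}


def candidate_features(text, min_len=20, max_len=300):
    """Return (end_index, features) for every '.', '!' or '?' in text.

    min_len and max_len are assumed limits, chosen only for illustration.
    """
    candidates = []
    for match in re.finditer(r"[.!?]", text):
        i = match.start()
        rest = text[i + 1:]
        # Include a closing quote directly after the mark in the sentence.
        end = i + 1 + (1 if re.match(r"[\"']", rest) else 0)
        # The whitespace-delimited word that carries the punctuation mark.
        word = text[:i + 1].split()[-1].lstrip("\"'")
        # The first word after the following whitespace, if there is one.
        following = re.match(r"[\"']?\s+[\"']?(\w+)", rest)
        next_word = following.group(1) if following else ""
        features = {
            "space_after": re.match(r"[\"']?(\s|$)", rest) is not None,
            "no_space_before": i > 0 and not text[i - 1].isspace(),
            "word_not_capitalized": not word[:1].isupper(),
            "not_known_abbreviation": word.lower() not in KNOWN_ABBREVIATIONS,
            "next_capitalized": next_word[:1].isupper(),
            "next_is_starter": next_word in SENTENCE_STARTERS,
            "long_enough": end >= min_len,
            "short_enough": end <= max_len,
        }
        candidates.append((end, features))
    return candidates


text = "Gov. Smith signed the bill on Tuesday. The measure takes effect in July."
for end, features in candidate_features(text):
    print(repr(text[:end]), sorted(k for k, ok in features.items() if ok))
```

In this example the dot after "Gov" fails the abbreviation and minimum-length checks, while the dot after "Tuesday" passes both and is followed by a listed sentence-beginning word.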

We rank the possible endings and pick the highest-ranked one.
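
One way to picture the ranking step is a weighted sum over the heuristics each candidate satisfies. The sketch below is in Python rather than the project's Perl. The weights, the function name `pick_lead_sentence`, and the tie-break toward the earlier ending are assumptions, because the README does not say how the ranking is computed.

```python
# Hypothetical weights; the README does not describe the actual scoring.
WEIGHTS = {
    "space_after": 1,
    "not_known_abbreviation": 2,
    "next_capitalized": 2,
    "long_enough": 1,
}


def pick_lead_sentence(text, candidates, weights=WEIGHTS):
    """Pick the best-scoring ending and return the text up to it.

    candidates is a list of (end_index, features) pairs, where features
    maps a heuristic name to True or False. Ties go to the earlier ending.
    """
    if not candidates:
        return text

    def score(features):
        return sum(weights.get(name, 0) for name, ok in features.items() if ok)

    best_end, _ = max(candidates, key=lambda c: (score(c[1]), -c[0]))
    return text[:best_end]


text = "Gov. Smith signed the bill on Tuesday. The measure takes effect in July."
candidates = [
    (4, {"space_after": True, "not_known_abbreviation": False,
         "next_capitalized": True, "long_enough": False}),
    (38, {"space_after": True, "not_known_abbreviation": True,
          "next_capitalized": True, "long_enough": True}),
    (72, {"space_after": True, "not_known_abbreviation": True,
          "next_capitalized": False, "long_enough": True}),
]
print(pick_lead_sentence(text, candidates))
```

With these assumed weights the three candidates score 3, 6 and 4, so the ending after "Tuesday." is chosen.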