https://github.com/diplodoc-platform/sentenizer
sentenizer: rule-based NLP library for sentence segmentation with Russian language support
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/diplodoc-platform/sentenizer
- Owner: diplodoc-platform
- License: mit
- Created: 2023-04-25T10:07:20.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-09-24T23:24:45.000Z (3 months ago)
- Last Synced: 2024-10-31T17:32:49.621Z (2 months ago)
- Language: TypeScript
- Size: 487 KB
- Stars: 0
- Watchers: 13
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.MD
- Contributing: CONTRIBUTING.md
- License: LICENSE.txt
- Authors: AUTHORS
README
# sentenizer
sentenizer: rule-based NLP library for sentence segmentation with **Russian language** support

## api
### sentenize
takes text of type `string` and returns segmented sentences as `string[]`

#### type
```
sentenize :: string -> string[]
```

## usage
```
const {sentenize} = require('sentenizer');

const text = 'Он купил фрукты - яблоки, бананы, и т. д. все были очень рады угощению. Вот такой он добродушный наш родственник И. В. Иванов.';
const sentences = sentenize(text);
// sentences:
// [
// 'Он купил фрукты - яблоки, бананы, и т. д. все были очень рады угощению.',
// 'Вот такой он добродушный наш родственник И. В. Иванов.'
// ]
```
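
The usage example above shows why naive splitting on periods fails for Russian text: abbreviations such as `т. д.` and initials such as `И. В.` contain periods that do not end a sentence. The following is a minimal, self-contained sketch of that idea. It is not sentenizer's actual implementation or rule set. The function name `naiveSentenize` and the `ABBREVIATIONS` list are hypothetical and exist only for illustration.

```javascript
// Illustrative sketch only: NOT the rules sentenizer actually uses.
// Hypothetical, deliberately tiny abbreviation list (assumption).
const ABBREVIATIONS = new Set(['т', 'д', 'п', 'г', 'ул']);

function naiveSentenize(text) {
  const sentences = [];
  let start = 0;
  // Candidate boundary: terminal punctuation followed by whitespace.
  const boundary = /[.!?]+\s+/gu;
  let match;

  while ((match = boundary.exec(text)) !== null) {
    const end = match.index + match[0].length;

    // Last word before the punctuation mark.
    const wordMatch = text.slice(start, match.index).match(/\p{L}+$/u);
    const word = wordMatch ? wordMatch[0] : '';

    // Rule 1: a known abbreviation does not end a sentence.
    if (ABBREVIATIONS.has(word.toLowerCase())) continue;
    // Rule 2: a single uppercase letter is treated as an initial.
    if (word.length === 1 && /\p{Lu}/u.test(word)) continue;
    // Rule 3: the next sentence must start with an uppercase letter.
    if (!/^\p{Lu}/u.test(text.slice(end))) continue;

    sentences.push(text.slice(start, end).trim());
    start = end;
  }

  // Whatever remains after the last accepted boundary is the final sentence.
  const rest = text.slice(start).trim();
  if (rest) sentences.push(rest);
  return sentences;
}

const sample = 'Он купил фрукты - яблоки, бананы, и т. д. все были очень рады угощению. Вот такой он добродушный наш родственник И. В. Иванов.';
console.log(naiveSentenize(sample));
```

On the sample text this sketch yields the same two sentences as the usage example, but it is far cruder than a real rule set. For instance, it never splits after a listed abbreviation, even when `т. д.` genuinely ends a sentence.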