An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with sentence-tokenizer

A curated list of projects in awesome lists tagged with sentence-tokenizer.

https://github.com/nipunsadvilkar/pysbd

🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection tool that works out of the box.

python rule-based segmentation sentence sentence-boundary-detection sentence-tokenizer

Last synced: 14 May 2025

https://github.com/neurosnap/sentences

A multilingual command-line sentence tokenizer in Golang

cli sentence-tokenizer sentences tokenizer

Last synced: 16 May 2025

https://github.com/megagonlabs/bunkai

Sentence boundary disambiguation tool for Japanese texts (日本語文境界判定器)

japanese python sentence-boundary-detection sentence-tokenizer

Last synced: 05 Apr 2025

https://github.com/cbilgili/zemberek-nlp-server

A REST server in Docker built on the Zemberek Turkish NLP Java library

docker javascript nlp part-of-speech-tagger rest sentence-tokenizer spark turkish turkish-language zemberek

Last synced: 03 May 2025

https://github.com/Flight-School/sentences

A command-line utility that splits natural language text into sentences.

cli macos nlp sentence-tokenizer swift

Last synced: 23 Nov 2024

https://github.com/ikegami-yukino/sengiri

Yet another sentence-level tokenizer for Japanese text

japanese-language japanese-sentences sentence-tokenizer tokenizer

Last synced: 21 Mar 2025

https://github.com/kmint21/html2sent

HTML2SENT modifies HTML to improve sentence tokenizer quality

nlp nltk python sentence-segmentation sentence-tokenizer text-mining tokenizer

Last synced: 12 May 2025

https://github.com/elifftosunn/textdataclean

An application that takes scraped dirty data and makes it ready for model training without requiring separate preprocessing steps.

corpus deasciifier morphological-analysis ngram nltk numpy pandas sentence-embedding sentence-tokenizer stemmer stopwords string turkish turkish-sentence-tokenizer word-tokenizer

Last synced: 15 Mar 2025

https://github.com/aburraq/stanfordcorenlp

My legal background gave me a deep appreciation for the importance of language. It's not just words; it's a profound understanding woven into every case. This connection led me to coding, where I built a powerful pipeline system with Stanford CoreNLP.

java lemmatizer named-entity-recognition nlp oop partofspeech-tagger sentence-tokenizer sentiment-analysis stanfordnlp tokenizer

Last synced: 27 Feb 2025