https://github.com/roma-glushko/sift


# Sift NLP

Sift is an NLP library for normalizing real-world text using a statistical language model.

Sift focuses on two main cases:

- joined words
- words written with typos
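
As a rough illustration of both cases, here is a minimal Python sketch. It is not Sift's actual API: the names `COUNTS`, `log_prob`, `segment`, `edits1`, and `correct` are hypothetical, and the tiny word-count table stands in for a real statistical language model estimated from a large corpus. Joined words are split with a Viterbi-style dynamic program over unigram probabilities, and typos are corrected by picking the most frequent known word within one edit, in the spirit of Norvig's spelling corrector.

```python
import math

# Hypothetical toy unigram counts; a real model would come from a large corpus.
COUNTS = {"the": 500, "quick": 40, "brown": 30, "fox": 20, "a": 300, "in": 250}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = max(map(len, COUNTS))
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def log_prob(word):
    """Log-probability of a word; unknown words get a length-based penalty."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split joined words with a Viterbi-style dynamic program."""
    # best[i] = (score, start): best score for text[:i] and the index
    # at which the last word of that segmentation begins.
    best = [(0.0, 0)]
    for end in range(1, len(text) + 1):
        candidates = (
            (best[start][0] + log_prob(text[start:end]), start)
            for start in range(max(0, end - MAX_WORD_LEN), end)
        )
        best.append(max(candidates))
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    end = len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def correct(word):
    """Return the most frequent known word within one edit, else the input."""
    if word in COUNTS:
        return word
    known = [w for w in edits1(word) if w in COUNTS]
    return max(known, key=COUNTS.get) if known else word
```

With this toy table, `segment("thequickbrownfox")` yields `["the", "quick", "brown", "fox"]` and `correct("quikc")` yields `"quick"`. Words outside the table are penalized by length rather than rejected, so unknown input still gets some segmentation.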

## Resources

- https://en.wikipedia.org/wiki/Viterbi_algorithm
- http://norvig.com/spell-correct.html
- https://stackoverflow.com/questions/195010/how-can-i-split-multiple-joined-words
- https://catalog.ldc.upenn.edu/LDC2006T13