https://github.com/lincerely/tfidf

Calculate TF-IDF cosine similarity of files
# TFIDF

Given a list of filenames, calculate the TF-IDF cosine similarity matrix for every pair of files.
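To illustrate the idea, here is a minimal Python sketch of a TF-IDF cosine similarity matrix. It is not this project's implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `similarity_matrix`, `similarity_matrix_from_files`) are hypothetical, the tokenizer is a plain lowercase split with no stemming, and the weighting (`tf = count / length`, `idf = log(N / df) + 1`) is one common variant chosen as an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters/digits (assumption: no stemming).
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1
        # Assumed weighting: tf = count / length, idf = log(n / df) + 1.
        vectors.append(
            {t: (k / total) * (math.log(n / df[t]) + 1.0) for t, k in c.items()}
        )
    return vectors


def cosine(a, b):
    # Cosine similarity of two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similarity_matrix(docs):
    # Entry [i][j] is the similarity between document i and document j.
    vecs = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vecs] for a in vecs]


def similarity_matrix_from_files(filenames):
    # Read each file as UTF-8 text (an assumption) and compare contents.
    texts = []
    for name in filenames:
        with open(name, encoding="utf-8") as f:
            texts.append(f.read())
    return similarity_matrix(texts)
```

The resulting matrix is symmetric, identical documents score close to 1, and documents with no terms in common score 0.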

## Reference

- https://www.sejuku.net/blog/26420
- https://atmarkit.itmedia.co.jp/ait/articles/2112/23/news028.html

## License

The stemming code is from https://tartarus.org/martin/PorterStemmer/, by Martin Porter.

> "The software is completely free for any purpose, unless notes at the head of the program text indicates otherwise (which is rare)..."

For the rest of the code, see the license attached.