https://github.com/lincerely/tfidf
Calculate TF-IDF cosine similarity of files
- Host: GitHub
- URL: https://github.com/lincerely/tfidf
- Owner: lincerely
- License: mit
- Created: 2024-01-31T06:27:50.000Z (over 1 year ago)
- Default Branch: termap
- Last Pushed: 2024-02-08T19:09:15.000Z (over 1 year ago)
- Last Synced: 2025-01-26T12:42:26.872Z (4 months ago)
- Topics: tf-idf
- Language: C
- Homepage:
- Size: 26.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme
- License: license
README
# TFIDF
Given a list of filenames, calculate the TF-IDF cosine similarity matrix for all entries.
## Reference
- https://www.sejuku.net/blog/26420
- https://atmarkit.itmedia.co.jp/ait/articles/2112/23/news028.html
## License
The stemming code is from https://tartarus.org/martin/PorterStemmer/, by Martin Porter.
> "The software is completely free for any purpose, unless notes at the head of the program text indicates otherwise (which is rare)..."
For the rest of the code, see the license attached.