https://github.com/harshpatel44/cosine-text-similarity-algorithm


# Cosine-Text-Similarity-Algorithm
This repository contains a cosine text similarity algorithm for comparing two documents.


To compare two files, 'file1' and 'file2', the algorithm also needs a third file, 'term_list', which contains all the tokens.

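This algorithm is based on the formula: (a · b) / (||a|| × ||b||), where a · b is the dot product of the two vectors and ||a|| is the Euclidean norm (length) of a.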

Here a and b are the term frequency vectors of 'file1' and 'file2' respectively, with one entry per token in 'term_list'.
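
The formula can be illustrated with a minimal Python sketch. This is not necessarily how the code in this repository is written: the function names `term_frequencies` and `cosine_similarity` are hypothetical, the three inputs are short in-memory strings standing in for 'file1', 'file2' and 'term_list', and tokens are assumed to be separated by whitespace and compared case-insensitively.

```python
import math
from collections import Counter


def term_frequencies(text, terms):
    """Count how often each term from `terms` occurs in `text`.

    Assumption: tokens are whitespace-separated and case-insensitive.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in terms]


def cosine_similarity(a, b):
    """Return (a . b) / (||a|| * ||b||), or 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Cosine similarity is undefined for a zero vector; 0.0 is a chosen convention here.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for the contents of 'term_list', 'file1' and 'file2'.
term_list = ["cat", "dog", "fish"]
file1 = "cat dog cat"
file2 = "dog fish dog"

a = term_frequencies(file1, term_list)  # frequency vector of 'file1'
b = term_frequencies(file2, term_list)  # frequency vector of 'file2'
similarity = cosine_similarity(a, b)
print(round(similarity, 3))
```

With these inputs, a is [2, 1, 0] and b is [0, 2, 1], so a · b = 2 and both norms are √5, giving a similarity of 2 / 5 = 0.4. A result of 1 means the two frequency vectors point in the same direction, and 0 means the files share no terms from 'term_list'.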