# AnsNorm_Similarity
Determining the distance (dissimilarity) between two vectors: a showcase of problems in the classic approaches and a solution, the AnsNorm distance, which combines the advantages of the Euclidean and cosine distances.

Consider computing the distance (dissimilarity) between two given vectors whose components are normalized to the range [-1, 1], as in user-preference applications. A distance of 0 between the two must mean that they are completely the same; larger distances mean less similarity.

Classical methods such as Euclidean distance and cosine similarity have their limitations and downsides, and using the raw dot product as a similarity measure is even farther from ideal. Euclidean distance pays no attention to the direction of the vectors, cosine distance ignores their magnitude, and the dot product ignores common sense altogether.
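To see these failure modes concretely, here is a minimal sketch using NumPy and SciPy; the preference vectors are made-up examples, not data from this repository:

```python
import numpy as np
from scipy.spatial.distance import cosine, euclidean

a = np.array([0.1, 0.1])    # mild preferences
b = np.array([1.0, 1.0])    # strong preferences, same direction as a
c = np.array([-0.1, -0.1])  # mild preferences, opposite direction to a

# Cosine distance ignores magnitude: mild and strong same-direction
# tastes look identical.
print(cosine(a, b))                       # 0.0

# Euclidean distance ignores direction: the opposite-taste user c
# ends up "closer" to a than the like-minded user b.
print(euclidean(a, b), euclidean(a, c))   # ~1.27 vs ~0.28

# The dot product is unbounded and scale-dependent: comparing an
# entity with itself gives different "similarities" at different scales.
print(np.dot(a, a), np.dot(b, b))         # 0.02 vs 2.0
```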

One possible solution to the problem of evaluating how similar two vectorised personalities are is the AnsNorm similarity method given below. Here you can find several cases where the classics fall short and test the ansnorm() function on the same examples.

FYI:

- cosine distance returns values from 0 to 2
- Euclidean distance returns values from 0 to +inf
- the dot product can be any number from -inf to +inf
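As a quick sanity check of the cosine range (SciPy's cosine distance computes 1 - cos(theta), hence the 0-to-2 span):

```python
import numpy as np
from scipy.spatial.distance import cosine

u = np.array([1.0, 0.0])
print(cosine(u, u))    # 0.0 -- identical direction
print(cosine(u, -u))   # 2.0 -- opposite direction
```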


As for ansnorm, for input vectors normalised from -1 to 1 it outputs distance values from 0 to 1: 0 for identical entities and 1 for the most dissimilar ones.
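
For intuition only, here is a hedged sketch of how a distance with those stated properties could be assembled. This is not necessarily the repository's actual ansnorm() formula, just one plausible blend of a scale-aware (Euclidean) term and a direction-aware (cosine) term that maps [-1, 1]-normalised inputs to [0, 1]:

```python
import numpy as np

def ansnorm_like(u, v):
    # Hypothetical combined distance in [0, 1] for vectors in [-1, 1]^n.
    # Illustrative only -- see the repository for the real ansnorm().
    u, v = np.asarray(u, float), np.asarray(v, float)
    n = len(u)
    # Euclidean term, scaled by its maximum value 2*sqrt(n) on [-1, 1]^n.
    eu = np.linalg.norm(u - v) / (2.0 * np.sqrt(n))
    # Cosine term, scaled from [0, 2] down to [0, 1]. Zero vectors have
    # no direction, so fall back to the Euclidean term alone.
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu == 0.0 or nv == 0.0:
        return eu
    co = (1.0 - np.dot(u, v) / (nu * nv)) / 2.0
    return 0.5 * (eu + co)

print(ansnorm_like([1, 1], [1, 1]))      # 0.0   -- identical
print(ansnorm_like([1, 1], [-1, -1]))    # 1.0   -- most dissimilar
print(ansnorm_like([0.1, 0.1], [1, 1]))  # 0.225 -- same direction, weaker tastes
```

The 50/50 weighting here is arbitrary; the point is only that a blend of two bounded, rescaled terms inherits the 0-for-identical, 1-for-opposite behaviour described above.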