https://github.com/UniversalAvenue/simhash-ex

Elixir implementation of Simhash

# Simhash
An Elixir implementation of [Moses Charikar's](http://www.cs.princeton.edu/courses/archive/spring04/cos598B/bib/CharikarEstim.pdf) Simhash.

## Examples

```elixir
iex> Simhash.similarity("Universal Avenue", "Universe Avenue")
0.71875
iex> Simhash.similarity("hocus pocus", "pocus hocus")
0.8125
iex> Simhash.similarity("Sankt Eriksgatan 1", "S:t Eriksgatan 1")
0.8125
iex> Simhash.similarity("Purple flowers", "Green grass")
0.5625
```
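Every score above is a multiple of 1/64, which is what you would expect if similarity is derived from the Hamming distance between two 64-bit fingerprints. That is an inference from the numbers, not something this README states. The following is a minimal sketch of the general Simhash idea under stated assumptions: `SimhashSketch` is a hypothetical module, it uses a 32-bit fingerprint built with `:erlang.phash2/2` rather than whatever hash the package uses, and it will not reproduce the scores shown above.

```elixir
defmodule SimhashSketch do
  # Hypothetical illustration of the Simhash idea; not the package's code.
  import Bitwise

  # Assumption: 32 bits, because :erlang.phash2/2 yields at most 32 bits.
  @bits 32

  # Builds a fingerprint from a list of features (for example, N-grams).
  def fingerprint(features) do
    # For each bit position, add 1 when a feature's hash has that bit set
    # and subtract 1 when it does not.
    tallies =
      Enum.reduce(features, List.duplicate(0, @bits), fn feature, acc ->
        h = :erlang.phash2(feature, 1 <<< @bits)

        acc
        |> Enum.with_index()
        |> Enum.map(fn {tally, i} ->
          if ((h >>> i) &&& 1) == 1, do: tally + 1, else: tally - 1
        end)
      end)

    # Bit i of the fingerprint is 1 when its tally is positive.
    tallies
    |> Enum.with_index()
    |> Enum.reduce(0, fn {tally, i}, fp ->
      if tally > 0, do: fp ||| (1 <<< i), else: fp
    end)
  end

  # Fraction of bit positions on which two fingerprints agree.
  def similarity(fp_a, fp_b) do
    differing =
      bxor(fp_a, fp_b)
      |> Integer.digits(2)
      |> Enum.count(&(&1 == 1))

    1.0 - differing / @bits
  end
end

fp = SimhashSketch.fingerprint(["hoc", "ocu", "cus"])
reordered = SimhashSketch.fingerprint(["cus", "hoc", "ocu"])
```

Because the tallies are sums over features, the fingerprint does not depend on feature order, so `fp` and `reordered` are equal and their similarity is `1.0`.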

By default, trigrams (N-grams of size 3) are used as language features, but you can pass a different N-gram size as the third argument:

```elixir
iex> Simhash.similarity("hocus pocus", "pocus hocus", 1)
1.0
iex> Simhash.similarity("Sankt Eriksgatan 1", "S:t Eriksgatan 1", 6)
0.859375
iex> Simhash.similarity("Purple flowers", "Green grass", 6)
0.546875
```
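To see why the N-gram size changes the score, it can help to look at the features themselves. The sketch below assumes the features are overlapping character N-grams; `NGramSketch.ngrams/2` is a hypothetical helper written for illustration and is not part of the `simhash` package.

```elixir
defmodule NGramSketch do
  # Hypothetical helper, not part of the simhash package.
  # Returns every run of `n` consecutive characters (graphemes) in `text`.
  def ngrams(text, n \\ 3) do
    text
    |> String.graphemes()
    |> Enum.chunk_every(n, 1, :discard)
    |> Enum.map(&Enum.join/1)
  end
end

a = "hocus pocus"
b = "pocus hocus"

# With n = 1 both strings yield the same characters, only in a different order.
same_unigrams =
  Enum.sort(NGramSketch.ngrams(a, 1)) == Enum.sort(NGramSketch.ngrams(b, 1))

# With n = 3 some trigrams occur in only one of the two strings.
only_in_a = NGramSketch.ngrams(a, 3) -- NGramSketch.ngrams(b, 3)
```

Under that assumption, `same_unigrams` is `true`, which is consistent with the score of `1.0` for N-gram size 1, while `only_in_a` holds the trigrams that span the word boundary (`"s p"` and `" po"`), which is consistent with the default score falling below 1.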

## Installation

Install the package by adding `simhash` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:simhash, "~> 0.1.2"}
  ]
end
```