https://github.com/smashedtoatoms/the_fuzz
String metrics and phonetic algorithms for Elixir (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein)
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/smashedtoatoms/the_fuzz
- Owner: smashedtoatoms
- License: apache-2.0
- Created: 2014-10-10T19:55:08.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2019-01-09T17:14:55.000Z (about 6 years ago)
- Last Synced: 2024-02-23T00:24:23.554Z (11 months ago)
- Language: Elixir
- Size: 1.43 MB
- Stars: 76
- Watchers: 7
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
- freaking_awesome_elixir - Elixir - Fuzzy string-matching algorithm implementations. (Algorithms and Data structures)
- fucking-awesome-elixir - the_fuzz - Fuzzy string-matching algorithm implementations. (Algorithms and Data structures)
- awesome-elixir - the_fuzz - Fuzzy string-matching algorithm implementations. (Algorithms and Data structures)
README
TheFuzz
=======

**Fuzzy string matching algorithm implementations**
TheFuzz is a collection of string metric and phonetic algorithms for fuzzy string matching in Elixir. It is based entirely on the [rockymadden stringmetric library](https://github.com/rockymadden/stringmetric), written by Rocky Madden for Scala. The goal is to eventually provide Elixir implementations of all of the string metric and phonetic algorithms in that library. The library provides facilities for approximate string matching, measuring string similarity and distance, indexing by word pronunciation, and sounds-like comparisons. The best way to see usage is to check out the tests and the [documentation](http://smashedtoatoms.github.io/the_fuzz/api-reference.html).
The following algorithms are currently implemented.
1. DiceSorensenMetric
1. HammingDistance
1. JaccardSimilarityMetric
1. JaroMetric
1. JaroWinklerMetric
1. LevenshteinDistance
1. NGramSimilarityMetric
1. OverlapMetric
1. TanimotoCoefficientMetric
1. TverskyIndexMetric
1. WeightedLevenshteinDistance
1. MetaphoneAlgorithm

I implemented these first because I needed them for another project. I will add more over time. If you need an algorithm that I haven't implemented yet, please let me know so that I can give it a higher priority.
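
To give a feel for what two of the distance measures listed above compute, here is a minimal, self-contained sketch in plain Elixir. It is not TheFuzz's code or API: the module name `FuzzSketch` and its function names are hypothetical, chosen only for illustration, and the library's own implementations may differ in structure, return values, and edge-case handling.

```elixir
defmodule FuzzSketch do
  @moduledoc """
  Illustrative only. A hypothetical module, not part of TheFuzz.
  """

  # Hamming distance: the number of positions at which two strings of
  # equal length differ. Returns nil when the lengths differ, because the
  # distance is undefined in that case (an assumption of this sketch).
  def hamming(a, b) do
    as = String.graphemes(a)
    bs = String.graphemes(b)

    if length(as) == length(bs) do
      as |> Enum.zip(bs) |> Enum.count(fn {x, y} -> x != y end)
    else
      nil
    end
  end

  # Levenshtein distance: the minimum number of single-character insertions,
  # deletions, and substitutions needed to turn `a` into `b`, computed one
  # row of the dynamic-programming table at a time.
  def levenshtein(a, b) do
    as = String.graphemes(a)
    bs = String.graphemes(b)

    # Row 0: cost of building each prefix of `b` from the empty string.
    first_row = Enum.to_list(0..length(bs))

    as
    |> Enum.with_index(1)
    |> Enum.reduce(first_row, fn {ca, i}, prev_row -> next_row(ca, bs, prev_row, i) end)
    |> List.last()
  end

  # Builds row `i` of the table from the previous row.
  # State while scanning: {left neighbour, diagonal neighbour, reversed row}.
  defp next_row(ca, bs, [diag0 | prev_rest], i) do
    {_left, _diag, row_rev} =
      bs
      |> Enum.zip(prev_rest)
      |> Enum.reduce({i, diag0, [i]}, fn {cb, up}, {left, diag, acc} ->
        cost = if ca == cb, do: 0, else: 1
        val = Enum.min([left + 1, up + 1, diag + cost])
        {val, up, [val | acc]}
      end)

    Enum.reverse(row_rev)
  end
end
```

With this sketch loaded, `FuzzSketch.levenshtein("kitten", "sitting")` evaluates to `3` and `FuzzSketch.hamming("toned", "roses")` evaluates to `3`. For the library's actual module and function names, refer to the tests and the documentation linked above.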