https://github.com/flori/amatch
Approximate String Matching library
- Host: GitHub
- URL: https://github.com/flori/amatch
- Owner: flori
- License: apache-2.0
- Created: 2009-08-26T00:18:33.000Z (over 15 years ago)
- Default Branch: master
- Last Pushed: 2024-10-03T14:15:10.000Z (3 months ago)
- Last Synced: 2024-11-08T16:53:45.422Z (about 2 months ago)
- Language: C
- Homepage:
- Size: 148 KB
- Stars: 376
- Watchers: 9
- Forks: 35
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGES
- License: COPYING
# amatch - Approximate Matching Extension for Ruby
## Description
This is a collection of classes that can be used for approximate
matching, searching, and comparing of Strings. They implement algorithms
that compute the Levenshtein and Damerau-Levenshtein edit distances, the
Sellers edit distance, the Hamming distance, the longest common subsequence
length, the longest common substring length, the pair distance metric, and
the Jaro and Jaro-Winkler metrics.

## Installation
To install this extension as a gem, type

    # gem install amatch

into the shell.
## Download
The homepage of this library is located at
* https://github.com/flori/amatch
## Examples
    require 'amatch'
    # => true
    include Amatch
    # => Object
    m = Sellers.new("pattern")
    # => #<Amatch::Sellers:0x...>
    m.match("pattren")
    # => 2.0
    m.substitution = m.insertion = 3
    # => 3
    m.match("pattren")
    # => 4.0
    m.reset_weights
    # => #<Amatch::Sellers:0x...>
    m.match(["pattren","parent"])
    # => [2.0, 4.0]
    m.search("abcpattrendef")
    # => 2.0
    m = Levenshtein.new("pattern")
    # => #<Amatch::Levenshtein:0x...>
    m.match("pattren")
    # => 2
    m.search("abcpattrendef")
    # => 2
    "pattern language".levenshtein_similar("language of patterns")
    # => 0.2
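
To make the `Levenshtein` result above easier to follow, here is a minimal pure-Ruby sketch of the textbook edit distance. It is not amatch's implementation (the extension is written in C), and the method name `levenshtein_sketch` is made up for this illustration.

```ruby
# Textbook Levenshtein distance: the minimum number of single-character
# insertions, deletions, and substitutions that turn `a` into `b`.
# Hypothetical helper for illustration only, not part of amatch.
def levenshtein_sketch(a, b)
  # prev[j] is the distance between the prefix of `a` processed so far
  # and the first j characters of `b`.
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i] # turning i characters of `a` into "" takes i deletions
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1,            # delete ca
               curr[j - 1] + 1,        # insert cb
               prev[j - 1] + cost].min # substitute or keep
    end
    prev = curr
  end
  prev.last
end

puts levenshtein_sketch("pattern", "pattren") # prints 2
```

The swapped "er"/"re" costs two substitutions here, which agrees with the `2` returned by `m.match("pattren")`.
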
    m = Amatch::DamerauLevenshtein.new("pattern")
    # => #<Amatch::DamerauLevenshtein:0x...>
    m.match("pattren")
    # => 1
    "pattern language".damerau_levenshtein_similar("language of patterns")
    # => 0.19999999999999996
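
The drop from 2 to 1 comes from treating a swap of two adjacent characters as a single edit. The sketch below shows the restricted form of that idea (often called optimal string alignment). Which Damerau-Levenshtein variant amatch uses is not stated in this README, so treat this as an assumption; `osa_distance_sketch` is a hypothetical name.

```ruby
# Restricted Damerau-Levenshtein (optimal string alignment) distance.
# Hypothetical helper for illustration only, not part of amatch.
def osa_distance_sketch(a, b)
  # d[i][j] is the distance between a[0, i] and b[0, j].
  d = Array.new(a.length + 1) { |i| [i] + [0] * b.length }
  (0..b.length).each { |j| d[0][j] = j }

  (1..a.length).each do |i|
    (1..b.length).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      best = [d[i - 1][j] + 1,          # deletion
              d[i][j - 1] + 1,          # insertion
              d[i - 1][j - 1] + cost].min # substitution or match
      # Two adjacent characters swapped ("er" vs "re") count as one edit.
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        best = [best, d[i - 2][j - 2] + 1].min
      end
      d[i][j] = best
    end
  end
  d[a.length][b.length]
end

puts osa_distance_sketch("pattern", "pattren") # prints 1
```
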
    m = Hamming.new("pattern")
    # => #<Amatch::Hamming:0x...>
    m.match("pattren")
    # => 2
    "pattern language".hamming_similar("language of patterns")
    # => 0.1
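
For equal-length strings, the Hamming distance is the number of positions at which the characters differ. A minimal sketch under that classical definition follows. The `hamming_similar` example above compares strings of different lengths, so amatch evidently handles that case too; how it does so is not described here, and this sketch rejects such input. The name `hamming_sketch` is hypothetical.

```ruby
# Classical Hamming distance for strings of equal length.
# Hypothetical helper for illustration only, not part of amatch.
def hamming_sketch(a, b)
  unless a.length == b.length
    raise ArgumentError, "strings must have equal length"
  end
  # Pair up characters position by position and count the mismatches.
  a.chars.zip(b.chars).count { |ca, cb| ca != cb }
end

puts hamming_sketch("pattern", "pattren") # prints 2
```
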
    m = PairDistance.new("pattern")
    # => #<Amatch::PairDistance:0x...>
    m.match("pattr en")
    # => 0.545454545454545
    m.match("pattr en", nil)
    # => 0.461538461538462
    m.match("pattr en", /t+/)
    # => 0.285714285714286
    "pattern language".pair_distance_similar("language of patterns")
    # => 0.928571428571429
    m = LongestSubsequence.new("pattern")
    # => #<Amatch::LongestSubsequence:0x...>
    m.match("pattren")
    # => 6
    "pattern language".longest_subsequence_similar("language of patterns")
    # => 0.4
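
A subsequence keeps characters in order but may skip some, which is why "pattern" and "pattren" share six characters (for example `p a t t e n`). The following pure-Ruby sketch uses the standard dynamic-programming recurrence. It is an illustration, not amatch's C code, and `lcs_length_sketch` is a hypothetical name.

```ruby
# Length of the longest common subsequence of `a` and `b`.
# Hypothetical helper for illustration only, not part of amatch.
def lcs_length_sketch(a, b)
  # prev[j] is the LCS length of the processed prefix of `a` and b[0, j].
  prev = Array.new(b.length + 1, 0)
  a.each_char do |ca|
    curr = [0]
    b.each_char.with_index(1) do |cb, j|
      curr << if ca == cb
                prev[j - 1] + 1              # extend a common subsequence
              else
                [prev[j], curr[j - 1]].max   # skip a character on one side
              end
    end
    prev = curr
  end
  prev.last
end

puts lcs_length_sketch("pattern", "pattren") # prints 6
```
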
    m = LongestSubstring.new("pattern")
    # => #<Amatch::LongestSubstring:0x...>
    m.match("pattren")
    # => 4
    "pattern language".longest_substring_similar("language of patterns")
    # => 0.4
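
Unlike a subsequence, a substring must be contiguous, so the longest shared run between "pattern" and "pattren" is `patt`. A minimal sketch of the usual dynamic-programming approach follows, with the hypothetical name `longest_common_substring_length_sketch`. It is not taken from amatch.

```ruby
# Length of the longest contiguous run of characters shared by `a` and `b`.
# Hypothetical helper for illustration only, not part of amatch.
def longest_common_substring_length_sketch(a, b)
  best = 0
  # prev[j] is the length of the common run ending at the previous
  # character of `a` and at b[j - 1].
  prev = Array.new(b.length + 1, 0)
  a.each_char do |ca|
    curr = [0]
    b.each_char.with_index(1) do |cb, j|
      run = ca == cb ? prev[j - 1] + 1 : 0 # a mismatch resets the run
      best = run if run > best
      curr << run
    end
    prev = curr
  end
  best
end

puts longest_common_substring_length_sketch("pattern", "pattren") # prints 4
```
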
    m = Jaro.new("pattern")
    # => #<Amatch::Jaro:0x...>
    m.match("paTTren")
    # => 0.952380952380952
    m.ignore_case = false
    m.match("paTTren")
    # => 0.742857142857143
    "pattern language".jaro_similar("language of patterns")
    # => 0.672222222222222
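
The two `Jaro` results above differ only in case handling, and the first one suggests that case is ignored by default. The sketch below implements the standard Jaro similarity in pure Ruby and is case-sensitive; downcasing both inputs first gives the case-insensitive figure. It is an illustration under the textbook definition, not amatch's C code, and `jaro_sketch` is a hypothetical name.

```ruby
# Standard Jaro similarity (case-sensitive), between 0.0 and 1.0.
# Hypothetical helper for illustration only, not part of amatch.
def jaro_sketch(s1, s2)
  return 1.0 if s1.empty? && s2.empty?

  # Characters match only if they are equal and not too far apart.
  window = [[s1.length, s2.length].max / 2 - 1, 0].max
  matched2 = Array.new(s2.length, false)
  matches1 = []

  s1.each_char.with_index do |c, i|
    lo = [i - window, 0].max
    hi = [i + window, s2.length - 1].min
    (lo..hi).each do |j|
      next if matched2[j] || s2[j] != c
      matched2[j] = true
      matches1 << c
      break
    end
  end

  m = matches1.length
  return 0.0 if m.zero?

  # Matched characters of s2 in their own order; half the number of
  # positions that disagree is the transposition count.
  matches2 = s2.chars.select.with_index { |_, j| matched2[j] }
  t = matches1.zip(matches2).count { |x, y| x != y } / 2

  (m.to_f / s1.length + m.to_f / s2.length + (m - t).to_f / m) / 3
end

puts jaro_sketch("pattern", "paTTren".downcase).round(6) # prints 0.952381
puts jaro_sketch("pattern", "paTTren").round(6)          # prints 0.742857
```
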
    m = JaroWinkler.new("pattern")
    # => #<Amatch::JaroWinkler:0x...>
    m.match("paTTren")
    # => 0.971428571712403
    m.ignore_case = false
    m.match("paTTren")
    # => 0.79428571505206
    m.scaling_factor = 0.05
    m.match("pattren")
    # => 0.961904762046678
    "pattern language".jarowinkler_similar("language of patterns")
    # => 0.672222222222222

## Author
Florian Frank
## License
Apache License, Version 2.0 – See the COPYING file in the source archive.