https://github.com/hamedfathi/simmetricscore
A text similarity metric library covering edit-distance metrics (Levenshtein, Gotoh, Jaro, etc.) as well as other metrics (e.g. Soundex, Chapman). The library targets .NET Standard and ships with many useful extension methods.
csharp dotnet fuzzy fuzzy-search jaro jaro-distance jaro-winkler jaro-winkler-distance levenshtein levenshtein-distance metric metrics similarity similarity-score
- Host: GitHub
- URL: https://github.com/hamedfathi/simmetricscore
- Owner: HamedFathi
- License: mit
- Created: 2021-11-09T05:07:02.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-11-21T12:29:07.000Z (almost 4 years ago)
- Last Synced: 2025-02-01T14:51:12.509Z (8 months ago)
- Topics: csharp, dotnet, fuzzy, fuzzy-search, jaro, jaro-distance, jaro-winkler, jaro-winkler-distance, levenshtein, levenshtein-distance, metric, metrics, similarity, similarity-score
- Language: C#
- Homepage:
- Size: 37.1 KB
- Stars: 9
- Watchers: 3
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README

### [NuGet](https://www.nuget.org/packages/SimMetricsCore)
[MIT License](https://opensource.org/licenses/MIT)

```
Install-Package SimMetricsCore
dotnet add package SimMetricsCore
```
`SimMetricsCore` supports the following algorithms:
```
BlockDistance
ChapmanLengthDeviation
ChapmanMeanLength
CosineSimilarity
DiceSimilarity
EuclideanDistance
JaccardSimilarity
Jaro
JaroWinkler
Levenstein // Default
MatchingCoefficient
MongeElkan
NeedlemanWunch
OverlapCoefficient
QGramsDistance
SmithWaterman
SmithWatermanGotoh
SmithWatermanGotohWindowedAffine
```

### Extension Methods
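
The methods in this section report similarity as a score in `[0, 1]`, or as a percentage when `convertToPercentage` is `true`. To illustrate what such a score can mean for an edit-distance metric, here is a minimal, self-contained sketch of a normalized Levenshtein similarity. It is not the library's implementation: `SimilaritySketch`, `LevenshteinDistance`, and `NormalizedSimilarity` are hypothetical names, and both the normalization by the longer string's length and the multiplication by 100 are assumptions.

```cs
using System;

// Hypothetical helper for illustration only; not part of SimMetricsCore.
public static class SimilaritySketch
{
    // Classic dynamic-programming Levenshtein edit distance, keeping two rows.
    public static int LevenshteinDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];

        // Turning an empty prefix of 'a' into a prefix of 'b' takes j insertions.
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i; // i deletions to reach the empty prefix of 'b'
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1), // insert, delete
                    previous[j - 1] + cost);                       // substitute or match
            }
            (previous, current) = (current, previous); // reuse the two rows
        }
        return previous[b.Length];
    }

    // Assumed normalization: 1 - distance / length of the longer string.
    public static double NormalizedSimilarity(string a, string b, bool convertToPercentage = false)
    {
        int longest = Math.Max(a.Length, b.Length);
        double score = longest == 0
            ? 1.0 // two empty strings are treated as identical
            : 1.0 - (double)LevenshteinDistance(a, b) / longest;
        return convertToPercentage ? score * 100.0 : score;
    }
}
```

Under these assumptions, `"kitten"` and `"sitting"` are 3 edits apart over a longest length of 7, so the sketch scores them at roughly 0.57 (about 57%). The library's own scores may be computed differently.

The extension method signatures are: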
```cs
// GetSimilarity
// [0-1] => [0%-100%] similarity
double GetSimilarity(this string firstWord, string secondWord, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// GetSimilarities
// Get similarity score for each input.
IEnumerable<SimMetricResult> GetSimilarities(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)
IEnumerable<SimMetricResult> GetSimilarities(this string first, string[] second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// GetMinSimilarity
// Returns the first item that has the least similarity.
string GetMinSimilarity(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)
SimMetricResult GetMinSimilarityInfo(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// GetMinSimilarities
// Returns the items that have the least similarity.
// A list can contain unique items with the same similarity score.
IEnumerable<string> GetMinSimilarities(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)
IEnumerable<SimMetricResult> GetMinSimilaritiesInfo(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// GetMaxSimilarity
// Returns the first item that has the most similarity.
string GetMaxSimilarity(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)
SimMetricResult GetMaxSimilarityInfo(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// GetMaxSimilarities
// Returns the items that have the most similarity.
// A list can contain unique items with the same similarity score.
IEnumerable<string> GetMaxSimilarities(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)
IEnumerable<SimMetricResult> GetMaxSimilaritiesInfo(this string first, IEnumerable<string> second, SimMetricType simMetricType = SimMetricType.Levenstein, bool convertToPercentage = false)

// Contains
// The closer 'threshold' is to 1.0, the stricter the comparison.
bool ContainsFuzzy(this string source, string search, double threshold = 0.7, SimMetricType simMetricType = SimMetricType.Levenstein)
// Returns the 'source' items that pass the threshold.
IEnumerable<string> ContainsFuzzy(this IEnumerable<string> source, string search, double threshold = 0.7, SimMetricType simMetricType = SimMetricType.Levenstein)
IEnumerable<string> ContainsFuzzy(this string[] source, string search, double threshold = 0.7, SimMetricType simMetricType = SimMetricType.Levenstein)
```

The `SimMetricResult` class contains the following data:
```cs
public class SimMetricResult
{
    public string Item { get; set; }
    // [0-1] => [0%-100%] similarity
    public double Score { get; set; }
}
```
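
As a rough usage sketch, the extension methods listed above might be called as follows. The `SimMetricsCore` namespace is an assumption (adjust it to whatever the package actually exposes), the sample strings are made up, and no output is shown because the exact scores depend on the library's implementation.

```cs
using System;
using System.Collections.Generic;
using SimMetricsCore; // assumed namespace for the extension methods and types

public static class Program
{
    public static void Main()
    {
        // Made-up sample data for illustration.
        var candidates = new List<string> { "color", "colour", "collar", "cooler" };

        // Pairwise score in [0, 1] using the default metric (Levenstein).
        double score = "colour".GetSimilarity("color");

        // Same comparison with another metric, reported as a percentage.
        double percent = "colour".GetSimilarity(
            "color", SimMetricType.JaroWinkler, convertToPercentage: true);

        // Most similar candidate, without and with its score.
        string best = "colr".GetMaxSimilarity(candidates);
        SimMetricResult bestInfo = "colr".GetMaxSimilarityInfo(candidates);

        // Candidates that pass a threshold stricter than the default 0.7.
        IEnumerable<string> matches = candidates.ContainsFuzzy("colr", threshold: 0.8);

        Console.WriteLine($"score: {score}, percent: {percent}");
        Console.WriteLine($"best: {best} ({bestInfo.Item}, {bestInfo.Score})");
        Console.WriteLine($"matches: {string.Join(", ", matches)}");
    }
}
```

The `...Info` variants are useful when the score itself is needed alongside the matched item; the plain variants return only the strings.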