https://github.com/soenneker/soenneker.utils.strings.dicecoefficient
A utility library for comparing strings via the Dice Coefficient algorithm
coefficient comparison csharp dice dicecoefficient dicecoefficientstringutil distance dotnet sorensen string strings util utils
- Host: GitHub
- URL: https://github.com/soenneker/soenneker.utils.strings.dicecoefficient
- Owner: soenneker
- License: mit
- Created: 2024-12-04T18:49:45.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-03-10T23:23:46.000Z (3 months ago)
- Last Synced: 2026-03-11T01:54:29.587Z (3 months ago)
- Topics: coefficient, comparison, csharp, dice, dicecoefficient, dicecoefficientstringutil, distance, dotnet, sorensen, string, strings, util, utils
- Language: C#
- Homepage: https://soenneker.com
- Size: 876 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Security: .github/SECURITY.md
README
# Soenneker.Utils.Strings.DiceCoefficient
### A utility library for comparing strings via the Dice Coefficient algorithm
## Installation
```
dotnet add package Soenneker.Utils.Strings.DiceCoefficient
```
## Why?
The Dice Coefficient (also known as the Sørensen-Dice coefficient) measures how similar two strings or other sequences are, from no overlap at all to a complete match. It's particularly effective for comparing text fragments, identifying duplicates, and matching approximate content. Here's why it stands out:
### Pairwise Comparison:
It scores similarity from overlapping character pairs (bigrams). Each bigram preserves the order of its two characters, but where the shared bigrams occur within each string does not matter.
### Balanced by Size:
It divides twice the number of shared bigrams by the total number of bigrams in both strings, so a short string that matches only part of a much longer one does not receive an inflated score.
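For reference, the standard definition can be written as follows. The symbols $X$ and $Y$ (the bigram sets of the two strings) and $s$ (the score) are introduced here for illustration and do not come from the library's documentation:

$$
s = \frac{2\,|X \cap Y|}{|X| + |Y|}, \qquad 0 \le s \le 1
$$

As a small worked case, "night" and "nacht" each contain four bigrams and share only "ht", so $s = 2 \cdot 1 / (4 + 4) = 0.25$.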
### Suitable for Real-World Data:
A typo or small edit disturbs only the few bigrams around it, so the score degrades gradually rather than collapsing. This makes it effective for noisy or partially matching data.
### Fast and Scalable:
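It's computationally efficient: with hash-based bigram lookups, a comparison typically takes time roughly proportional to the combined length of the two strings, making it applicable for large datasets or repeated comparisons.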
## Usage
```csharp
var text1 = "This is a test";
var text2 = "This is another test";
// Returns the similarity as a percentage
double result = DiceCoefficientStringUtil.CalculatePercentage(text1, text2); // 74.07
```
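To make the calculation concrete, here is a minimal sketch of a Dice Coefficient computed as 2 × shared bigrams / total bigrams. It is not the library's implementation: `DiceSketch` and its methods are hypothetical names, and the choices to lowercase the input, keep whitespace, and count each distinct bigram once are assumptions. Under those assumptions it happens to give about 74.07 for the two strings above, but the library may handle these details differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the library.
public static class DiceSketch
{
    // Collects the distinct, lowercased character bigrams of a string.
    // Assumption: whitespace is kept and repeated bigrams count once.
    private static HashSet<string> Bigrams(string text)
    {
        var set = new HashSet<string>(StringComparer.Ordinal);
        string lower = text.ToLowerInvariant();

        for (int i = 0; i < lower.Length - 1; i++)
            set.Add(lower.Substring(i, 2));

        return set;
    }

    // Returns 2 * |X ∩ Y| / (|X| + |Y|), scaled to the range 0..100.
    public static double Percentage(string a, string b)
    {
        HashSet<string> x = Bigrams(a);
        HashSet<string> y = Bigrams(b);

        // Assumption: if neither string has any bigram, report 0.
        if (x.Count + y.Count == 0)
            return 0;

        int shared = 0;
        foreach (string bigram in x)
        {
            if (y.Contains(bigram))
                shared++;
        }

        return 200.0 * shared / (x.Count + y.Count);
    }
}
```

With this sketch, `DiceSketch.Percentage("This is a test", "This is another test")` finds 10 shared bigrams out of 11 + 16 distinct ones, i.e. 200 × 10 / 27.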