{"id":15629683,"url":"https://github.com/turnerj/quickenshtein","last_synced_at":"2025-04-07T11:08:01.910Z","repository":{"id":39711239,"uuid":"238347826","full_name":"Turnerj/Quickenshtein","owner":"Turnerj","description":"Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support","archived":false,"fork":false,"pushed_at":"2023-11-02T08:09:35.000Z","size":332,"stargazers_count":294,"open_issues_count":11,"forks_count":14,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-31T09:08:20.422Z","etag":null,"topics":["edit-distance","hardware-intrinsics","levenshtein","levenshtein-distance","simd","string-distance","threading"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Turnerj.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"License.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null},"funding":{"github":"Turnerj"}},"created_at":"2020-02-05T02:01:46.000Z","updated_at":"2025-03-20T22:24:39.000Z","dependencies_parsed_at":"2024-01-02T22:38:38.640Z","dependency_job_id":"c9612d10-fb07-4fc3-bdad-1e6c856d0abf","html_url":"https://github.com/Turnerj/Quickenshtein","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Turnerj%2FQuickenshtein","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Turnerj%2FQuickenshtein/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Turnerj%2FQuickenshtein/releases","manifests_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/Turnerj%2FQuickenshtein/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Turnerj","download_url":"https://codeload.github.com/Turnerj/Quickenshtein/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247640463,"owners_count":20971557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["edit-distance","hardware-intrinsics","levenshtein","levenshtein-distance","simd","string-distance","threading"],"created_at":"2024-10-03T10:28:08.488Z","updated_at":"2025-04-07T11:08:01.885Z","avatar_url":"https://github.com/Turnerj.png","language":"C#","readme":"\u003cdiv align=\"center\"\u003e\n\n![Icon](images/icon.png)\n\n# Quickenshtein\n\nA quick and memory efficient Levenshtein Distance calculator for .NET\n\n[![AppVeyor](https://img.shields.io/appveyor/ci/Turnerj/Quickenshtein/master.svg)](https://ci.appveyor.com/project/Turnerj/Quickenshtein)\n[![Codecov](https://img.shields.io/codecov/c/github/Turnerj/Quickenshtein/master.svg)](https://codecov.io/gh/Turnerj/Quickenshtein)\n[![NuGet](https://img.shields.io/nuget/v/Quickenshtein.svg)](https://www.nuget.org/packages/Quickenshtein/)\n\u003c/div\u003e\n\n## Performance\n\nQuickenshtein gets its speed and memory efficiency from a number of different optimizations.\nTo get the most performance out of the library, you will need .NET Core 3 or higher as this has support for hardware intrinsics.\n\nQuickenshtein takes advantage of the following hardware intrinsics. 
On any recent x86 system, you will likely have these available.\n- [SSE2](https://en.wikipedia.org/wiki/SSE2#CPU_support)\n- [SSE4.1](https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs)\n- [AVX2](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2)\n\nIf your computer doesn't have one of the hardware intrinsics available, Quickenshtein will still work - just slower than optimal.\n\n## Multi-Threading\n\nBy default, Quickenshtein performs in single-threaded mode as this mode performs best for small to medium size strings while having no memory allocations.\nWhen dealing with huge strings of 8000 characters or more, it may be useful to switch to multi-threaded mode.\nIn this mode, the calculation is broken up and shared between multiple cores in a system.\n\nMulti-threading is especially useful for systems without hardware intrinsics or for .NET Framework as shown in the table below where it provided a 3x performance improvement.\n\n|                 Method |       Runtime | NumberOfCharacters |       Mean |     Error |     StdDev |      Gen 0 |      Gen 1 |     Gen 2 |   Allocated |\n|----------------------- |-------------- |------------------- |-----------:|----------:|-----------:|-----------:|-----------:|----------:|------------:|\n|          Quickenshtein |    .NET 4.7.2 |               8000 | 110.686 ms | 10.118 ms |   0.554 ms |          - |          - |         - |           - |\n| Quickenshtein_Threaded |    .NET 4.7.2 |               8000 |  36.601 ms | 16.121 ms |   0.883 ms |          - |          - |         - |      1260 B |\n\nTo enable threading, you can pass in `CalculationOptions.DefaultWithThreading` to `Levenshtein.GetDistance()` or configure your own `CalculationOptions` with settings that work best for you.\n\n_Note: Multi-threading is not allocation free (unlike single-threading mode) and will allocate a small amount depending on the number of threads used._\n\n## Benchmarking\n\nThere are a number of benchmarks in the 
repository that you can run on your system to see how well Quickenshtein performs.\n\nMost of these benchmarks...\n- Run on .NET Framework and .NET Core so you can see how the performance changes between them\n- Compare against a simple baseline Levenshtein Distance implementation with no specific optimizations\n- Compare against [Fastenshtein](https://github.com/DanHarltey/Fastenshtein/), one of the other fast .NET Levenshtein Distance implementations\n\nYou can view the results of these benchmarks at the links below:\n- [Benchmarking against Fastenshtein](/docs/OverallComparison.md)\n- [Isolated Benchmarks](/docs/StringScenarios.md)\n- [Intrinsics Performance](/docs/IntrinsicsPerformance.md)\n\n## Example Usage\n\n```csharp\nusing Quickenshtein;\n\n// Common usage (uses default CalculationOptions with threading disabled)\nvar distance1 = Levenshtein.GetDistance(\"Saturday\", \"Sunday\");\n\n// Enable threading\nvar distance2 = Levenshtein.GetDistance(\"Saturday\", \"Sunday\", CalculationOptions.DefaultWithThreading);\n\n// Custom calculation options (helps with tuning for your specific workload and environment - you should benchmark your configurations on your system)\nvar distance3 = Levenshtein.GetDistance(\"Saturday\", \"Sunday\", new CalculationOptions {\n    EnableThreadingAfterXCharacters = 10000,\n    MinimumCharactersPerThread = 25000\n});\n```\n\n## Learning Resources\n\nI've written quite a bit about Levenshtein Distance and various ways you can extract performance from it:\n\n- [Levenshtein Distance Part 1: What is it?](https://turnerj.com/blog/levenshtein-distance-part-1-what-is-it)\n- [Levenshtein Distance Part 2: Gotta Go Fast](https://turnerj.com/blog/levenshtein-distance-part-2-gotta-go-fast)\n- [Levenshtein Distance Part 3: Optimize Everything!](https://turnerj.com/blog/levenshtein-distance-part-3-optimize-everything)\n- [Levenshtein Distance with SIMD](https://turnerj.com/blog/levenshtein-distance-with-simd)\n\nIf you prefer video:\n\n- [Maximising 
Algorithm Performance in .NET: Levenshtein Distance](https://www.youtube.com/watch?v=JiOYajl2Mds) at .NET Conf 2020","funding_links":["https://github.com/sponsors/Turnerj"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fturnerj%2Fquickenshtein","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fturnerj%2Fquickenshtein","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fturnerj%2Fquickenshtein/lists"}