{"id":16259750,"url":"https://github.com/rapidfuzz/rapidfuzz-cpp","last_synced_at":"2025-12-12T02:13:14.632Z","repository":{"id":38185776,"uuid":"255237601","full_name":"rapidfuzz/rapidfuzz-cpp","owner":"rapidfuzz","description":"Rapid fuzzy string matching in C++ using the Levenshtein Distance","archived":false,"fork":false,"pushed_at":"2024-10-24T14:08:54.000Z","size":2969,"stargazers_count":240,"open_issues_count":3,"forks_count":38,"subscribers_count":10,"default_branch":"main","last_synced_at":"2024-10-25T08:16:07.042Z","etag":null,"topics":["cpp","hacktoberfest","levenshtein","string-comparison","string-matching","string-similarity"],"latest_commit_sha":null,"homepage":"https://rapidfuzz.github.io/rapidfuzz-cpp","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rapidfuzz.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"maxbachmann","custom":["https://www.paypal.com/donate/?hosted_button_id=VGWQBBD5CTWJU"]}},"created_at":"2020-04-13T05:16:31.000Z","updated_at":"2024-10-24T14:05:15.000Z","dependencies_parsed_at":"2023-10-30T20:39:06.624Z","dependency_job_id":"3dc80032-803b-4166-96f3-a71ea2404ee2","html_url":"https://github.com/rapidfuzz/rapidfuzz-cpp","commit_stats":{"total_commits":380,"total_committers":20,"mean_commits":19.0,"dds":0.3868421052631579,"last_synced_commit":"cbdf84388cea0f12d8a02d9bac28d806b178302a"},"previous_names":["rapidfuzz/rapidfuzz-cpp","maxbachmann/rapidfuzz-cpp"],"tags_count":41,"template":false,"template_full_name":null,"repository_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2Frapidfuzz-cpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2Frapidfuzz-cpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2Frapidfuzz-cpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2Frapidfuzz-cpp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rapidfuzz","download_url":"https://codeload.github.com/rapidfuzz/rapidfuzz-cpp/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254471061,"owners_count":22076585,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","hacktoberfest","levenshtein","string-comparison","string-matching","string-similarity"],"created_at":"2024-10-10T16:04:46.976Z","updated_at":"2025-12-12T02:13:09.595Z","avatar_url":"https://github.com/rapidfuzz.png","language":"C++","readme":"  \u003ch1 align=\"center\"\u003e\n\u003cimg src=\"https://raw.githubusercontent.com/rapidfuzz/rapidfuzz/master/docs/img/RapidFuzz.svg?sanitize=true\" alt=\"RapidFuzz\" width=\"400\"\u003e\n\u003c/h1\u003e\n\u003ch4 align=\"center\"\u003eRapid fuzzy string matching in C++ using the Levenshtein Distance\u003c/h4\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/rapidfuzz/rapidfuzz-cpp/actions\"\u003e\n    \u003cimg src=\"https://github.com/rapidfuzz/rapidfuzz-cpp/workflows/CMake/badge.svg\"\n         alt=\"Continuous Integration\"\u003e\n  \u003c/a\u003e\n  \u003ca 
href=\"https://rapidfuzz.github.io/rapidfuzz-cpp\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/-documentation-blue\"\n         alt=\"Documentation\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://github.com/rapidfuzz/rapidfuzz-cpp/blob/dev/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/rapidfuzz/rapidfuzz-cpp\"\n         alt=\"GitHub license\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#description\"\u003eDescription\u003c/a\u003e •\n  \u003ca href=\"#installation\"\u003eInstallation\u003c/a\u003e •\n  \u003ca href=\"#usage\"\u003eUsage\u003c/a\u003e •\n  \u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\n\u003c/p\u003e\n\n---\n## Description\nRapidFuzz is a fast string matching library for Python and C++ that uses the string similarity calculations from [FuzzyWuzzy](https://github.com/seatgeek/fuzzywuzzy). However, two aspects set RapidFuzz apart from FuzzyWuzzy:\n1) It is MIT licensed, so it can be used with whichever license you choose for your project, while you're forced to adopt the GPL license when using FuzzyWuzzy\n2) It is mostly written in C++ and, on top of this, comes with many algorithmic improvements that make string matching even faster while still providing the same results. 
More details on these performance improvements in the form of benchmarks can be found [here](https://github.com/rapidfuzz/rapidfuzz/blob/master/Benchmarks.md)\n\nThe library is split across multiple repositories for the different supported programming languages:\n- The C++ version is maintained in this repository\n- The Python version can be found at [rapidfuzz/rapidfuzz](https://github.com/rapidfuzz/rapidfuzz)\n\n\n## CMake Integration\n\nThere are several ways to integrate `rapidfuzz` in your CMake project.\n\n### By Installing it\n```bash\ngit clone https://github.com/rapidfuzz/rapidfuzz-cpp.git rapidfuzz-cpp\ncd rapidfuzz-cpp\nmkdir build \u0026\u0026 cd build\ncmake .. -DCMAKE_BUILD_TYPE=Release\ncmake --build .\ncmake --build . --target install\n```\n\nThen in your CMakeLists.txt:\n```cmake\nfind_package(rapidfuzz REQUIRED)\nadd_executable(foo main.cpp)\ntarget_link_libraries(foo rapidfuzz::rapidfuzz)\n```\n\n### Add this repository as a submodule\n```bash\ngit submodule add https://github.com/rapidfuzz/rapidfuzz-cpp.git 3rdparty/RapidFuzz\n```\nThen you can either:\n\n1. include it as a subdirectory\n    ```cmake\n    add_subdirectory(3rdparty/RapidFuzz)\n    add_executable(foo main.cpp)\n    target_link_libraries(foo rapidfuzz::rapidfuzz)\n    ```\n2. 
build it at configure time with `FetchContent`\n    ```cmake\n    include(FetchContent)\n    FetchContent_Declare(\n      rapidfuzz\n      SOURCE_DIR ${CMAKE_SOURCE_DIR}/3rdparty/RapidFuzz\n      PREFIX ${CMAKE_CURRENT_BINARY_DIR}/rapidfuzz\n      CMAKE_ARGS -DCMAKE_INSTALL_PREFIX:PATH=\u003cINSTALL_DIR\u003e \"${CMAKE_OPT_ARGS}\"\n    )\n    FetchContent_MakeAvailable(rapidfuzz)\n    add_executable(foo main.cpp)\n    target_link_libraries(foo PRIVATE rapidfuzz::rapidfuzz)\n    ```\n### Download it at configure time\n\nIf you don't want to add `rapidfuzz-cpp` as a submodule, you can also download it with `FetchContent`:\n```cmake\ninclude(FetchContent)\nFetchContent_Declare(rapidfuzz\n  GIT_REPOSITORY https://github.com/rapidfuzz/rapidfuzz-cpp.git\n  GIT_TAG main)\nFetchContent_MakeAvailable(rapidfuzz)\nadd_executable(foo main.cpp)\ntarget_link_libraries(foo PRIVATE rapidfuzz::rapidfuzz)\n```\nIt will be downloaded each time you run CMake in an empty build folder.\n\n## CMake options\n\nThe following CMake options are available:\n\n1. `RAPIDFUZZ_BUILD_TESTING` : to build the tests (default OFF, requires [Catch2](https://github.com/catchorg/Catch2))\n2. `RAPIDFUZZ_BUILD_BENCHMARKS` : to build the benchmarks (default OFF, requires [Google Benchmark](https://github.com/google/benchmark))\n3. 
`RAPIDFUZZ_INSTALL` : to install the library on the local machine\n    - When configured independently, installation is on.\n    - When used as a subproject, installation is turned off by default.\n    - Library developers might want to toggle the behavior depending on their project.\n    - If your project is exported via `CMake`, turn installation on, or an export error will result.\n    - If your project publicly depends on `RapidFuzz` (includes `rapidfuzz.hpp` in a header),\n      turn installation on, or apps depending on your project will face include errors.\n\n## Usage\n```cpp\n#include \u003crapidfuzz/fuzz.hpp\u003e\n```\n\n### Simple Ratio\n```cpp\nusing rapidfuzz::fuzz::ratio;\n\n// score is 96.55171966552734\ndouble score = rapidfuzz::fuzz::ratio(\"this is a test\", \"this is a test!\");\n```\n\n### Partial Ratio\n```cpp\n// score is 100\ndouble score = rapidfuzz::fuzz::partial_ratio(\"this is a test\", \"this is a test!\");\n```\n\n### Token Sort Ratio\n```cpp\n// score is 90.90908813476562\ndouble score = rapidfuzz::fuzz::ratio(\"fuzzy wuzzy was a bear\", \"wuzzy fuzzy was a bear\");\n\n// score is 100\ndouble score = rapidfuzz::fuzz::token_sort_ratio(\"fuzzy wuzzy was a bear\", \"wuzzy fuzzy was a bear\");\n```\n\n### Token Set Ratio\n```cpp\n// score is 83.8709716796875\ndouble score = rapidfuzz::fuzz::token_sort_ratio(\"fuzzy was a bear\", \"fuzzy fuzzy was a bear\");\n\n// score is 100\ndouble score = rapidfuzz::fuzz::token_set_ratio(\"fuzzy was a bear\", \"fuzzy fuzzy was a bear\");\n```\n\n### Process\nThe Python implementation has a module `process`, which is used to compare e.g. a string to a list of strings.\nIn Python, this both saves the time to implement those features yourself and can be a lot more efficient than repeated type\nconversions between Python and C++. Implementing a similar function in C++ using templates is not easily possible and would probably be slower than implementing it on your own. 
That's why this section describes how users can implement those features with a couple of lines of code using the C++ library.\n\n### extract\n\nThe following function compares a query string to all strings in a list of choices. It returns all\nelements with a similarity of at least score_cutoff. In general, use the cached implementations when comparing\na string to multiple strings.\n\n\n```cpp\ntemplate \u003ctypename Sentence1,\n          typename Iterable, typename Sentence2 = typename Iterable::value_type\u003e\nstd::vector\u003cstd::pair\u003cSentence2, double\u003e\u003e\nextract(const Sentence1\u0026 query, const Iterable\u0026 choices, const double score_cutoff = 0.0)\n{\n  std::vector\u003cstd::pair\u003cSentence2, double\u003e\u003e results;\n\n  rapidfuzz::fuzz::CachedRatio\u003ctypename Sentence1::value_type\u003e scorer(query);\n\n  for (const auto\u0026 choice : choices) {\n    double score = scorer.similarity(choice, score_cutoff);\n\n    if (score \u003e= score_cutoff) {\n      results.emplace_back(choice, score);\n    }\n  }\n\n  return results;\n}\n```\n\n### extractOne\n\nThe following function compares a query string to all strings in a list of choices and returns the best match, or an empty optional if no choice reaches score_cutoff.\n\n```cpp\ntemplate \u003ctypename Sentence1,\n          typename Iterable, typename Sentence2 = typename Iterable::value_type\u003e\nstd::optional\u003cstd::pair\u003cSentence2, double\u003e\u003e\nextractOne(const Sentence1\u0026 query, const Iterable\u0026 choices, const double score_cutoff = 0.0)\n{\n  bool match_found = false;\n  double best_score = score_cutoff;\n  Sentence2 best_match;\n\n  rapidfuzz::fuzz::CachedRatio\u003ctypename Sentence1::value_type\u003e scorer(query);\n\n  for (const auto\u0026 choice : choices) {\n    double score = scorer.similarity(choice, best_score);\n\n    if (score \u003e= best_score) {\n      match_found = true;\n      best_score = score;\n      best_match = choice;\n    }\n  }\n\n  if (!match_found) {\n    return std::nullopt;\n  }\n\n  return 
std::make_pair(best_match, best_score);\n}\n```\n\n### multithreading\n\nThese scorers are easy to parallelize, e.g. with OpenMP, to achieve better performance.\n\n```cpp\ntemplate \u003ctypename Sentence1,\n          typename Iterable, typename Sentence2 = typename Iterable::value_type\u003e\nstd::vector\u003cstd::pair\u003cSentence2, double\u003e\u003e\nextract(const Sentence1\u0026 query, const Iterable\u0026 choices, const double score_cutoff = 0.0)\n{\n  std::vector\u003cstd::pair\u003cSentence2, double\u003e\u003e results(choices.size());\n\n  rapidfuzz::fuzz::CachedRatio\u003ctypename Sentence1::value_type\u003e scorer(query);\n\n  #pragma omp parallel for\n  for (size_t i = 0; i \u003c choices.size(); ++i) {\n    double score = scorer.similarity(choices[i], score_cutoff);\n    results[i] = std::make_pair(choices[i], score);\n  }\n\n  return results;\n}\n```\n\n## License\nRapidFuzz is licensed under the MIT license since I believe that everyone should be able to use it without being forced to adopt the GPL license. That's why the library is based on an older version of fuzzywuzzy that was MIT-licensed as well.\nThis old version of fuzzywuzzy can be found [here](https://github.com/seatgeek/fuzzywuzzy/tree/4bf28161f7005f3aa9d4d931455ac55126918df7).\n","funding_links":["https://github.com/sponsors/maxbachmann","https://www.paypal.com/donate/?hosted_button_id=VGWQBBD5CTWJU"],"categories":["Miscellaneous"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frapidfuzz%2Frapidfuzz-cpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frapidfuzz%2Frapidfuzz-cpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frapidfuzz%2Frapidfuzz-cpp/lists"}