{"id":19735984,"url":"https://github.com/rapidfuzz/jarowinkler","last_synced_at":"2025-12-11T22:47:16.760Z","repository":{"id":37754163,"uuid":"445592288","full_name":"rapidfuzz/JaroWinkler","owner":"rapidfuzz","description":"Python library for fast approximate string matching using Jaro and Jaro-Winkler similarity","archived":false,"fork":false,"pushed_at":"2024-01-08T23:02:00.000Z","size":108,"stargazers_count":70,"open_issues_count":1,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-05T22:31:42.005Z","etag":null,"topics":["cpp","hacktoberfest","jaro","jaro-winkler","python","string-comparison","string-matching","string-similarity"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rapidfuzz.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null},"funding":{"github":"maxbachmann"}},"created_at":"2022-01-07T16:58:07.000Z","updated_at":"2025-03-18T12:21:03.000Z","dependencies_parsed_at":"2023-12-28T21:56:00.236Z","dependency_job_id":"fe2b5871-a49b-4d4a-9414-2bb613ac9887","html_url":"https://github.com/rapidfuzz/JaroWinkler","commit_stats":{"total_commits":45,"total_committers":4,"mean_commits":11.25,"dds":0.1777777777777778,"last_synced_commit":"1ed8fb3aef98953298f7af5c3cf1723ac7ab7faa"},"previous_names":["rapidfuzz/jarowinkler","maxbachmann/jarowinkler"],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2FJaroWinkler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/rapidfuzz%2FJaroWinkler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2FJaroWinkler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rapidfuzz%2FJaroWinkler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rapidfuzz","download_url":"https://codeload.github.com/rapidfuzz/JaroWinkler/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251642940,"owners_count":21620395,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp","hacktoberfest","jaro","jaro-winkler","python","string-comparison","string-matching","string-similarity"],"created_at":"2024-11-12T01:04:36.624Z","updated_at":"2025-12-11T22:47:16.719Z","avatar_url":"https://github.com/rapidfuzz.png","language":"Python","readme":"\n\u003ch1 align=\"center\"\u003e\n JaroWinkler\n\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/rapidfuzz/JaroWinkler/actions\"\u003e\n    \u003cimg src=\"https://github.com/rapidfuzz/JaroWinkler/workflows/Build/badge.svg\"\n         alt=\"Continuous Integration\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/jarowinkler/\"\u003e\n    \u003cimg src=\"https://img.shields.io/pypi/v/jarowinkler\"\n         alt=\"PyPI package version\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://www.python.org\"\u003e\n    \u003cimg src=\"https://img.shields.io/pypi/pyversions/jarowinkler\"\n         alt=\"Python versions\"\u003e\n  \u003c/a\u003e\u003cbr/\u003e\n  \u003ca 
href=\"https://github.com/rapidfuzz/JaroWinkler/blob/main/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/rapidfuzz/JaroWinkler\"\n         alt=\"GitHub license\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch4 align=\"center\"\u003eJaroWinkler is a library to calculate the Jaro and Jaro-Winkler similarity. It is easy to use, is far more performant than all alternatives and is designed to integrate seamlessly with \u003ca href=\"https://github.com/rapidfuzz/RapidFuzz\"\u003eRapidFuzz\u003c/a\u003e.\u003c/h4\u003e\n\n\n\n## ⚡ Quickstart\n```python\n\u003e\u003e\u003e from jarowinkler import *\n\n\u003e\u003e\u003e jaro_similarity(\"Johnathan\", \"Jonathan\")\n0.8796296296296297\n\n\u003e\u003e\u003e jarowinkler_similarity(\"Johnathan\", \"Jonathan\")\n0.9037037037037037\n```\n\n## 🚀 Benchmarks\nThe implementation is based on a novel approach to calculate the Jaro-Winkler similarity using bit-parallelism. This is significantly faster than the original approach used in other libraries. The following benchmark shows the performance difference compared with jellyfish and python-Levenshtein.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"https://raw.githubusercontent.com/rapidfuzz/JaroWinkler/main/bench/results/JaroWinkler.svg?sanitize=true\" alt=\"Benchmark JaroWinkler\"\u003e\n\u003c/p\u003e\n\n## ⚙️ Installation\n\nYou can install this library from [PyPI](https://pypi.org/project/jarowinkler/) with pip:\n```\npip install jarowinkler\n```\nJaroWinkler provides binary wheels for all common platforms.\n\n### Source builds\n\nFor a source build (for example from an SDist package) you only need a C++14-compatible compiler. 
You can install directly from GitHub if you would like.\n```\npip install git+https://github.com/rapidfuzz/JaroWinkler.git@main\n```\n\n## 📖 Usage\n\nAll algorithms in JaroWinkler can be used not only with strings but with arbitrary sequences of hashable objects:\n```python\nfrom jarowinkler import jarowinkler_similarity\n\n\njarowinkler_similarity(\"this is an example\".split(), [\"this\", \"is\", \"a\", \"example\"])\n# 0.8666666666666667\n```\n\nSo as long as two objects have the same hash they are treated as equal. You can provide a `__hash__` method for your own object instances.\n\n```python\nclass MyObject:\n    def __init__(self, hash):\n        self.hash = hash\n\n    def __hash__(self):\n        return self.hash\n\njarowinkler_similarity([MyObject(1), MyObject(2)], [MyObject(1), MyObject(2), MyObject(3)])\n# 0.9111111111111111\n```\n\nAll algorithms provide a `score_cutoff` parameter. This parameter can be used to filter out bad matches: a similarity below the cutoff is returned as `0.0`. Internally this allows JaroWinkler to select faster implementations in some places:\n\n```python\njaro_similarity(\"Johnathan\", \"Jonathan\", score_cutoff=0.9)\n# 0.0\n\njaro_similarity(\"Johnathan\", \"Jonathan\", score_cutoff=0.85)\n# 0.8796296296296297\n```\n\nJaroWinkler can be used with RapidFuzz, which provides multiple methods to compute string metrics on collections of inputs. JaroWinkler implements the RapidFuzz C-API, which allows RapidFuzz to call the functions without any of the usual Python overhead and makes this even faster.\n\n```python\nfrom jarowinkler import jarowinkler_similarity\nfrom rapidfuzz import process\n\nprocess.cdist([\"Johnathan\", \"Jonathan\"], [\"Johnathan\", \"Jonathan\"], scorer=jarowinkler_similarity)\narray([[1.       , 0.9037037],\n       [0.9037037, 1.       ]], dtype=float32)\n```\n\n## 👍 Contributing\n\nPRs are welcome!\n- Found a bug? Report it in the form of an [issue](https://github.com/rapidfuzz/JaroWinkler/issues) or, even better, fix it!\n- Can make something faster? Great! 
Just avoid external dependencies and remember that existing functionality should still work.\n- Something else that you think is good? Do it! Just make sure that CI passes and everything from the README is still applicable (interface, features, and so on).\n- Have no time to code? Tell your friends and subscribers about JaroWinkler. More users, more contributions, more amazing features.\n\nThank you :heart:\n\n## ⚠️ License\nCopyright 2021 - present [maxbachmann](https://github.com/maxbachmann). `JaroWinkler` is free and open-source software licensed under the [MIT License](https://github.com/rapidfuzz/JaroWinkler/blob/main/LICENSE).\n","funding_links":["https://github.com/sponsors/maxbachmann"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frapidfuzz%2Fjarowinkler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frapidfuzz%2Fjarowinkler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frapidfuzz%2Fjarowinkler/lists"}