https://github.com/jamesturk/jellyfish
🪼 a python library for doing approximate and phonetic matching of strings.
fuzzy-search hacktoberfest hamming jaro-winkler levenshtein metaphone python soundex
- Host: GitHub
- URL: https://github.com/jamesturk/jellyfish
- Owner: jamesturk
- License: mit
- Created: 2010-07-09T20:41:11.000Z (almost 15 years ago)
- Default Branch: main
- Last Pushed: 2025-04-07T04:20:42.000Z (17 days ago)
- Last Synced: 2025-04-13T11:39:13.597Z (10 days ago)
- Topics: fuzzy-search, hacktoberfest, hamming, jaro-winkler, levenshtein, metaphone, python, soundex
- Language: Jupyter Notebook
- Homepage: https://jamesturk.github.io/jellyfish/
- Size: 3.5 MB
- Stars: 2,119
- Watchers: 41
- Forks: 160
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
- awesome-python-machine-learning-resources - GitHub - 10% open · ⏱️ 07.01.2022 (Text Data and NLP)
- awesome-list - jellyfish - A library for approximate & phonetic matching of strings. (Data Processing / Data Similarity)
- starred-awesome - jellyfish - 🎐 a python library for doing approximate and phonetic matching of strings. (Python)
README
# Overview
**jellyfish** is a library for approximate & phonetic matching of strings.
Source: [https://github.com/jamesturk/jellyfish](https://github.com/jamesturk/jellyfish)
Documentation: [https://jamesturk.github.io/jellyfish/](https://jamesturk.github.io/jellyfish/)
Issues: [https://github.com/jamesturk/jellyfish/issues](https://github.com/jamesturk/jellyfish/issues)
## Included Algorithms
String comparison:
* Levenshtein Distance
* Damerau-Levenshtein Distance
* Jaccard Index
* Jaro Distance
* Jaro-Winkler Distance
* Match Rating Approach Comparison
* Hamming Distance

Phonetic encoding:
* American Soundex
* Metaphone
* NYSIIS (New York State Identification and Intelligence System)
* Match Rating Codex

## Example Usage
``` python
>>> import jellyfish
>>> jellyfish.levenshtein_distance('jellyfish', 'smellyfish')
2
>>> jellyfish.jaro_similarity('jellyfish', 'smellyfish')
0.89629629629629637
>>> jellyfish.damerau_levenshtein_distance('jellyfish', 'jellyfihs')
1

>>> jellyfish.metaphone('Jellyfish')
'JLFX'
>>> jellyfish.soundex('Jellyfish')
'J412'
>>> jellyfish.nysiis('Jellyfish')
'JALYF'
>>> jellyfish.match_rating_codex('Jellyfish')
'JLLFSH'
```
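
To show what the Levenshtein numbers above measure, here is a minimal pure-Python sketch of the classic dynamic-programming edit distance. It is an illustration only, not jellyfish's own implementation, and the function name `levenshtein` is hypothetical. It counts single-character insertions, deletions, and substitutions, so a swapped pair of adjacent characters costs 2 here, whereas Damerau-Levenshtein counts that transposition as 1.

``` python
def levenshtein(a: str, b: str) -> int:
    """Illustrative edit distance: insertions, deletions, substitutions."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("jellyfish", "smellyfish"))  # 2
print(levenshtein("jellyfish", "jellyfihs"))   # 2 (the swap counts as two edits)
```

In practice, prefer the library functions shown in the example above; this sketch only keeps two rows of the table in memory and makes no attempt at speed.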