https://github.com/simonw/datasette-jellyfish
Datasette plugin adding SQL functions for fuzzy text matching powered by Jellyfish
- Host: GitHub
- URL: https://github.com/simonw/datasette-jellyfish
- Owner: simonw
- License: apache-2.0
- Created: 2019-03-09T16:02:01.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2023-08-24T21:45:20.000Z (over 1 year ago)
- Last Synced: 2024-05-01T23:17:25.540Z (8 months ago)
- Topics: datasette, datasette-io, datasette-plugin
- Language: Python
- Homepage: https://datasette.io/plugins/datasette-jellyfish
- Size: 15.6 KB
- Stars: 12
- Watchers: 3
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# datasette-jellyfish
[![PyPI](https://img.shields.io/pypi/v/datasette-jellyfish.svg)](https://pypi.org/project/datasette-jellyfish/)
[![Changelog](https://img.shields.io/github/v/release/simonw/datasette-jellyfish?include_prereleases&label=changelog)](https://github.com/simonw/datasette-jellyfish/releases)
[![Tests](https://github.com/simonw/datasette-jellyfish/workflows/Test/badge.svg)](https://github.com/simonw/datasette-jellyfish/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-jellyfish/blob/main/LICENSE)

Datasette plugin that adds custom SQL functions for fuzzy string matching, built on top of the [Jellyfish](https://github.com/jamesturk/jellyfish) Python library by James Turk and Michael Stephens.
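
As a rough illustration of what "custom SQL functions" means here, the sketch below registers a Python function on a SQLite connection so it can be called from SQL. It uses only the standard library: the `hamming_distance` defined here is a hypothetical stand-in written for this example, not the Jellyfish implementation, and the registration shown is an assumption about the general technique rather than this plugin's actual code.

```python
import sqlite3
from itertools import zip_longest


def hamming_distance(s1: str, s2: str) -> int:
    """Stand-in for illustration: count positions where the strings differ.

    Extra characters in the longer string each count as one difference.
    """
    return sum(a != b for a, b in zip_longest(s1, s2))


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name hamming_distance, taking 2 arguments.
conn.create_function("hamming_distance", 2, hamming_distance)

row = conn.execute(
    "SELECT hamming_distance(?, ?)", ("hello", "hello world")
).fetchone()
print(row[0])  # 6
```

Passing the strings as bound parameters (`?`) avoids quoting problems when the values come from user input.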
Interactive demos:
* [soundex, metaphone, nysiis, match_rating_codex comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++soundex%28%3As%29%2C+%0D%0A++++metaphone%28%3As%29%2C+%0D%0A++++nysiis%28%3As%29%2C+%0D%0A++++match_rating_codex%28%3As%29&s=demo).
* [distance functions comparison](https://latest-with-plugins.datasette.io/fixtures?sql=SELECT%0D%0A++++levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++damerau_levenshtein_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++hamming_distance%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++jaro_winkler_similarity%28%3As1%2C+%3As2%29%2C%0D%0A++++match_rating_comparison%28%3As1%2C+%3As2%29%3B&s1=barrack+obama&s2=barrack+h+obama)

Examples:

    SELECT soundex('hello');
    -- Outputs H400
    SELECT metaphone('hello');
    -- Outputs HL
    SELECT nysiis('hello');
    -- Outputs HAL
    SELECT match_rating_codex('hello');
    -- Outputs HLL
    SELECT levenshtein_distance('hello', 'hello world');
    -- Outputs 6
    SELECT damerau_levenshtein_distance('hello', 'hello world');
    -- Outputs 6
    SELECT hamming_distance('hello', 'hello world');
    -- Outputs 6
    SELECT jaro_similarity('hello', 'hello world');
    -- Outputs 0.8181818181818182
    SELECT jaro_winkler_similarity('hello', 'hello world');
    -- Outputs 0.890909090909091
    SELECT match_rating_comparison('hello', 'helloo');
    -- Outputs 1

See [the Jellyfish documentation](https://jellyfish.readthedocs.io/en/latest/) for an explanation of each of these functions.
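
To make two of the numbers above less opaque, here is a minimal pure-Python sketch of how the Levenshtein distance and the Jaro-Winkler score for `hello` and `hello world` can be derived by hand. Both helpers are hypothetical and written for this example under the textbook definitions (prefix capped at 4 characters, scaling factor 0.1); they are not the Jellyfish code, and the library may differ in edge cases.

```python
import math


def levenshtein_distance(s1: str, s2: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def jaro_winkler_from_jaro(jaro: float, common_prefix: int, scaling: float = 0.1) -> float:
    """Boost a Jaro score by the shared prefix length (capped at 4 characters)."""
    prefix = min(common_prefix, 4)
    return jaro + prefix * scaling * (1 - jaro)


# "hello world" is "hello" plus six appended characters, so six insertions.
print(levenshtein_distance("hello", "hello world"))  # 6

# Jaro: 5 matching characters, 0 transpositions, lengths 5 and 11.
jaro = (5 / 5 + 5 / 11 + (5 - 0) / 5) / 3
jaro_winkler = jaro_winkler_from_jaro(jaro, common_prefix=5)
print(math.isclose(jaro, 0.8181818181818182))         # True
print(math.isclose(jaro_winkler, 0.890909090909091))  # True
```

The comparisons use `math.isclose` because the last digit of a floating-point result can vary with the order of operations.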