{"id":28316735,"url":"https://github.com/elchemista/fuzler","last_synced_at":"2025-06-24T06:30:48.715Z","repository":{"id":290743201,"uuid":"975432066","full_name":"elchemista/fuzler","owner":"elchemista","description":"A tiny, Rust‑powered string‑similarity helper for Elixir.","archived":false,"fork":false,"pushed_at":"2025-05-10T21:26:22.000Z","size":16222,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-01T12:41:48.098Z","etag":null,"topics":["elixir","full-text-search","fuzzy-search","rust"],"latest_commit_sha":null,"homepage":"https://hex.pm/packages/fuzler","language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elchemista.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-30T09:55:14.000Z","updated_at":"2025-05-10T21:26:25.000Z","dependencies_parsed_at":"2025-04-30T11:35:52.602Z","dependency_job_id":null,"html_url":"https://github.com/elchemista/fuzler","commit_stats":null,"previous_names":["elchemista/fuzler"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/elchemista/fuzler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elchemista%2Ffuzler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elchemista%2Ffuzler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elchemista%2Ffuzler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elchemista%2Ffuzler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elchemista","download_url":"https://codeload.github.com/elchemista/fuzler/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elchemista%2Ffuzler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261620051,"owners_count":23185450,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["elixir","full-text-search","fuzzy-search","rust"],"created_at":"2025-05-25T03:08:01.540Z","updated_at":"2025-06-24T06:30:48.694Z","avatar_url":"https://github.com/elchemista.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fuzler\n\n_A tiny, Rust‑powered string‑similarity helper for Elixir._\n\n`Fuzler` gives you **one public function**:\n\n```elixir\nFuzler.similarity_score(query :: String.t(), target :: String.t()) :: float\n```\n\nIt returns a **normalised score in $0.0 – 1.0$** that tells you how closely\ntwo pieces of text match—robust to typos, word‑order swaps, case and basic\npunctuation.\n\nBehind the scenes it calls a compiled Rust NIF that mixes:\n\n- **Hamming distance** – for very short, nearly equal‑length strings.\n- **SIMD Levenshtein** – fast edit distance from the `triple_accel` crate.\n- **Token‑bag Jaccard** – ignores word order.\n- **Partial‑ratio window** – finds the best‑matching snippet when the target is much longer than the query.\n\nThe result is symmetric (`score(a,b) ≈ score(b,a)`), length‑normalised and remains meaningful from single words to multi‑sentence paragraphs.\n\n---\n\n## Installation\n\nAdd to your `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:fuzler, \"~\u003e 0.1.2\"}\n  ]\nend\n```\n\nYou need **Rust ≥ 1.70** installed; `rustler` will compile the NIF automatically.\n\n---\n\n## Quick examples\n\n```elixir\niex\u003e Fuzler.similarity_score(\"ciao\", \"ciao\")\n1.0\n\niex\u003e Fuzler.similarity_score(\"bella ciao\", \"ciao bella\")\n0.70       # same words, different order\n\niex\u003e long_text = \"bella ciao come va oggi spero che tu stia bene ...\"\niex\u003e Fuzler.similarity_score(\"ciao\", long_text)\n0.75       # query appears once inside a 40‑token paragraph\n\niex\u003e Fuzler.similarity_score(\"bonjour\", long_text)\n0.12       # word not present\n```\n\n---\n\n## When should I use it?\n\n| Use case                                    | Why it works well                                    |\n| ------------------------------------------- | ---------------------------------------------------- |\n| typo‑tolerant autocomplete / “did‑you‑mean” | Hamming + Levenshtein catch small edits fast         |\n| matching short queries inside long blobs    | windowed _partial ratio_ focuses on the best slice   |\n| order‑agnostic key comparison               | token‑bag Jaccard treats “ciao bella” = “bella ciao” |\n| quick relevance scoring in Elixir           | pure NIF call, no external service needed            |\n\n**Not** a full‑text search engine or a semantic synonym matcher—that’s what\nTantivy / Embeddings are for.\n\n---\n\n## API\n\n```elixir\n@doc \"Returns a similarity score ∈ [0.0, 1.0]\"\n@spec similarity_score(String.t(), String.t()) :: float\n```\n\nIf the NIF failed to load you’ll get:\n\n```elixir\n:erlang.nif_error(:nif_not_loaded)\n```\n\nso your code can decide to fall back or skip tests.\n\n---\n\n## How good is the score?\n\n| Query / Target                                      | Score ≈     |\n| --------------------------------------------------- | ----------- |\n| identical strings (any case / punctuation)          | 1.00        |\n| same words, swapped order                           | 0.68 – 0.72 |\n| one‑word query present once in 45‑token paragraph   | \\~0.75      |\n| one‑word query absent from paragraph                | ≤ 0.15      |\n| 80‑token paragraph vs same with 1 typo              | ≥ 0.90      |\n| “ciao bella” with +30 random filler tokens appended | \\~0.58      |\n\n---\n\n## Running the test suite\n\n`mix test` runs a handful of ExUnit cases covering:\n\n- case \u0026 punctuation variations\n- word‑order permutations\n- query present / absent in long paragraph (\u003e 40 tokens)\n- very long strings with tiny edits\n- monotonic drop as filler tokens grow\n\nAll similarity tests auto‑skip if the NIF isn’t loaded (e.g. on\nCI without Rust).\n\n---\n\n## License\n\nMIT [License](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felchemista%2Ffuzler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felchemista%2Ffuzler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felchemista%2Ffuzler/lists"}