{"id":22071766,"url":"https://github.com/mhutter/fuzzy_set","last_synced_at":"2025-03-23T19:17:58.219Z","repository":{"id":56847919,"uuid":"41917104","full_name":"mhutter/fuzzy_set","owner":"mhutter","description":"Fuzz-searchable set of strings","archived":false,"fork":false,"pushed_at":"2020-04-07T10:10:47.000Z","size":22,"stargazers_count":0,"open_issues_count":2,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-01T02:51:12.431Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mhutter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-09-04T13:24:44.000Z","updated_at":"2020-04-07T10:10:50.000Z","dependencies_parsed_at":"2022-09-09T07:50:19.530Z","dependency_job_id":null,"html_url":"https://github.com/mhutter/fuzzy_set","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhutter%2Ffuzzy_set","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhutter%2Ffuzzy_set/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhutter%2Ffuzzy_set/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mhutter%2Ffuzzy_set/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mhutter","download_url":"https://codeload.github.com/mhutter/fuzzy_set/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245153892,"owners_count":20569408,"
icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-30T20:33:57.739Z","updated_at":"2025-03-23T19:17:58.154Z","avatar_url":"https://github.com/mhutter.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FuzzySet\n\n[![Gem Version](https://badge.fury.io/rb/fuzzy_set.svg)](http://badge.fury.io/rb/fuzzy_set)\n[![Documentation](http://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://rubydoc.org/gems/fuzzy_set/frames)\n[![Build Status](https://travis-ci.org/mhutter/fuzzy_set.svg)](https://travis-ci.org/mhutter/fuzzy_set)\n\n\nFuzzySet represents a set which allows searching its entries by using [Approximate string matching](https://en.wikipedia.org/wiki/Approximate_string_matching).\n\nIt allows you to create a fuzzy-search!\n\n## How does it work?\n\nWhen `add`ing an element to the Set, it first gets indexed. This is, on a very basic level, cutting it up into ngrams and building an index with each ngram pointing to the element.\n\nIf you then query the set with `get`, the query itself is also sliced into ngrams. We then select all elements in the set which share at least one common ngram with the query. 
The results are then ordered by their [cosine string similarity](https://github.com/mhutter/string-similarity) to the query.\n\n**TODO**:\nSee [Issues labeled #feature](https://github.com/mhutter/fuzzy_set/labels/feature)\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'fuzzy_set'\n```\n\nAnd then execute:\n\n    $ bundle\n\nOr install it yourself as:\n\n    $ gem install fuzzy_set\n\n## Usage\n\n```ruby\nrequire 'fuzzy_set'\nstates = open('states.txt').read.split(/\\n/)\n\n# Create a new set and add some elements:\nfs = FuzzySet.new\nfs.add 'Some'\nfs.add 'Words'\nfs.add \"or\", \"even\", \"multiple\", \"words!\"\n\n# Or provide your elements when creating the set:\nfs = FuzzySet.new(states)\n\n# Use #exact_match to find exact matches (= the normalized query\n# matches a normalized element in the set):\nfs.exact_match('michigan!') # =\u003e \"Michigan\"\nfs.exact_match('mischigen') # =\u003e nil\n\n# Use #get to get all approximate matches:\nfs.get('mischigen')\n# =\u003e [\"Michigan\", \"Wisconsin\", \"Mississippi\", \"Minnesota\", \"Missouri\"]\n\n# With the default settings, #get will always first try to get an\n# exact match (see above), and return if there is one:\nfs.get('mississippi') # =\u003e [\"Mississippi\"]\n\n# set `all_matches` to true, to do a full query, even if there is\n# an exact match:\nfs = FuzzySet.new(states, all_matches: true)\nfs.get('mississippi') # =\u003e [\"Mississippi\", \"Missouri\", \"Michigan\", \"Minnesota\"]\n\n# You can configure more stuff (see below)\nfs = FuzzySet.new(states, all_matches: true, ngram_size_min: 1)\n```\n\n### Options\n\n- `:all_matches` - If `false` and there is an exact match for `#get`, return the match immediately. 
If `true`, run the ngram query anyway to get more possible matches.\n- `:ngram_size_max` - The maximum ngram size to use (if there is no match using the maximum ngram size, try again with a smaller ngram size).\n- `:ngram_size_min` - The minimum ngram size to use.\n\n\n## Development\n\nAfter checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.\n\nTo install this gem onto your local machine, run `bundle exec rake install`.\n\n## Contributing\n\nBug reports and pull requests are welcome on GitHub at https://github.com/mhutter/fuzzy_set.\n\n\n## License\n\nThe gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmhutter%2Ffuzzy_set","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmhutter%2Ffuzzy_set","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmhutter%2Ffuzzy_set/lists"}