{"id":26197306,"url":"https://github.com/mongoid/mongoid_fulltext","last_synced_at":"2025-04-04T06:09:18.158Z","repository":{"id":2862324,"uuid":"3867297","full_name":"mongoid/mongoid_fulltext","owner":"mongoid","description":"An n-gram-based full-text search implementation for the Mongoid ODM.","archived":false,"fork":false,"pushed_at":"2023-09-28T13:38:59.000Z","size":275,"stargazers_count":150,"open_issues_count":15,"forks_count":64,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-28T05:11:47.696Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"wizardforcel/dev-tutorial","license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mongoid.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2012-03-29T15:17:05.000Z","updated_at":"2024-03-13T08:54:24.000Z","dependencies_parsed_at":"2024-06-18T21:21:51.724Z","dependency_job_id":"4c97f523-ec45-4934-acf9-bd51a23ae981","html_url":"https://github.com/mongoid/mongoid_fulltext","commit_stats":{"total_commits":197,"total_committers":19,"mean_commits":"10.368421052631579","dds":0.5685279187817258,"last_synced_commit":"5dc0b14e0ca1d7d9989bb2f139f8f3d42cec8f39"},"previous_names":[],"tags_count":36,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mongoid%2Fmongoid_fulltext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mongoid%2Fmongoid_fulltext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mongoid%2Fmongoid_fulltext
/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mongoid%2Fmongoid_fulltext/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mongoid","download_url":"https://codeload.github.com/mongoid/mongoid_fulltext/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247128752,"owners_count":20888235,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-12T02:25:41.153Z","updated_at":"2025-04-04T06:09:18.128Z","avatar_url":"https://github.com/mongoid.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"Mongoid Fulltext Search\n=======================\n\n[![Gem Version](https://badge.fury.io/rb/mongoid_fulltext.svg)](http://badge.fury.io/rb/mongoid_fulltext)\n[![Build Status](https://secure.travis-ci.org/mongoid/mongoid_fulltext.svg)](http://travis-ci.org/mongoid/mongoid_fulltext)\n[![Code Climate](https://codeclimate.com/github/mongoid/mongoid_fulltext/badges/gpa.svg)](https://codeclimate.com/github/mongoid/mongoid_fulltext)\n\nFull-text search using n-gram matching for the Mongoid ODM.\n\nMongoDB introduced full-text search capabilities in v2.4, so this gem is a good fit for cases where you want something a little less than a full-blown indexing service. 
The mongoid_fulltext gem\nlets you do a fuzzy string search across relatively short strings, which makes it good for populating autocomplete boxes based on the display names of your Rails models but not appropriate for, say, indexing hundreds of thousands of HTML documents.\n\nInstall\n-------\n\nVersion 0.6.1 or newer of this gem requires Ruby 1.9.3 or newer and works with Mongoid 3, 4, 5, 6 and 7.\nUse version 0.5.x for Mongoid 2.4.x and Ruby 1.8.7, 1.9.2 or 1.9.3.\nFor Ruby 1.8.7 and/or Mongoid 2.x use [mongoid_fulltext 0.5.x](https://github.com/mongoid/mongoid_fulltext/tree/0.5-stable).\n\n``` ruby\ngem 'mongoid_fulltext'\n```\n\nExamples\n--------\n\nSuppose you have an `Artist` model and want to index each artist's name:\n\n``` ruby\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :first_name\n  field :last_name\n\n  def name\n    [first_name, last_name].join(' ')\n  end\n\n  fulltext_search_in :name\nend\n```\n\nThe `fulltext_search_in` directive will index the full name of the artist, so now\nyou can call:\n\n``` ruby\nArtist.fulltext_search(\"vince vangogh\")\n```\n\nwhich will return an array of the Artist instances that best match the search string. Most likely,\nVincent van Gogh will be included in the results. 
You can index multiple fields with the same\nindex, so we can get the same effect of our Artist index above using:\n\n``` ruby\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :first_name\n  field :last_name\n\n  fulltext_search_in :first_name, :last_name\nend\n```\n\nTo restrict the number of results returned, pass the `:max_results` parameter to `fulltext_search`:\n\n``` ruby\nArtist.fulltext_search(\"vince vangogh\", { :max_results =\u003e 5 })\n```\n\nTo return a pair of `[ result, score ]` instead of an array of results, pass the `:return_scores` parameter to `fulltext_search`:\n\n``` ruby\nArtist.fulltext_search(\"vince vangogh\", { :return_scores =\u003e true })\n```\n\nThe larger a score is, the better mongoid_fulltext thinks the match is. The scores have the following rough\ninterpretation that you can use to make decisions about whether the match is good enough:\n\n* If a prefix of your query matches something indexed, or if your query matches a prefix of something\n  indexed (for example, searching for \"foo\" finds \"myfoo\" or searching for \"myfoo\" finds \"foo\"), you\n  can expect a score of at least 1 for the match.\n* If an entire word in your query matches an entire word that's indexed and you have the `index_full_words`\n  option turned on (it's turned on by default), you can expect a score of at least 2 for the match.\n* If neither of the above criteria are met, you can expect a score less than one.\n\nIf you don't specify a field to index, the default is the result of `to_s` called on the object.\nThe following definition will index the first and last name of an artist:\n\n``` ruby\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :first_name\n  field :last_name\n\n  def to_s\n    '%s %s' % [first_name, last_name]\n  end\n\n  fulltext_search_in\nend\n```\n\nThe full-text index is stored in a separate MongoDB collection in the same database as the\nmodels you're 
indexing. By default, the name of this collection is generated for you. Above,\na collection named something like `mongoid_fulltext.index_artist_0` will be created to\nhold the index data. You can override this naming and provide your own collection name with\nthe `:index_name` parameter:\n\n``` ruby\nclass Artwork\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :title\n  fulltext_search_in :title, :index_name =\u003e 'mongoid_fulltext.foobar'\nend\n```\n\nYou can also create multiple indexes on a single model, in which case you'll want to\nprovide index names:\n\n``` ruby\nclass Artwork\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :title\n  field :artist_name\n  field :gallery_name\n  field :gallery_address\n\n  fulltext_search_in :title, :index_name =\u003e 'title_index'\n  fulltext_search_in :artist_name, :index_name =\u003e 'artist_name_index'\n  fulltext_search_in :gallery_name, :gallery_address, :index_name =\u003e 'gallery_index'\nend\n```\n\nThe index names are helpful now because you'll have to specify which one you want to use when you\ncall `fulltext_search`:\n\n``` ruby\nArtwork.fulltext_search('warhol', :index =\u003e 'artist_name_index')\n```\n\nIf you have multiple indexes specified and you don't supply a name to `fulltext_search`, the\nmethod call will raise an exception.\n\nIf you're indexing multiple models, you may find that you need to combine results to create\na single result set. For example, if both the `Artist` model and the `Artwork` model are\nindexed for full-text search, then to get results from both, you'd have to call\n`Artist.fulltext_search` and `Artwork.fulltext_search` and combine the results yourself. 
If\nyour intention is instead to get the top k results from both Artists and Artworks, you can\nmerge both into a single index by using the same `:index_name` parameter:\n\n``` ruby\nclass Artwork\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :title\n  fulltext_search_in :title, :index_name =\u003e 'artwork_and_artists'\nend\n\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :name\n  fulltext_search_in :name, :index_name =\u003e 'artwork_and_artists'\nend\n```\n\nNow that these two models share the same external index collection, we can search them both through\neither model's `fulltext_search` method:\n\n``` ruby\nArtwork.fulltext_search('picasso')  # returns same results as Artist.fulltext_search('picasso')\n```\n\nIf you want to filter the results from full-text search, you set up filters when the indexes are\ndefined. For example, suppose that in addition to wanting to use the `artwork_and_artists` index\ndefined above to search for `Artwork`s or `Artist`s, we want to be able to run full-text searches\nfor artists only and for artworks priced above $10,000. 
Instead of creating two new indexes or\nattempting to filter the results after the query is run, we can specify the filter predicates\nat the time of index definition:\n\n``` ruby\nclass Artwork\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :title\n  field :price\n  fulltext_search_in :title, :index_name =\u003e 'artwork_and_artists',\n                     :filters =\u003e { :is_expensive =\u003e lambda { |x| x.price \u003e 10000 },\n                                   :has_long_name =\u003e lambda { |x| x.title.length \u003e 20 }}\nend\n\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :name\n  field :birth_year\n  fulltext_search_in :name, :index_name =\u003e 'artwork_and_artists',\n                     :filters =\u003e { :born_before_1900 =\u003e lambda { |x| x.birth_year \u003c 1900 },\n                                   :has_long_name =\u003e lambda { |x| x.name.length \u003e 20}}\nend\n```\n\nAfter defining filters, you can query for results that match particular values of filters:\n\n``` ruby\n# Only return artists born before 1900 that match 'foobar'\nArtist.fulltext_search('foobar', :born_before_1900 =\u003e true)\n\n# Return artists or artworks that match 'foobar' and have short names\nArtist.fulltext_search('foobar', :has_long_name =\u003e false)\n\n# Only return artworks with prices over 10000 that match 'mona lisa'\nArtwork.fulltext_search('mona lisa', :is_expensive =\u003e true)\n\n# Only return artworks with prices less than 10000 that match 'mona lisa'\nArtwork.fulltext_search('mona lisa', :is_expensive =\u003e false)\n```\n\nNote that in all of the example queries above, supplying a filter that is defined on exactly\none of the models will restrict the search to results from that model only. 
For example,\nsince `:is_expensive` is defined only on `Artwork`s, a call to `fulltext_search` with either\n`:is_expensive =\u003e true` or `:is_expensive =\u003e false` will return only `Artwork` results.\n\nYou can specify multiple filters per index and per model. Each filter is a predicate that will\nbe called on objects as they're inserted into the full-text index (any time the model is saved.)\nFilters are only called on instances of models they're defined on, so in the example above, the\n`is_expensive` filter is only applied to instances of `Artwork` and the `born_before_1900` filter\nis only applied to instances of `Artist`, although both filters can be used when querying from\neither model. The `has_long_name` filter, on the other hand, will return instances of both\n`Artwork` and `Artist` since it's defined on each model.\n\nFilters shouldn't ever throw, but if they do, the filter is just ignored. If you apply filters to\nindexes that are on multiple fields, the filter is applied to each field and the filter result is\nthe AND of all of the individual results for each of the fields. Finally, if a filter is defined\nbut criteria for that filter aren't passed to `fulltext_search`, the result is as if the filter\nhad never been defined - you see both models that both pass and fail the filter in the results.\n\nIndexing Options\n----------------\n\nAdditional indexing/query options can be used as parameters to `fulltext_search_in`.\n\n* `alphabet`: letters to index, default is `abcdefghijklmnopqrstuvwxyz0123456789 `.\n* `word_separators`: word separators, default is the space character.\n* `ngram_width`: ngram width, default is `3`.\n* `index_full_words`: index full words, which improves exact matches, default is `true`.\n* `index_short_prefixes`: index a prefix of each full word of length `(ngram_width-1)`. Useful if\n  you use a larger ngram_width than the default of 3. 
Default for this option is `false`.\n* `stop_words`: a hash of words to avoid indexing as full words. Used only if `index_full_words`\n  is set to `true`. Defaults to a hash containing a list of common English stop words.\n* `apply_prefix_scoring_to_all_words`: score n-grams at beginning of words higher, default is `true`.\n* `max_ngrams_to_search`: maximum number of ngrams to query at any given time, default is `6`.\n* `max_candidate_set_size`: maximum number of candidate ngrams to examine for a given query.\n  Defaults to 1000. If you're seeing poor results, you can try increasing this value to consider\n  more ngrams per query (changing this parameter does not require a re-index.) The amount of time\n  a search takes is directly proportional to this parameter's value.\n* `remove_accents`: remove accents on accented characters, default is `true`.\n  We strip the accents using [NFKD normalization](http://unicode-utils.rubyforge.org/UnicodeUtils.html#method-c-compatibility_decomposition)\n  using an external library, `unicode_utils`.\n* `update_if`: controls whether or not the index will be updated. This can be set to a symbol,\n  string, or proc. If the result of evaluating the value is true, the index will be updated.\n    * When set to a symbol, the symbol is sent to the document.\n    * When set to a string, the string is evaluated within the document's instance.\n    * When set to a proc, the proc is called, and the document is given to the proc as the first arg.\n    * When set to any other type of object, the document's index will not be updated.\n* `reindex_immediately`: whether models will be reindexed automatically upon saves or updates. Defaults to true. 
When set to false, the class-level `update_ngram_index` method can be called to perform reindexing.\n\nIf you work with Cyrillic texts, use this option: `:alphabet =\u003e 'abcdefghijklmnopqrstuvwxyz0123456789абвгдежзиклмнопрстуфхцчшщъыьэюя'`.\n\nArray filters\n-------------\n\nA filter may also return an Array. Consider the following example.\n\n``` ruby\nclass Artist\n  include Mongoid::Document\n  include Mongoid::FullTextSearch\n\n  field :name\n  field :exhibitions, type: Array, default: []\n\n  fulltext_search_in :name, :index_name =\u003e 'exhibited_artist',\n    :filters =\u003e {\n      :exhibitions =\u003e lambda { |artist| artist.exhibitions }\n    }\nend\n```\n\nYou can now find all artists that are at the Art Basel exhibition or all artists that have exhibited\nat both the Art Basel and the New York Armory exhibition.\n\n``` ruby\n# All artists\nArtist.fulltext_search('foobar')\n\n# Artists at the Art Basel exhibition only\nArtist.fulltext_search('foobar', :exhibitions =\u003e [ \"Art Basel\" ])\n\n# Artists at both the Art Basel and the New York Armory exhibition\nArtist.fulltext_search('foobar', :exhibitions =\u003e [ \"Art Basel\", \"New York Armory\" ])\n\n# Note that the following explicit syntax may be used to achieve the\n# same result as above\nArtist.fulltext_search('foobar', :exhibitions =\u003e {:all =\u003e [ \"Art Basel\", \"New York Armory\" ]})\n```\n\nIf you want to find all artists that are at either the Art Basel or the\nNew York Armory exhibition, then you may specify the `:any` operator in\nthe filter.\n\n``` ruby\n# Artists at either the Art Basel or the New York Armory exhibition\nArtist.fulltext_search('foobar', :exhibitions =\u003e {:any =\u003e [ \"Art Basel\", \"New York Armory\" ]})\n```\n\nNote that `:all` and `:any` are currently the only supported operators\nfor the array filters.\n\nBuilding the index\n------------------\n\nThe fulltext index is built and maintained incrementally by hooking into `before_save` 
and\n`before_destroy` callbacks on each model that's being indexed. If you want to build an index\non existing models, you can call the `update_ngram_index` method on the class or each instance:\n\n``` ruby\nArtwork.update_ngram_index\nArtwork.find(id).update_ngram_index\n```\n\nIf you need to update the index for a large number of documents, using the option below will\nprevent your query from timing out:\n\n```ruby\nArtwork.update_ngram_index(timeout: false)\n```\n\nYou can remove all or individual instances from the index with the `remove_from_ngram_index`\nmethod:\n\n``` ruby\nArtwork.remove_from_ngram_index\nArtwork.find(id).remove_from_ngram_index\n```\n\nThe methods on the model level perform bulk removal operations and are therefore faster than\nupdating or removing records individually.\n\nIf you need to control when the index is updated, provide the `update_if` option to\n`fulltext_search_in`, and set it to a symbol, string, or proc. For example:\n\n``` ruby\n# Only update the \"age\" index if the \"age\" field has changed.\nfulltext_search_in :age,    :update_if =\u003e :age_changed?\n\n# Only update the \"names\" index if the \"firstname\" or \"lastname\" field has changed.\nfulltext_search_in :names,  :update_if =\u003e \"firstname_changed? || lastname_changed?\"\n\n# Only update the \"email\" index if the \"email\" field ends with \"gmail.com\".\nfulltext_search_in :email,  :update_if =\u003e Proc.new { |doc| doc.email.match(/gmail\\.com\\Z/) }\n```\n\nMongo Database Indexes\n----------------------\n\nMongoid provides an indexing mechanism on its models triggered by the `create_indexes` method.\nMongoid_fulltext will hook into that behavior and create appropriate database indexes on its\ncollections. 
These indexes are required for an efficient full text search.\n\nCreating database indexes is typically done with the `db:mongoid:create_indexes` task.\n\n``` bash\nrake db:mongoid:create_indexes\n```\n\nRunning the specs\n-----------------\n\nTo run the specs, execute `rake spec`. You need a local MongoDB instance to run the specs.\n\nContributing\n------------\n\nFork the project. Make your feature addition or bug fix with tests. Send a pull request. Bonus points for topic branches.\n\nCopyright and License\n---------------------\n\nMIT License, see [LICENSE](LICENSE) for details.\n\n(c) 2011-2017 [Artsy Inc.](http://artsy.github.io)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmongoid%2Fmongoid_fulltext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmongoid%2Fmongoid_fulltext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmongoid%2Fmongoid_fulltext/lists"}