{"id":13462950,"url":"https://github.com/brewster/elastictastic","last_synced_at":"2025-03-25T06:31:23.566Z","repository":{"id":146576959,"uuid":"2130453","full_name":"brewster/elastictastic","owner":"brewster","description":"Object-document mapper and lightweight API adapter for ElasticSearch","archived":false,"fork":false,"pushed_at":"2015-10-19T22:08:48.000Z","size":930,"stargazers_count":88,"open_issues_count":9,"forks_count":13,"subscribers_count":31,"default_branch":"master","last_synced_at":"2024-10-29T13:50:08.363Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brewster.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2011-07-30T22:42:02.000Z","updated_at":"2023-07-01T15:07:17.000Z","dependencies_parsed_at":"2023-04-10T13:37:31.788Z","dependency_job_id":null,"html_url":"https://github.com/brewster/elastictastic","commit_stats":null,"previous_names":[],"tags_count":44,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brewster%2Felastictastic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brewster%2Felastictastic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brewster%2Felastictastic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brewster%2Felastictastic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brewster","download_url":"https://codeload.github.com/brewster/elastictastic/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://gi
thub.com","kind":"github","repositories_count":245413749,"owners_count":20611353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T13:00:41.863Z","updated_at":"2025-03-25T06:31:23.106Z","avatar_url":"https://github.com/brewster.png","language":"Ruby","readme":"# Elastictastic #\n\nElastictastic is an object-document mapper and lightweight API adapter for\n[ElasticSearch](http://www.elasticsearch.org/). Elastictastic's primary use case\nis to define model classes which use ElasticSearch as a primary\ndocument-oriented data store, and to expose ElasticSearch's search functionality\nto query for those models.\n\n[![Build Status](https://secure.travis-ci.org/brewster/elastictastic.png)](http://travis-ci.org/brewster/elastictastic)\n\n## Dependencies ##\n\nElastictastic requires Ruby 1.9 and ActiveSupport 3. Elastictastic does not\nrequire Rails, but if you do run Rails, Elastictastic will only work with Rails\n3.\n\nYou will also need a running ElasticSearch instance (or cluster). For local\ndevelopment, you can easily [download](http://www.elasticsearch.org/download/)\nand\n[install](http://www.elasticsearch.org/guide/reference/setup/installation.html)\na copy, or your preferred package manager might have it available.\n\n## Installation ##\n\nJust add it to your Gemfile:\n\n```ruby\ngem 'elastictastic'\n```\n\n## Defining models ##\n\nElastictastic's setup DSL will be familiar to those who have used other\nRuby object-document mappers such as [Mongoid](http://mongoid.org/). 
Persisted\nmodels mix in the `Elastictastic::Document` module, and fields are defined with\nthe `field` class macro:\n\n```ruby\nclass Post\n  include Elastictastic::Document\n\n  field :title\nend\n```\n\nThe `field` method can take options; the options available here are simply those\nthat are available in a\n[field mapping](http://www.elasticsearch.org/guide/reference/mapping/core-types.html)\nin ElasticSearch. Elastictastic is (mostly) agnostic to the options you pass in;\nthey're just used to generate the mapping for ElasticSearch.\n\nBy default, ElasticSearch assigns fields a `string` type. An example of how one\nmight define a field with some options:\n\n```ruby\nclass Post\n  include Elastictastic::Document\n\n  field :comments_count, :type =\u003e :integer, :store =\u003e 'yes'\nend\n```\n\n### Multi-fields ###\n\nElasticSearch allows you to define\n[multi-fields](http://www.elasticsearch.org/guide/reference/mapping/multi-field-type.html),\nwhich index the same data in multiple ways. To define a multi-field in\nElastictastic, you may pass a block to the `field` macro, in which the alternate\nfields are defined using the same DSL:\n\n```ruby\nfield :title, :type =\u003e 'string', :index =\u003e 'analyzed' do\n  field :unanalyzed, :type =\u003e 'string', :index =\u003e 'not_analyzed'\nend\n```\n\nThe arguments passed to the outer `field` method are used for the default field\nmapping; thus, the above is the same as the following:\n\n```ruby\nfield :title,\n  :type =\u003e 'string',\n  :fields =\u003e {\n    :unanalyzed =\u003e { :type =\u003e 'string', :index =\u003e 'not_analyzed' }\n  }\n```\n\n### Document Boost ###\n\nDefining a\n[document boost](http://www.elasticsearch.org/guide/reference/mapping/boost-field.html)\nwill increase or decrease a document's score in search results based on the\nvalue of a field in the document. A boost of 1.0 is neutral. 
To define a boost\nfield, use the `boost` class macro:\n\n```ruby\nclass Post\n  include Elastictastic::Document\n\n  field :score, :type =\u003e 'integer'\n  boost :score\nend\n```\n\nBy default, if the boost field is empty, a score of 1.0 will be applied. You can\noverride this by passing a `'null_value'` option into the boost method.\n\n### Embedded objects ###\n\nElasticSearch supports deep nesting of properties by way of\n[object fields](http://www.elasticsearch.org/guide/reference/mapping/object-type.html).\nTo define embedded objects in your Elastictastic models, use the `embed` class\nmacro:\n\n```ruby\nclass Post\n  include Elastictastic::Document\n\n  embed :author\n  embed :recent_comments, :class_name =\u003e 'Comment' \nend\n```\n\nThe class that's embedded should include the `Elastictastic::NestedDocument` mixin,\nwhich exposes the same configuration DSL as `Elastictastic::Document` but does\nnot give the class the functionality of a top-level persistent object:\n\n```ruby\nclass Author\n  include Elastictastic::NestedDocument\n\n  field :name\n  field :email, :index =\u003e 'not_analyzed'\nend\n```\n\n### Parent-child relationships ###\n\nYou may define\n[parent-child relationships](http://www.elasticsearch.org/blog/2010/12/27/0.14.0-released.html)\n for your documents using the `has_many` and `belongs_to` macros:\n\n```ruby\nclass Blog\n  include Elastictastic::Document\n\n  has_many :posts\nend\n```\n\n```ruby\nclass Post\n  include Elastictastic::Document\n\n  belongs_to :blog\nend\n```\n\nUnlike in, say, ActiveRecord, an Elastictastic document can only specify one\nparent (`belongs_to`) relationship. A document can have as many children\n(`has_many`) as you would like.\n\nThe parent/child relationship has far-reaching consequences in ElasticSearch,\nand as such you will generally interact with child documents via the parent's\nassociation collection. 
For instance, this is the standard way to create a new\nchild instance:\n\n```ruby\npost = blog.posts.new\n```\n\nThe above will return a new Post object whose parent is the `blog`; the\n`blog.posts` collection will retain a reference to the transient `post`\ninstance, and will auto-save it when the `blog` is saved.\n\nYou may also create a child instance independently and then add it to a parent's\nchild collection; however, you must do so before saving the child instance, as\nElasticSearch requires types that define parents to have a parent. The following\ncode block has the same outcome as the previous one:\n\n```ruby\npost = Post.new\nblog.posts \u003c\u003c post\n```\n\nIn most other respects, the `blog.posts` collection behaves the same as a\nsearch scope (more on that below), except that enumeration methods (`#each`,\n`#map`, etc.) will return unsaved child instances along with instances\npersisted in ElasticSearch.\n\n### Syncing your mapping ###\n\nBefore you start creating documents with Elastictastic, you need to make\nElasticSearch aware of your document structure. To do this, use the\n`sync_mapping` method:\n\n```ruby\nPost.sync_mapping\n```\n\nIf you have a complex multi-index topology, you may want to consider using\n[ElasticSearch templates](http://www.elasticsearch.org/guide/reference/api/admin-indices-templates.html)\nto manage mappings and other index settings; Elastictastic doesn't provide any\nexplicit support for this at the moment, although you can use e.g.\n`Post.mapping` to retrieve the mapping structure which you can then merge into\nyour template.\n\n### Reserved attributes ###\n\nAll `Elastictastic::Document` models have an `id` and an `index` field, which\ncombine to define the full resource locator for the document in ElasticSearch.\nYou should not define fields or methods with these names. 
You may, however, set\nthe id explicitly on new (not yet saved) model instances.\n\n### ActiveModel ###\n\nElastictastic documents include all the usual ActiveModel functionality:\nvalidations, lifecycle hooks, observers, dirty-tracking, mass-assignment\nsecurity, and the like. If you would like to squeeze a bit of extra performance\nout of the library at the cost of convenience, you can include the\n`Elastictastic::BasicDocument` module instead of `Elastictastic::Document`.\n\n## Persistence ##\n\nElastictastic models are persisted the usual way, namely by calling `save`:\n\n```ruby\npost = Post.new\npost.title = 'You know, for search.'\npost.save\n```\n\nTo retrieve a document from the data store, use `find`:\n\n```ruby\nPost.find('123')\n```\n\nYou can look up multiple documents by ID:\n\n```ruby\nPost.find('123', '456')\n```\n\nYou can also pass an array of IDs; the following will return a one-element\narray:\n\n```ruby\nPost.find(['123'])\n```\n\nFor child documents, you **must** perform GET requests using the parent's\nassociation collection:\n\n```ruby\npost = blog.posts.new\npost.save\n\nblog.posts.find(post.id) # this will return the post\nPost.find(post.id)       # but this won't!\n```\n\n### Specifying the index ###\n\nElastictastic defines a default index for your documents. If you're using Rails,\nthe default index is your application's name suffixed with the current\nenvironment; outside of Rails, the default index is simply \"default\". 
You can\nchange this using the `default_index` configuration key.\n\nWhen you want to work with documents in an index other than the default, use\nthe `in_index` class method:\n\n```ruby\nnew_post = Post.in_index('my_special_index').new # create in an index\npost = Post.in_index('my_special_index').find('123') # retrieve from an index\n```\n\nTo retrieve documents from multiple indices at the same time, pass a hash into\n`find` where the keys are index names and the values are the IDs you wish to\nretrieve from that index:\n\n```ruby\nPost.find('default' =\u003e ['123', '456'], 'my_special_index' =\u003e '789')\n```\n\n### Bulk operations ###\n\nIf you are writing a large amount of data to ElasticSearch in a single process,\nuse of the\n[bulk API](http://www.elasticsearch.org/guide/reference/api/bulk.html)\nis encouraged. To perform bulk operations using Elastictastic, simply wrap your\noperations in a `bulk` block:\n\n```ruby\nElastictastic.bulk do\n  params[:posts].each do |post_params|\n    post = Post.new(post_params)\n    post.save\n  end\nend\n```\n\nAll create, update, and destroy operations inside the block will be executed in\na single bulk request when the block completes. If you are performing an\nindefinite number of operations in a bulk block, you can pass an `:auto_flush`\noption to flush the bulk buffer after the specified number of operations:\n\n```ruby\nElastictastic.bulk(:auto_flush =\u003e 100) do\n  150.times { Post.new.save! }\nend\n```\n\nThe above will perform two bulk requests: the first after the first 100\noperations, and the second when the block completes.\n\nYou can alternatively pass an `:auto_flush_bytes` option to flush the bulk buffer\nafter it reaches the specified number of bytes:\n\n```ruby\nElastictastic.bulk(:auto_flush_bytes =\u003e 48 * 100) do\n  150.times { Post.new.save! }\nend\n```\n\nAssuming, as in the specs in this project, that `Post.new.save!` 
sends a\n48-byte operation to ElasticSearch, this will cause two batches of requests:\none with 100 Posts, and one with 50.\n\nNote that the nature of bulk writes means that any operation inside a bulk block\nis essentially asynchronous: instances are not created, updated, or destroyed\nimmediately upon calling `save` or `destroy`, but rather when the bulk block\nexits. You may pass a block to `save` and `destroy` to provide a callback for\nwhen the instance is actually persisted and its local state updated. Let's say,\nfor instance, we wish to expand the example above to pass the IDs of the newly\ncreated posts to our view layer:\n\n```ruby\n@ids = []\nElastictastic.bulk do\n  params[:posts].each do |post_params|\n    post = Post.new(post_params)\n    post.save do |e|\n      @ids \u003c\u003c post.id\n    end\n  end\nend\n```\n\nIf the save was not successful (due to a duplicate ID or a version mismatch,\nfor instance), the block's `e` argument will be an exception object;\nif the save was successful, the argument will be nil.\n\n### Concurrent document creation ###\n\nWhen Elastictastic creates a document with an application-defined ID, it uses\nthe `_create` verb in ElasticSearch, ensuring that a document with that ID does\nnot already exist. If the document does already exist, an\n`Elastictastic::ServerError::DocumentAlreadyExistsEngineException` will be\nraised. In the case where multiple processes may attempt concurrent creation of\nthe same document, you can gracefully handle concurrent creation using the\n`::create_or_update` class method on your document class. This will first\nattempt to create the document; if a document with that ID already exists, it\nwill then load the document and modify it using the block passed:\n\n```ruby\nPost.create_or_update('1') do |post|\n  post.title = 'My Post'\nend\n```\n\nIn the above case, Elastictastic will first attempt to create a new post with ID\n\"1\" and title \"My Post\". 
If a Post with that ID already exists, it will load it,\nset its title to \"My Post\", and save it. The update uses the `::update` method\n(see next section) to ensure that concurrent modification doesn't cause data to\nbe lost.\n\n### Optimistic locking ###\n\nElastictastic provides optimistic locking via ElasticSearch's built-in\n[document versioning](http://www.elasticsearch.org/guide/reference/api/index_.html).\nWhen a document is retrieved from persistence, it carries a version, which is a\nnumber that increments from 1 on each update. When Elastictastic models are\nupdated, the document version that it carried when it was loaded is passed into\nthe update operation; if this version does not match ElasticSearch's current\nversion for that document, it indicates that another process has modified the\ndocument concurrently, and an\n`Elastictastic::ServerError::VersionConflictEngineException` is raised. This\nprevents data loss through concurrent conflicting updates.\n\nThe easiest way to guard against concurrent modification is to use the\n`::update` class method to make changes to existing documents. Consider the\nfollowing example:\n\n```ruby\nPost.update('12345') do |post|\n  post.title = 'New Title'\nend\n```\n\nIn the above, the Post with ID '12345' is loaded from ElasticSearch and yielded\nto the block. When the block completes, the instance is saved back to\nElasticSearch. If this save results in a version conflict, a new instance is\nloaded from ElasticSearch and the block is run again. 
The process repeats until\na successful update.\n\nThis method will work inside a bulk operation, but note that if the first update\ngenerates a version conflict, additional updates will occur in discrete\nrequests, not as part of any bulk operation.\n\nIf you wish to safely update documents retrieved from a search scope\n(see below), use the `update_each` method:\n\n```ruby\nPost.query { constant_score { filter { term(:blog_id =\u003e 1) }}}.update_each do |post|\n  post.title = post.title.upcase\nend\n```\n\n## Search ##\n\nElasticSearch is, above all, a search tool. Accordingly, aside from direct\nlookup by ID, all retrieval of documents is done via the\n[search API](http://www.elasticsearch.org/guide/reference/api/search/).\nElastictastic models have class methods corresponding to the top-level keys\nin the ElasticSearch search API; you may chain these much as in ActiveRecord\nor Mongoid:\n\n```ruby\nPost.query(:query_string =\u003e { :query =\u003e 'pizza' }).facets(:cuisine =\u003e { :term =\u003e { :field =\u003e :tags }}).from(10).size(10)\n# Generates {\"query\": {\"query_string\": {\"query\": \"pizza\"}}, \"facets\": {\"cuisine\": {\"term\": {\"field\": \"tags\" }}}, \"from\": 10, \"size\": 10}\n```\n\nElastictastic also has an alternate block-based query builder, if you prefer:\n\n```ruby\nPost.query do\n  query_string { query('pizza') }\nend.facets { cuisine { term { field :tags }}}.from(10).size(10)\n# Same effect as the previous example\n```\n\nThe scopes that are generated by the preceding calls act as collections of\nmatching documents; thus all the usual Enumerable methods are available:\n\n```ruby\nPost.query(:query_string =\u003e { :query =\u003e 'pizza' }).each do |post|\n  puts post.title\nend\n```\n\nYou may access other components of the response using hash-style access; this\nwill return a `Hashie::Mash` which allows hash-style or object-style access:\n\n```ruby\nPost.facets(:cuisine =\u003e { :term =\u003e { :field =\u003e :tags 
}})['facets'].each_pair do |name, facet|\n  facet.terms.each { |term| puts \"#{term.term}: #{term.count}\" }\nend\n```\n\nYou can also call `count` on a scope; this will give the total number of\ndocuments matching the query.\n\nIn some situations, you may wish to access metadata about search results beyond\nsimply the result document. To do this, use the `#find_each` method, which\nyields a `Hashie::Mash` containing the raw ElasticSearch hit object in the\nsecond argument:\n\n```ruby\nPost.highlight { fields(:title =\u003e {}) }.find_each do |post, hit|\n  puts \"Post #{post.id} matched the query string in the title field: #{hit.highlight['title']}\"\nend\n```\n\nSearch scopes also expose a `#find_in_batches` method, which also yields the raw\nhit. The following code gives the same result as the previous example:\n\n```ruby\nPost.highlight { fields(:title =\u003e {}) }.find_in_batches do |batch|\n  batch.each do |post, hit|\n    puts \"Post #{post.id} matched the query string in the title field: #{hit.highlight['title']}\"\n  end\nend\n```\n\nBoth `find_each` and `find_in_batches` accept a `:batch_size` option.\n\n## Support \u0026 Bugs ##\n\nIf you find a bug, feel free to\n[open an issue](https://github.com/brewster/elastictastic/issues/new) on GitHub.\nPull requests are most welcome.\n\nFor questions or feedback, hit up our mailing list at\n[elastictastic@groups.google.com](http://groups.google.com/group/elastictastic)\nor find outoftime on the #elasticsearch IRC channel on Freenode.\n\n## License ##\n\nElastictastic is distributed under the MIT license. 
See the attached LICENSE\nfile for all the sordid details.\n","funding_links":[],"categories":["Active Record Plugins","Ruby"],"sub_categories":["Rails Search"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrewster%2Felastictastic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrewster%2Felastictastic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrewster%2Felastictastic/lists"}