{"id":13502003,"url":"https://github.com/seek-ai/esengine","last_synced_at":"2025-03-29T10:32:37.803Z","repository":{"id":57426930,"uuid":"45132343","full_name":"seek-ai/esengine","owner":"seek-ai","description":"ElasticSearch ODM (Object Document Mapper) for Python - pip install esengine","archived":false,"fork":false,"pushed_at":"2020-05-25T02:39:10.000Z","size":486,"stargazers_count":109,"open_issues_count":16,"forks_count":17,"subscribers_count":34,"default_branch":"master","last_synced_at":"2024-05-21T10:07:05.039Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://seek-ai.github.io/esengine/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/seek-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-10-28T18:09:21.000Z","updated_at":"2024-01-03T14:12:56.000Z","dependencies_parsed_at":"2022-09-19T06:41:08.839Z","dependency_job_id":null,"html_url":"https://github.com/seek-ai/esengine","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seek-ai%2Fesengine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seek-ai%2Fesengine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seek-ai%2Fesengine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/seek-ai%2Fesengine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/seek-ai","download_url":"https://codeload.github.com/seek-ai/esengine/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.c
om","kind":"github","repositories_count":246174179,"owners_count":20735406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T22:01:58.107Z","updated_at":"2025-03-29T10:32:37.244Z","avatar_url":"https://github.com/seek-ai.png","language":"Python","funding_links":[],"categories":["Search","Python","Awesome Python"],"sub_categories":["Search"],"readme":"\n\u003cimg src=\"https://raw.githubusercontent.com/catholabs/esengine/master/octosearch.gif\" align=\"left\" width=\"192px\" height=\"132px\"/\u003e\n\u003cimg align=\"left\" width=\"0\" height=\"192px\" hspace=\"10\"/\u003e\n\n\u003e **esengine** - The **E**lastic**s**earch **O**bject **D**ocument **M**apper\n\n[![PyPI](https://img.shields.io/pypi/v/esengine.svg)](https://pypi.python.org/pypi/esengine)\n[![versions](https://img.shields.io/pypi/pyversions/esengine.svg)](https://pypi.python.org/pypi/esengine)\n[![downloads](https://img.shields.io/pypi/dw/esengine.svg)](https://pypi.python.org/pypi/esengine)\n[![Travis CI](http://img.shields.io/travis/seek-ai/esengine.svg)](https://travis-ci.org/seek-ai/esengine)\n[![Coverage Status](http://img.shields.io/coveralls/catholabs/esengine.svg)](https://coveralls.io/r/catholabs/esengine)\n[![Code Health](https://landscape.io/github/catholabs/esengine/master/landscape.svg?style=flat)](https://landscape.io/github/catholabs/esengine/master)\n\n\n**esengine** is an ODM (**O**bject **D**ocument **M**apper). It maps Python classes into **E**lastic**s**earch **index/doc_type** and **object instances** into Elasticsearch 
documents.\n\n\u003cbr\u003e\u003cbr\u003e\n\n### Modeling\n\nOut of the box, ESengine takes care only of modeling and CRUD operations, including:\n\n- Index, DocType and Mapping specification\n- Fields and their type coercion\n- Basic CRUD operations (Create, Read, Update, Delete)\n\n### Communication\nESengine does not communicate directly with ElasticSearch; it only creates the basic structure.\nTo communicate, it relies on an ES client providing the transport methods (index, delete, update, etc.).\n\n### ES client\nESengine does not enforce the use of the official ElasticSearch client,\nbut you are encouraged to use it because it is well maintained and supports **bulk** operations. You are free to use another client or create your own (useful for tests).\n\n### Querying the data\nESengine does not enforce or encourage the use of a DSL for queries. Out of the box you have to\nwrite the Elasticsearch **payload** representation as a raw Python dictionary. However, ESEngine comes with the **utils.payload** helper module to help you build payloads in a less verbose, more Pythonic way.\n\n### Why not elasticsearch_dsl?\n\nElasticSearch DSL is an excellent tool and a very nice effort by the maintainers of the official ES library. It is handy in most cases, but because it is built on top of operator overloading, it sometimes leads to **confusing query building**. Sometimes it is better to write raw queries or use a simpler payload builder, keeping more control over and visibility of what is being generated. 
\n\nElasticSearch_DSL, as a high-level abstraction, promotes **Think only of Python objects, don't worry about Elastic queries**, while ESengine promotes **Know well the Elastic queries and then write them as Python objects**.\n\nElasticSearch_DSL is more powerful, more complete and more tightly coupled to the ES specifications, while ESEngine is simpler and lightweight, shipping only the basics.\n\n### Project Stage\n\nIt is in beta release and working in production, but missing a lot of features. You can help by using, testing, discussing or coding!\n\n\n# Getting started\n\n## Installation\n\nESengine needs a client to communicate with E.S. You can use one of the following:\n\n- ElasticSearch-py (official)\n- Py-Elasticsearch (unofficial)\n- Create your own implementing the same api-protocol\n- Use the MockES provided as py.test fixture (only for tests)\n\nBecause of bulk operations, it is recommended to use\n**elasticsearch-py** (the official E.S Python library), so the installation \ndepends on the version of Elasticsearch you are using.\n\n\n### in short\n\nInstall the client and then install ESEngine\n\n- for 2.0 + use \"elasticsearch\u003e=2.0.0,\u003c3.0.0\"\n- for 1.0 + use \"elasticsearch\u003e=1.0.0,\u003c2.0.0\"\n- under 1.0 use \"elasticsearch\u003c1.0.0\"\n\n\nFor the latest use:\n\n```sh\n$ pip install elasticsearch\n$ pip install esengine\n\n```\n\n### Or install them together\n\n#### Elasticsearch 2.x\n\n```bash\npip install esengine[es2]\n```\n\n#### Elasticsearch 1.x\n\n```bash\npip install esengine[es1]\n```\n\n#### Elasticsearch 0.90.x\n\n```bash\npip install esengine[es0]\n```\n\nThe above commands will install esengine and the elasticsearch library specific to your ES version.\n\n# Usage\n\n```python\n# importing\n\nfrom elasticsearch import Elasticsearch\nfrom esengine import Document, StringField\n\n# Defining a document\nclass Person(Document):\n    # define _meta attributes\n    _doctype = \"person\"  # optional, it can be set after using \"having\" method\n    
_index = \"universe\"  # optional, it can be set after using \"having\" method\n    _es = Elasticsearch(host='host', port=port)  # optional, it can be explicitly passed to methods\n    \n    # define fields\n    name = StringField()\n\n# Initializing mappings and settings\nPerson.init()\n```\n\n\u003e If you do not specify an \"id\" field, ESEngine will automatically add \"id\" as StringField. If you specify one, it is recommended to use StringField for ids.\n\n\n## TIP: import base module\n\nA good practice is to import the base module. Here is the same example:\n\n```python\nimport esengine as ee\n\nclass Person(ee.Document):\n    name = ee.StringField()\n```\n\n## Fields\n\n### Base Fields\n\n```python\nname = StringField()\nage = IntegerField()\nweight = FloatField()\nfactor = LongField()\nactive = BooleanField()\nbirthday = DateField()\n```\n\n### Special Fields\n\n#### GeoPointField\n\nA field to hold a GeoPoint, with modes dict|array|string and their mappings\n\n```python\nclass Obj(Document):\n    location = GeoPointField(mode='dict')  # default\n    # An object representation with lat and lon explicitly named\n\nObj.init() # important to put the proper mapping for geo location\n\nobj = Obj()\n\nobj.location = {\"lat\": 40.722, \"lon\": -73.989}\n\nclass Obj(Document):\n    location = GeoPointField(mode='string')\n    # A string representation, with \"lat,lon\"\n\nobj.location = \"40.715, -74.011\"\n\nclass Obj(Document):\n    location = GeoPointField(mode='array')\n    # An array representation with [lon,lat].\n\nobj.location = [-73.983, 40.719]\n```\n\n#### ObjectField\n\nA field to hold nested one-dimension objects, schema-less or with properties validation.\n\n```python\n# accepts only dictionaries having strictly \"street\" and \"number\" keys\naddress = ObjectField(properties={\"street\": \"string\", \"number\": \"integer\"})\n\n# Accepts any Python dictionary\nextravalues = ObjectField() \n```\n\n#### ArrayField\n\nA Field to hold arrays (python lists)\n\nIn 
essence, any field can accept the **multi** parameter\n\n```python\ncolors = StringField(multi=True)   # accepts [\"blue\", \"green\", \"yellow\", ....]\n```\n\nBut sometimes (especially for nested objects) it is better to be explicit, which also generates a better mapping\n\n```python\n# accepts an array of strings [\"blue\", \"green\", \"yellow\", ....]\ncolors = ArrayField(StringField()) \n```\n\nIt is available for any other field\n\n```python\nlocations = ArrayField(GeoPointField())\nnumbers = ArrayField(IntegerField())\nfractions = ArrayField(FloatField())\naddresses = ArrayField(ObjectField(properties={\"street\": \"string\", \"number\": \"integer\"}))\nlist_of_lists_of_strings = ArrayField(ArrayField(StringField()))\n```\n\n## Indexing\n\n```python\nperson = Person(id=1234, name=\"Gonzo\")\nperson.save()  # or pass .save(es=es_client_instance) if not specified in model \n```\n\n## Getting by id\n\n```python\nPerson.get(id=1234)\n```\n\n## filtering by IDS\n\n```python\nids = [1234, 5678, 9101]\npower_trio = Person.filter(ids=ids)\n```\n\n\n## filtering by fields\n\n```python\nPerson.filter(name=\"Gonzo\")\n```\n\n## Searching\n\nESengine does not try to create an abstraction for query building. \nBy default, ESengine only implements the search transport, receiving a raw ES query \nin the form of a Python dictionary.\n\n```python\nquery = {\n    \"query\": {\n        \"filtered\": {\n            \"query\": {\n                \"match_all\": {}\n            },\n            \"filter\": {\n                \"ids\": {\n                    \"values\": [1, 2]\n                }\n            }\n        }\n    }\n}\nPerson.search(query, size=10)\n```\n\n## Getting all documents (match_all)\n\n```python\nPerson.all()\n\n# with more arguments\n\nPerson.all(size=20)\n\n```\n\n\n## Counting\n\n```python\nPerson.count(name='Gonzo')\n```\n\n## Updating\n\n### A single document\n\nA single document can be updated simply by using the **.save()** method\n\n```python\n\nperson = 
Person.get(id=1234)\nperson.name = \"Another Name\"\nperson.save()\n\n```\n\n### Updating a Resultset\n\nThe Document methods **.get**, **.filter** and **.search** will return an instance\nof the **ResultSet** object. This object is an iterator containing the **hits** reached by \nthe filtering or search process and exposes some CRUD methods (**update**, **delete** and **reload**)\nto deal with its results.\n\n\n```python\npeople = Person.filter(field='value')\npeople.update(another_field='another_value')\n```\n\n\u003e When updating documents, sometimes you need the changes made in the E.S index reflected in the objects \nof the **ResultSet** iterator. You can use the **.reload** method to perform that action.\n\n\n### The use of **reload** method\n \n```python\npeople = Person.filter(field='value')\nprint(people)\n... \u003cResultset: [{'field': 'value', 'another_field': None}, \n                 {'field': 'value', 'another_field': None}]\u003e\n\n# Updating another field on both instances\npeople.update(another_field='another_value')\nprint(people)\n... \u003cResultset: [{'field': 'value', 'another_field': None}, {'field': 'value', 'another_field': None}]\u003e\n\n# Note that in the E.S index the values were changed, but the current ResultSet is not updated by default\n# you have to trigger a reload\npeople.reload()\n\nprint(people)\n... 
\u003cResultset: [{'field': 'value', 'another_field': 'another_value'},\n                 {'field': 'value', 'another_field': 'another_value'}]\u003e\n\n\n```\n\n### Deleting documents\n\n\n#### A ResultSet\n\n```python\npeople = Person.all()\npeople.delete()\n```\n\n#### A single document\n\n```python\nPerson.get(id=123).delete()\n```\n\n## Bulk operations\n\nESEngine takes advantage of elasticsearch-py helpers for bulk actions: \nthe **ResultSet** object uses the **bulk** method to **update** and **delete** documents.\n\nBut you can use it in an explicit way using Document's **update_all**, **save_all** and **delete_all** methods.\n\n#### Let's create a bunch of document instances\n\n\n```python\ntop_5_racing_bikers = []\n\nfor name in ['Eddy Merckx', \n             'Bernard Hinault', \n             'Jacques Anquetil', \n             'Sean Kelly', \n             'Lance Armstrong']:\n    top_5_racing_bikers.append(Person(name=name))\n```\n\n#### Save it all \n\n```python\nPerson.save_all(top_5_racing_bikers)\n```\n\n#### Using the **create** shortcut\n\nThe above could be achieved using the **create** shortcut\n\n\n##### A single\n\n```python\nPerson.create(name='Eddy Merckx', active=False)\n```\n\n\u003e Create will return the instance of the indexed Document\n\n##### All using list comprehension\n\n```python\ntop_5_racing_bikers = [\n    Person.create(name=name, active=False)\n    for name in ['Eddy Merckx', \n                 'Bernard Hinault', \n                 'Jacques Anquetil', \n                 'Sean Kelly', \n                 'Lance Armstrong']\n]\n\n```\n\u003e NOTE: the **.create** method will automatically save the document to the index, and\nwill not raise an error if there is a document with the same ID (if specified); it will update it, acting as an upsert.\n\n#### Updating all\n\nTurning the field **active** to **True** for all documents\n\n```python\nPerson.update_all(top_5_racing_bikers, active=True)\n```\n\n#### Deleting 
all\n\n```python\nPerson.delete_all(top_5_racing_bikers)\n```\n\n\n#### Chunk size\n\nchunk_size is the number of docs in one chunk sent to ES (default: 500).\nYou can change it using the **meta** argument.\n\n```python\nPerson.update_all(\n    top_5_racing_bikers,  # the documents\n    active=True,  # values to be changed\n    meta={'chunk_size': 200}  # meta data passed to **bulk** operation\n)\n```\n\n#### Utilities\n\n#### Mapping and Mapping migrations\n\nESEngine does not save mappings automatically, but it offers a utility to generate and save mappings on demand.\nYou can create a cron job to refresh mappings once a day or run it every time your model changes.\n\n##### Using the document\n\n```python\nclass Person(Document):\n    # define _meta attributes\n    _doctype = \"person\"  # optional, it can be set after using \"having\" method\n    _index = \"universe\"  # optional, it can be set after using \"having\" method\n    _es = Elasticsearch(host='host', port=port)  # optional, it can be explicitly passed to methods\n    \n    # define fields\n    name = StringField()\n    \n```\n\n##### You can use the **init()** class method to initialize/update mappings, settings and analyzers\n\n```python\nPerson.init()  # if not defined in model, pass an **es=es_client** here\n```\n\n\u003e Include the above as the last line of your model files, or in cron jobs or migration scripts\n\n\n#### Dynamic meta attributes\n\nIn an ESEngine Document, every attribute starting with _ is a meta attribute. Sometimes you can't hardcode them in your models and want them to be dynamic.\nYou can achieve this by subclassing your base document, but sometimes you really need to change them at runtime.\n\n\u003e Sometimes it is useful for sharding.\n\n```python\nfrom models import Person\n\nBrazilianUsers = Person.having(index='another_index', doctype='brasilian_people', es=Elasticsearch(host='brazil_datacenter'))\nAmericanUsers = Person.having(index='another_index', doctype='american_people', 
es=Elasticsearch(host='us_datacenter'))\n\nbrazilian_users = BrazilianUsers.filter(active=True)\namerican_users = AmericanUsers.search(query=query)\n\n```\n\n#### Validators\n\n##### Field Validator\n\nTo validate each field separately you can set a list of validators. Each \nvalidator is a callable receiving field_name and value as arguments and\nshould return None to be valid. If it raises an exception or returns a value, the data is considered invalid.\n\n```python\nfrom esengine.exceptions import ValidationError\n\ndef category_validator(field_name, value):\n    # check if value is in valid categories\n    if value not in [\"primary\", \"secondary\", ...]:\n        raise ValidationError(\"Invalid category!!!\")\n    \nclass Obj(Document):\n    category = StringField(validators=[category_validator])\n\nobj = Obj()\nobj.category = \"another\"\nobj.save()\n# Traceback: ValidationError(....)\n\n```\n\n##### Document Validator\n\nTo validate the whole document you can set a list of validators. Each \nvalidator is a callable receiving the document instance and\nshould return None to be valid. 
If it raises an exception or returns a value, the data is considered invalid.\n\n```python\nfrom esengine.exceptions import ValidationError\n\ndef if_city_state_is_required(obj):\n    if obj.city and not obj.state:\n        raise ValidationError(\"If city is defined you should define state\")\n        \nclass Obj(Document):\n    _validators = [if_city_state_is_required]\n    \n    city = StringField()\n    state = StringField()\n\nobj = Obj()\nobj.city = \"Sao Paulo\"\nobj.save()\n# Traceback: ValidationError(....)\n\n```\n\n#### Refreshing\n\nSometimes you need to force a refresh of index shards for testing. You can use\n\n```python\n# Will refresh all indices\nDocument.refresh()\n```\n\n#### Payload builder\n\nSometimes queries turn into complex and verbose data structures. To help you\n(use with moderation) you can use Payload utils to build queries.\n\n\n###### Example using a raw query:\n\n```python\nquery = {\n    \"query\": {\n        \"filtered\": {\n            \"query\": {\n                \"match_all\": {}\n            },\n            \"filter\": {\n                \"ids\": {\n                    \"values\": [1, 2]\n                }\n            }\n        }\n    }\n}\n\nPerson.search(query=query, size=10)\n```\n\n###### Same example using payload utils\n\n```python\nfrom esengine import Payload, Query, Filter\npayload = Payload(query=Query.filtered(query=Query.match_all(), filter=Filter.ids([1, 2])))\nPerson.search(payload, size=10)\n```\n\n\u003e Payload utils exposes Payload, Query, Filter, Aggregate, Suggesters\n\nYou can also set the model on payload initialization to create a more complete payload definition\n\n```python\nfrom esengine import Payload, Query, Filter\npayload = Payload(\n    model=Person,\n    query=Query.filtered(query=Query.match_all(), filter=Filter.ids([1, 2])),\n    sort={\"name\": {\"order\": \"desc\"}},\n    size=10\n)\npayload.search()\n```\n\n###### More examples\n\nYou can use Payload, Query or Filter directly in search\n\n```python\nfrom esengine import 
Payload, Query, Filter\n\nPerson.search(Payload(query=Query.match_all()))\n\nPerson.search(Query.bool(must=[Query.match(\"name\", \"Gonzo\")]))\n\nPerson.search(Query.match_all())\n\nPerson.search(Filter.ids([1, 2, 3]))\n\n```\n\n###### chaining\n\nThe Payload object is chainable, so you can do:\n```python\npayload = Payload(query=query).size(10).sort(\"field\", order=\"desc\")\nDocument.search(payload) \n# or the equivalent\npayload.search(Document)\n```\n\n\n#### Pagination\n\nYou can paginate a payload. Let's say you have indexed 500 documents under the 'test' category and now you need to retrieve 50 per page.\n\n\u003e Results will be included in **pagination.items** \n\n```python\nfrom esengine import Payload, Filter\nfrom models import Doc\n\npayload = Payload(Doc, filter=Filter.term('category', 'test'))\n\n# Total documents\npayload.count()\n500\n\n# Paginate it\ncurrent_page = 1  # you have to increase it on each pagination\npagination = payload.paginate(page=current_page, per_page=50)\n\npagination.total\n500\n\npagination.pages\n10\n\npagination.has_prev\nFalse\n\npagination.has_next\nTrue\n\npagination.next_num\n2\n\nlen(pagination.items)\n50\n\nfor item in pagination.items:\n    pass  # do something with item\n\n# Turn the page\n\ncurrent_page += 1\npagination = payload.paginate(page=current_page, per_page=50)\npagination.page\n2\npagination.has_prev\nTrue\n\n# Another option to move pages\n\npagination = pagination.next_page()\npagination.page\n3\n\npagination = pagination.prev_page()\npagination.page\n2\n\n# Turn the page in place\n\npagination.backward()\npagination.page\n1\n\npagination.forward()\npagination.page\n2\n```\n\n##### Create a paginator in Jinja template\n\nSo you want to create buttons for pagination in your Jinja template\n\n```html+jinja\n{% macro render_pagination(pagination, endpoint) %}\n  \u003cdiv class=pagination\u003e\n  {%- for page in pagination.iter_pages() %}\n    {% if page %}\n      {% if page != pagination.page %}\n        \u003ca 
href=\"{{ url_for(endpoint, page=page) }}\"\u003e{{ page }}\u003c/a\u003e\n      {% else %}\n        \u003cstrong\u003e{{ page }}\u003c/strong\u003e\n      {% endif %}\n    {% else %}\n      \u003cspan class=ellipsis\u003e…\u003c/span\u003e\n    {% endif %}\n  {%- endfor %}\n  \u003c/div\u003e\n{% endmacro %}\n```\n\n\n# Contribute\n\nESEngine is open source! Join us!\n\u003ca href=\"http://smallactsmanifesto.org\" title=\"Small Acts Manifesto\"\u003e\u003cimg src=\"http://smallactsmanifesto.org/static/images/smallacts-badge-80x15-blue.png\" style=\"border: none;\" alt=\"Small Acts Manifesto\" /\u003e\u003c/a\u003e\n\n**MADE WITH #LOVE AND #PYTHON (which is the same) AT [CathoLabs](http://catholabs.com)**  \n\n![catholabs](http://catholabs.com/_themes/catholabs/img/logo_black.png)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseek-ai%2Fesengine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fseek-ai%2Fesengine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fseek-ai%2Fesengine/lists"}