{"id":23226808,"url":"https://github.com/botsquad/match_engine","last_synced_at":"2025-08-19T13:33:13.722Z","repository":{"id":48283151,"uuid":"122170032","full_name":"botsquad/match_engine","owner":"botsquad","description":"In-memory matching/filtering engine with Mongo-like query syntax","archived":false,"fork":false,"pushed_at":"2024-01-24T10:35:56.000Z","size":88,"stargazers_count":13,"open_issues_count":0,"forks_count":3,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-04-25T20:21:47.227Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Elixir","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/botsquad.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-02-20T08:01:11.000Z","updated_at":"2024-03-27T19:00:32.000Z","dependencies_parsed_at":"2022-09-13T08:21:40.589Z","dependency_job_id":null,"html_url":"https://github.com/botsquad/match_engine","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/botsquad%2Fmatch_engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/botsquad%2Fmatch_engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/botsquad%2Fmatch_engine/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/botsquad%2Fmatch_engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/botsquad","download_url":"https://codeload.github.com/botsquad/match_engine/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com",
"kind":"github","repositories_count":230356045,"owners_count":18213573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-19T00:19:31.792Z","updated_at":"2024-12-19T00:19:32.266Z","avatar_url":"https://github.com/botsquad.png","language":"Elixir","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MatchEngine\n\n[![CI](https://github.com/botsquad/match_engine/actions/workflows/elixir.yml/badge.svg)](https://github.com/botsquad/match_engine/actions/workflows/elixir.yml)\n[![Module Version](https://img.shields.io/hexpm/v/match_engine.svg)](https://hex.pm/packages/match_engine)\n[![Hex Docs](https://img.shields.io/badge/hex-docs-lightgreen.svg)](https://hexdocs.pm/match_engine/)\n[![Total Download](https://img.shields.io/hexpm/dt/match_engine.svg)](https://hex.pm/packages/match_engine)\n[![License](https://img.shields.io/hexpm/l/match_engine.svg)](https://github.com/botsquad/match_engine/blob/master/LICENSE)\n[![Last Updated](https://img.shields.io/github/last-commit/botsquad/match_engine.svg)](https://github.com/botsquad/match_engine/commits/master)\n\n\u003c!-- MDOC !--\u003e\n\nMatchEngine is an in-memory matching/filtering engine with\nMongoDB-like query syntax.\n\n## Introduction\n\nThe query language consists of nested Elixir \"keyword list\". Each\ncomponent of the query consists of a _key_ part and a _value_\npart. 
The key part is either a logic operator (and/or/not) or a\nreference to a field; the value part is either a plain value or a\nvalue operator.\n\nWhen a query is run against a document, each term is scored\nindividually and the scores are summed. (This implies \"or\"). Some example\nqueries:\n\n```\n[title: \"hoi\"]\n[title: [_eq: \"hoi\"]]\n[_and: [name: \"John\", age: 36]]\n[_or: [name: \"John\", age: 36]]\n[_not: [title: \"foo\"]]\n```\n\nTwo ways of saying \"Score all documents in which the title equals `\"hoi\"`\":\n\n```\n[title: \"hoi\"]\n[title: [_eq: \"hoi\"]]\n```\n\nCombining various matchers with logic operators:\n\n```\n[_and: [name: \"John\", age: 36]]\n[_or: [name: \"John\", age: 36]]\n[_not: [title: \"foo\"]]\n```\n\nPerforming matches in nested objects is also possible; the query\nsimply follows the shape of the data.\n\nGiven a document consisting of a nested structure, `%{\"user\" =\u003e %{\"name\" =\u003e \"John\"}}`:\n\n\"User name equals John\":\n\n```\n[user: [name: \"John\"]]\n```\n\n\"User name does not equal John\":\n\n```\n[_not: [user: [name: \"John\"]]]\n```\n\n\u003e Note that this approach to nesting fields differs from MongoDB, which uses dot notation for field nesting.\n\n## Query execution\n\nThe queries can be run by calling `MatchEngine.score_all/2` or `MatchEngine.filter_all/2`.\n\nQueries are first preprocessed, and then executed on a list of search\n\"documents\". A \"document\" is just a normal Elixir map, with string\nkeys.\n\nThe preprocessing phase compiles any regexes, checks whether all\noperators exist, and de-nests nested field structures.\n\nThe query phase runs the preprocessed query against each document in the\nlist, calculating a score for that document given the\nquery. When using `filter_all/2`, documents with a zero score are\nremoved from the input list. 
When using `score_all/2`, the list is\nsorted by score, descending, and the score, together with any\nadditional metadata, is returned in a `\"_match\"` map inside each\ndocument.\n\n## Value operators\n\n_Value operators_ work on an individual field. Various operators can\nbe used to calculate a score for a given field.\n\n### `_eq`\n\nScores on the equality of the argument.\n\n```\n[title: \"hello\"]\n[title: [_eq: \"hello\"]]\n```\n\n### `_ne`\n\nScores on the *in*equality of the argument. (\"Not equals\")\n\n```\n[title: [_ne: \"hello\"]]\n```\n\n### `_has`\n\nScores when the document's value contains a member of the given list or contains the given word or words.\n\n```\n[tag: [_has: [\"production\"]]]\n[title: [_has: \"The\"]]\n[title: [_has: [\"The\", \"title\"]]]\n```\n\n### `_hasnt`\n\nScores when the document's value does NOT contain any member of the given list\nor does not contain any of the given word or words.\n\n```\n[tag: [_hasnt: [\"production\"]]]\n[title: [_hasnt: \"The\"]]\n[title: [_hasnt: [\"The\", \"title\"]]]\n```\n\n### `_in`\n\nScores when the document's value is a member of the given list.\n\n```\n[role: [_in: [\"developer\", \"freelancer\"]]]\n```\n\n### `_nin`\n\nScores when the document's value is _not_ a member of the given list.\n\n```\n[role: [_nin: [\"recruiter\"]]]\n```\n\n### `_lt`, `_gt`, `_lte`, `_gte`\n\nScores using the comparison operators \u003c, \u003e, \u003c= and \u003e=.\n\n```\n[age: [_gt: 18]]\n```\n\n### `_sim`\n\nNormalized string similarity. The maximum of the normalized Levenshtein\ndistance and the Jaro distance.\n\n### `_regex`\n\nMatch a regular expression. The input is a string, which gets compiled\ninto a regex. This operator scores on the length of the match divided by\nthe total string length. 
It is possible to add named captures to the\nregex, which then get added to the `_match` metadata map, as seen in the following example:\n\n```\n# regex matches entire string, 100% score\nassert %{\"score\" =\u003e 1} == score([title: [_regex: \"foo\"]], %{\"title\" =\u003e \"foo\"})\n# regex matches with a capture called 'name'. It is boosted by weight.\nassert %{\"score\" =\u003e 1.6, \"name\" =\u003e \"food\"} == score([title: [_regex: \"(?P\u003cname\u003efoo[dl])\", w: 4]], %{\"title\" =\u003e \"foodtrucks\"})\n```\n\nThe regex match can also be inverted, where the document value is\ntreated as the regular expression, and the query input is treated as\nthe string to be matched. (No captures are supported in this case).\n\n```\nassert %{\"score\" =\u003e 0.5} == score([title: [_regex: \"foobar\", inverse: true]], %{\"title\" =\u003e \"foo\"})\n```\n\n### `_geo`\n\nCalculate document score based on its geographical distance to a given\npoint. The geo point (both in the operator and in the document) can\nbe given as:\n\n- A regular list, e.g. `[4.56, 52.33]`\n- A keyword list, e.g. `[lat: 52.33, lon: 4.56]`\n- A map with atom keys, e.g. `%{lat: 52.33, lon: 4.56}`\n- A map with string keys, e.g. `%{\"lat\" =\u003e 52.33, \"lon\" =\u003e 4.56}`\n\nThe calculated `distance` is returned in meters, as part of the `_match` map.\n\nAn extra argument, `max_distance`, can be given to the operator to\nspecify the maximum cutoff point. It defaults to 100km. 
(100_000).\nDistance is scored logarithmically with respect to the maximum\ndistance.\n\n```\ndoc = %{\"location\" =\u003e %{\"lat\" =\u003e 52.340500999999996, \"lon\" =\u003e 4.8832816}}\nq = [location: [_geo: [lat: 52.340500999999996, lon: 4.8832816]]]\nassert %{\"score\" =\u003e 1, \"distance\" =\u003e 0.0} == score(q, doc)\n```\n\nWhen `radius` is given as an option, all geo points that are within\nthe radius will score 1, and the max_distance scoring will be in\neffect for distances larger than the radius.\n\n### `_geo_poly`\n\nCalculate document score based on its containment inside a given\ngeographical polygon.\n\nAccepts a list of geographical coordinates, each in the same format\nas `_geo`.\n\nLike `_geo`, the `max_distance` option can be given to the operator\nto specify the maximum cutoff point. It defaults to\n100km. (100_000). Distance is scored logarithmically with respect\nto the maximum distance.\n\nWhen the point is inside the polygon, the score is always 1. Only\nwhen the point is outside the polygon is the geographical distance\nfrom the document point to the closest point on the edge of the\npolygon calculated and scored based on the `max_distance`\nsetting.\n\n### `_time`\n\nScore by a UTC timestamp, relative to the given time.\n\n```\nt1 = \"2018-02-19T15:29:53.672235Z\"\nt2 = \"2018-02-19T15:09:53.672235Z\"\nassert %{\"score\" =\u003e s} = score([inserted_at: [_time: t1]], %{\"inserted_at\" =\u003e t2})\n```\n\nThis way, documents can be returned in order of recency.\n\n## Logic operators\n\n### `_and`\n\nCombine matchers, multiplying the score. When one of the matchers\nreturns 0, the total score is 0 as well.\n\n```\n[_and: [name: \"John\", age: 36]]\n```\n\n### `_or`\n\nCombine matchers, adding the scores.\n\n```\n[_or: [name: \"John\", id: 12]]\n```\n\n### `_not`\n\nReverse the score of the nested matchers. 
(When the score is \u003e 0, return 0; otherwise, return 1.)\n\n```\n[_not: [title: \"foo\"]]\n```\n\n### Matcher weights\n\n`w: 10` can be added to a matcher term to boost its score by the given weight.\n\n```\n[title: [_eq: \"Pete\", w: 5], summary: [_sim: \"hello\", w: 2]]\n```\n\n`b: true` can be added to force a score of 1 when the score is \u003e 0.\n\n```\n[title: [_sim: \"hello\", b: true]]\n```\n\n## Map syntax for queries\n\nInstead of keyword lists, queries can also be specified as maps. In\nthis case, the keys of the map need to be strings. Query maps are\nmeant to be used with user-generated input, and can be easily created from JSON files.\n\n```\n[_not: [title: \"foo\"]]\n# can also be written as:\n%{\"_not\" =\u003e %{\"title\" =\u003e \"foo\"}}\n\n[title: [_eq: \"Pete\", w: 5], summary: [_sim: \"hello\", w: 2]]\n# can also be written as:\n%{\"title\" =\u003e %{\"_eq\" =\u003e \"Pete\", \"w\" =\u003e 5}, \"summary\" =\u003e %{\"_sim\" =\u003e \"hello\", \"w\" =\u003e 2}}\n```\n\n\u003c!-- MDOC !--\u003e\n\n## Installation\n\nIf [available in Hex](https://hex.pm/docs/publish), the package can be installed\nby adding `match_engine` to your list of dependencies in `mix.exs`:\n\n```elixir\ndef deps do\n  [\n    {:match_engine, \"~\u003e 1.0\"}\n  ]\nend\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbotsquad%2Fmatch_engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbotsquad%2Fmatch_engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbotsquad%2Fmatch_engine/lists"}