{"id":18260093,"url":"https://github.com/roppa/elasticsearch-synonyms","last_synced_at":"2025-04-08T23:44:47.131Z","repository":{"id":146092625,"uuid":"72521856","full_name":"roppa/elasticsearch-synonyms","owner":"roppa","description":"Elasticsearch/Solr synonym processor","archived":false,"fork":false,"pushed_at":"2016-11-13T16:20:39.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-14T18:36:30.538Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roppa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-11-01T09:26:15.000Z","updated_at":"2016-11-01T09:26:40.000Z","dependencies_parsed_at":null,"dependency_job_id":"d2f84c2f-c47b-44da-b0f8-c4d09bf502ef","html_url":"https://github.com/roppa/elasticsearch-synonyms","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roppa%2Felasticsearch-synonyms","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roppa%2Felasticsearch-synonyms/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roppa%2Felasticsearch-synonyms/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roppa%2Felasticsearch-synonyms/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roppa","download_url":"https://codelo
ad.github.com/roppa/elasticsearch-synonyms/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247947825,"owners_count":21023058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-05T10:42:03.583Z","updated_at":"2025-04-08T23:44:47.107Z","avatar_url":"https://github.com/roppa.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Synonyms are hard, let's face it\n\n[![Build Status](https://travis-ci.org/roppa/elasticsearch-synonyms.svg?branch=master)](https://travis-ci.org/roppa/elasticsearch-synonyms)\n\nWell, they aren't really, just check out a thesaurus. However, the difficulty comes when we use phrases for synonyms. Because Solr and Elasticsearch tokenize on whitespace, phrases are broken up and the results are not what we expect. Like I say, synonyms are hard.\n\nI'm also not handling case at the moment, so RIRO: I expect your parameters to be in the exact case you want.\n\nThe following is taken from the Elasticsearch [synonym token filter](https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis-synonym-tokenfilter.html) documentation:\n\n```\n# Blank lines and lines starting with pound are comments.\n\n# Explicit mappings match any token sequence on the LHS of \"=\u003e\"\n# and replace with all alternatives on the RHS.  
These types of mappings\n# ignore the expand parameter in the schema.\n# Examples:\ni-pod, i pod =\u003e ipod,\nsea biscuit, sea biscit =\u003e seabiscuit\n\n# Equivalent synonyms may be separated with commas and give\n# no explicit mapping.  In this case the mapping behavior will\n# be taken from the expand parameter in the schema.  This allows\n# the same synonym file to be used in different synonym handling strategies.\n# Examples:\nipod, i-pod, i pod\nfoozball , foosball\nuniverse , cosmos\n\n# If expand==true, \"ipod, i-pod, i pod\" is equivalent\n# to the explicit mapping:\nipod, i-pod, i pod =\u003e ipod, i-pod, i pod\n# If expand==false, \"ipod, i-pod, i pod\" is equivalent\n# to the explicit mapping:\nipod, i-pod, i pod =\u003e ipod\n\n# Multiple synonym mapping entries are merged.\nfoo =\u003e foo bar\nfoo =\u003e baz\n# is equivalent to\nfoo =\u003e foo bar, baz\n```\n\nThese synonym rules come in four forms:\n\n  - Simple expansion (a,b,c)\n  - Simple contraction (a,b,c =\u003e a)\n  - Genre expansion (a =\u003e a,b,c)\n  - Explicit mappings (a,b,c =\u003e a,b,c)\n\n## Simple expansion\n\nSimple expansion (equivalent synonyms) is a list of single words separated by commas. Every term is treated as equivalent to every other term.\n\n```\nfootball,soccer,foosball\n```\n\nA search for ```soccer``` would also match ```foosball``` and ```football```.\n\nPhrases also fall into this category when they are written as an explicit mapping whose LHS equals its RHS.\n\n## Simple contraction\n\nThe key is in the term 'contraction': words on the LHS are replaced by the term(s) on the RHS.\n\n```\nleap,hop =\u003e jump\n```\n\nThis has to be applied at index time as well as query time. I think this is because at index time the terms on the left are replaced with the term on the right, so for a search for \"hop\" to return results, the same rule has to be applied to the query.\n\n## Genre expansion\n\nThis sets up genres. For example, a cat is a type of pet. A kitten is a type of cat, which is a type of pet. 
A dog is a pet, and a puppy is a type of dog.\n\n```\ncat =\u003e cat,pet\nkitten =\u003e kitten,cat,pet\ndog =\u003e dog,pet\npuppy =\u003e puppy,dog,pet\n```\n\nSearching for 'pet' would match 'cat', 'kitten', 'dog', and 'puppy'.\n\n## Explicit mapping\n\nThese match any token sequence on the LHS of \"=\u003e\" and replace it with all alternatives on the RHS. Phrases are problematic here because Elasticsearch tokenizes on whitespace. Terms on the left will be replaced by terms on the right.\n\n```\na,b,c =\u003e a,b,c\n```\n\n## Install\n\n```\nconst s = require('elasticsearch-synonyms');\n```\n\n## Methods\n\n### s.expand(array)\n\nTakes an array and returns a comma-delimited explicit mapping with the same terms on both sides of the fat arrow.\n\nTurns:\n\n```\n['u s a', 'usa', 'united states of america']\n```\n\ninto:\n\n```\n'u s a,usa,united states of america =\u003e u s a,usa,united states of america'\n```\n\n### s.expandString(string)\n\nTakes a string of space-separated words and returns a comma-delimited string: ```'wood bark tree splinter'``` becomes ```'wood,bark,tree,splinter'```.\n\n### s.contract(array, [replacement])\n\nThe contract method takes an array and performs a simple contraction (a,b,c =\u003e a). If the optional replacement parameter is omitted, the first non-phrase term is used as the replacement. For example:\n\n```\n['a', 'b b', 'c', 'd']\n```\n\nbecomes:\n\n```\n'a,c,d,b b =\u003e a'\n```\n\nIf every term is a phrase, the terms are expanded instead:\n\n```\n['a a', 'b b', 'c c', 'd d']\n```\n\nbecomes:\n\n```\n'a a,b b,c c,d d =\u003e a a,b b,c c,d d'\n```\n\n### s.genre(object)\n\nThe genre method takes a hierarchy object and performs genre expansion (a =\u003e a,b,c).\n\nGiven the following object:\n\n```\n{\n  pet: {\n    cat: {\n      kitten: 'kitten',\n    },\n    dog: {\n      puppy: 'puppy',\n    }\n  }\n}\n```\n\nThe result will be:\n\n```\ncat =\u003e cat,pet\nkitten =\u003e kitten,cat,pet\ndog =\u003e dog,pet\npuppy =\u003e puppy,dog,pet\n```\n\nThere must be only one common ancestor. 
Each descendant goes on the LHS, followed by the fat arrow, then the descendant itself and its ancestors.\n\n### s.explicit(array, [array])\n\nGiven a single array, the terms are comma-delimited on the LHS and duplicated on the RHS:\n\n```\ns.explicit(['g b', 'gb', 'great britain']);\n\u003e g b,gb,great britain =\u003e g b,gb,great britain\n```\n\nGiven two arrays, the second array becomes the RHS:\n\n```\ns.explicit(['g b', 'gb', 'great britain'], ['britain', 'england', 'scotland', 'wales']);\n\u003e g b,gb,great britain =\u003e britain,england,scotland,wales\n```\n\n### s.stringify(array or object)\n\nTakes an array or object and stringifies it. With an object, a newline separates the top-level attributes (only the top-level values are flattened):\n\n```\n{\n  a: ['a', 'b'],\n  c: ['c', 'd'],\n}\n```\n\nbecomes:\n\n```\n'a,b\\nc,d'\n```\n\n### s.stringToArray(string)\n\nTakes a string and splits it on the newline character. Any comments (#) are removed. Used as the starting point for config file processing.\n\n### s.parseFile(string)\n\nTakes a config file (as a string), like the example token filter file in the introduction, and converts it to an object (tokens are expanded by default):\n\n```\n{\n  'i-pod': ['ipod', 'i-pod', 'i pod'],\n  'i pod': ['ipod', 'i-pod', 'i pod'],\n  ipod: ['ipod', 'i-pod', 'i pod'],\n  'sea biscuit': ['seabiscuit'],\n  'sea biscit': ['seabiscuit'],\n  foozball: ['foozball', 'foosball'],\n  foosball: ['foozball', 'foosball'],\n  universe: ['universe', 'cosmos'],\n  cosmos: ['universe', 'cosmos'],\n  foo: ['foo bar', 'baz']\n}\n```\n\n## Testing\n\nRun ```npm run test```\n\n## References\n\n  - Elasticsearch [synonyms, expand or contract](https://www.elastic.co/guide/en/elasticsearch/guide/current/synonyms-expand-or-contract.html)\n  - Elasticsearch [synonym formats](https://www.elastic.co/guide/en/elasticsearch/guide/current/synonym-formats.html)\n  - [Node solr 
synonyms](https://github.com/Prinzhorn/node-solr-synonyms)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froppa%2Felasticsearch-synonyms","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froppa%2Felasticsearch-synonyms","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froppa%2Felasticsearch-synonyms/lists"}