{"id":28458134,"url":"https://github.com/jolicode/emoji-search","last_synced_at":"2025-07-02T05:31:35.513Z","repository":{"id":37773900,"uuid":"52520016","full_name":"jolicode/emoji-search","owner":"jolicode","description":":smile: Emoji synonyms to build your own emoji-capable search engine (elasticsearch, solr, OpenSearch)","archived":false,"fork":false,"pushed_at":"2025-02-07T12:32:03.000Z","size":25536,"stargazers_count":221,"open_issues_count":2,"forks_count":64,"subscribers_count":25,"default_branch":"main","last_synced_at":"2025-06-07T00:09:47.328Z","etag":null,"topics":["analyzer","cldr","elasticsearch","elasticsearch-plugin","emoji","emoticons","hacktoberfest","opensearch","plugin"],"latest_commit_sha":null,"homepage":"https://jolicode.com/blog/elasticsearch-icu-now-understands-emoji","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jolicode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-25T11:36:19.000Z","updated_at":"2025-02-07T17:36:47.000Z","dependencies_parsed_at":"2025-02-07T13:30:49.601Z","dependency_job_id":"c57cfd0a-d7f8-49d1-b1fa-a6700010dacf","html_url":"https://github.com/jolicode/emoji-search","commit_stats":null,"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/jolicode/emoji-search","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jolicode%2Femoji-search","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jolicode%2Femoji-search/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/jolicode%2Femoji-search/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jolicode%2Femoji-search/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jolicode","download_url":"https://codeload.github.com/jolicode/emoji-search/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jolicode%2Femoji-search/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263081303,"owners_count":23410850,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analyzer","cldr","elasticsearch","elasticsearch-plugin","emoji","emoticons","hacktoberfest","opensearch","plugin"],"created_at":"2025-06-07T00:10:03.398Z","updated_at":"2025-07-02T05:31:35.487Z","avatar_url":"https://github.com/jolicode.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🙂 Emoji, flags \u0026 emoticons support for Elasticsearch\n\nAdd support for **emoji** and **flags** in any **Lucene** compatible search engine!\n\nIf you wish to search `🍩` to find **donuts** in your documents, you came to the\nright place. 
We offer synonym files ready to use in Elasticsearch and OpenSearch analyzers.\n\n![Test all synonym files on a real Elasticsearch](https://github.com/jolicode/emoji-search/workflows/Test%20all%20synonym%20files%20on%20a%20real%20Elasticsearch/badge.svg)\n\n## Requirements to index emoji in Elasticsearch\n\nThere are no requirements for Elasticsearch \u003e= 6.7.\n\n\u003cdetails\u003e\u003csummary\u003eUsing an older version of Elasticsearch? Open me! 🖱\u003c/summary\u003e\n\n| Version | Requirements |\n|---|:---:|\n| Elasticsearch \u003e= 6.4 and \u003c 6.7 | You need to install the official [ICU Plugin](https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-icu.html). See our [blog post about this change](https://jolicode.com/blog/elasticsearch-icu-now-understands-emoji). |\n| Elasticsearch \u003c 6.4 | You need our [custom ICU Tokenizer Plugin](https://github.com/jolicode/emoji-search/tree/6.2.4/esplugin), see our [blog post](http://jolicode.com/blog/search-for-emoji-with-elasticsearch) (2016). |\n\nRun the following test to verify that you get 4 EMOJI tokens:\n\n```json\nGET _analyze\n{\n  \"text\": [\"🍩 🇫🇷 👩‍🚒 🚣🏾‍♀\"]\n}\n```\n\u003c/details\u003e\n\n## The Synonyms, flags and emoticons\n\nWhat you need to search with emoji is a way to expand them to words that can\nmatch searches and documents, in **your language**. 
That's the goal of the\n[synonym dictionaries](/synonyms).\n\nWe build Solr / Lucene compatible synonym files in all languages supported by\n[Unicode CLDR](http://cldr.unicode.org/) so you can set them up in an analyzer.\nIt looks like this:\n\n```\n👩‍🚒 =\u003e 👩‍🚒, firefighter, firetruck, woman\n👩‍✈ =\u003e 👩‍✈, pilot, plane, woman\n🥓 =\u003e 🥓, bacon, meat, food\n🥔 =\u003e 🥔, potato, vegetable, food\n😅 =\u003e 😅, cold, face, open, smile, sweat\n😆 =\u003e 😆, face, laugh, mouth, open, satisfied, smile\n🚎 =\u003e 🚎, bus, tram, trolley\n🇫🇷 =\u003e 🇫🇷, france\n🇬🇧 =\u003e 🇬🇧, united kingdom\n```\n\nFor emoticons, use [this mapping](emoticons.txt) with a char_filter to replace\nemoticons with emoji.\n\n### Installation\n\nDownload the emoji and emoticon files you want from this repository and store\nthem in `PATH_TO_ES/config/analysis` (_or anywhere Elasticsearch can read_).\n\n```\nconfig\n├── analysis\n│   ├── cldr-emoji-annotation-synonyms-en.txt\n│   └── emoticons.txt\n├── elasticsearch.yml\n...\n```\n\nUse them like this (this is a complete _English_ example with Elasticsearch \u003e=\n6.7):\n\n```json\nPUT /tweets\n{\n  \"settings\": {\n    \"analysis\": {\n      \"filter\": {\n        \"english_emoji\": {\n          \"type\": \"synonym\",\n          \"synonyms_path\": \"analysis/cldr-emoji-annotation-synonyms-en.txt\"\n        },\n        \"emoji_variation_selector_filter\": {\n          \"type\": \"pattern_replace\",\n          \"pattern\": \"\\\\uFE0E|\\\\uFE0F\",\n          \"replacement\": \"\"\n        },\n        \"english_stop\": {\n          \"type\":       \"stop\",\n          \"stopwords\":  \"_english_\"\n        },\n        \"english_keywords\": {\n          \"type\":       \"keyword_marker\",\n          \"keywords\":   [\"example\"]\n        },\n        \"english_stemmer\": {\n          \"type\":       \"stemmer\",\n          \"language\":   \"english\"\n        },\n        \"english_possessive_stemmer\": {\n          \"type\":       \"stemmer\",\n         
 \"language\":   \"possessive_english\"\n        }\n      },\n      \"analyzer\": {\n        \"english_with_emoji\": {\n          \"tokenizer\": \"standard\",\n          \"filter\": [\n            \"english_possessive_stemmer\",\n            \"lowercase\",\n            \"emoji_variation_selector_filter\",\n            \"english_emoji\",\n            \"english_stop\",\n            \"english_keywords\",\n            \"english_stemmer\"\n          ]\n        }\n      }\n    }\n  },\n  \"mappings\": {\n    \"properties\": {\n      \"content\": {\n        \"type\": \"text\",\n        \"analyzer\": \"english_with_emoji\"\n      }\n    }\n  }\n}\n```\n\nYou can now test the result with:\n\n```json\nGET tweets/_analyze\n{\n  \"field\": \"content\",\n  \"text\": \"🍩 🇫🇷 👩‍🚒 🚣🏾‍♀\"\n}\n```\n\n## How to contribute\n\n### Build from CLDR SVN\n\nYou will need:\n\n- php cli\n- php zip, mbstring, xml and curl extensions\n- a running Elasticsearch (`make start`)\n\nEdit the tag in `tools/build-released.php` and run `php tools/build-released.php`.\n\n### Update emoticons\n\nRun `php tools/build-emoticon.php`.\n\n## Licenses\n\nEmoji data courtesy of CLDR. See [unicode-license.txt](unicode-license.txt) for\ndetails. Some modifications are made to the data, [see\nhere](https://github.com/jolicode/emoji-search/issues/6). Emoticon data based on\n[https://github.com/wooorm/emoticon/](https://github.com/wooorm/emoticon/)\n(MIT).\n\nThis repository is distributed under the [MIT License](LICENSE). Feel free to use\nand contribute as you please!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjolicode%2Femoji-search","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjolicode%2Femoji-search","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjolicode%2Femoji-search/lists"}