{"id":28754233,"url":"https://github.com/zeroasterisk/cakephp-elasticsearchindex","last_synced_at":"2025-10-19T09:22:44.616Z","repository":{"id":13047318,"uuid":"15727322","full_name":"zeroasterisk/CakePHP-ElasticSearchIndex","owner":"zeroasterisk","description":"CakePHP Plugin to facilitate full text indexing across any Model via ElasticSearch","archived":false,"fork":false,"pushed_at":"2021-12-15T20:44:30.000Z","size":306,"stargazers_count":22,"open_issues_count":3,"forks_count":7,"subscribers_count":9,"default_branch":"master","last_synced_at":"2023-07-17T07:02:32.660Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zeroasterisk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-01-08T06:18:06.000Z","updated_at":"2023-07-17T07:02:32.661Z","dependencies_parsed_at":"2022-08-28T06:03:18.498Z","dependency_job_id":null,"html_url":"https://github.com/zeroasterisk/CakePHP-ElasticSearchIndex","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/zeroasterisk/CakePHP-ElasticSearchIndex","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zeroasterisk%2FCakePHP-ElasticSearchIndex","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zeroasterisk%2FCakePHP-ElasticSearchIndex/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zeroasterisk%2FCakePHP-ElasticSearchIndex/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zeroasterisk%2FCakePHP-ElasticSearchIndex/manifests","owner_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zeroasterisk","download_url":"https://codeload.github.com/zeroasterisk/CakePHP-ElasticSearchIndex/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zeroasterisk%2FCakePHP-ElasticSearchIndex/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260269459,"owners_count":22983647,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-17T01:08:00.199Z","updated_at":"2025-10-19T09:22:44.523Z","avatar_url":"https://github.com/zeroasterisk.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Elastic Search Index\n\nThis plugin allows for a very easy search index powered by\n[ElasticSearch](http://www.elasticsearch.org/)\nwith all kinds of\n[Lucene](http://lucene.apache.org/)\npowered goodness. *(it powers GitHub)*\n\nWith this, you keep your models on your own normal (default) datasource.  All\nsaves and finds and joins and callbacks... normal.\n\nBut when you attach this behavior, you now have additional callbacks which\ngather the data you want to use as a search index... 
it stores that data to\nElasticSearch via its own datasource, `index` as set up via the (above) Elastic\nplugin.\n\n![ModelData: normal + SearchIndex: ElasticSearch](https://www.filepicker.io/api/file/zCgRIjKzTYG8jScsLtzE)\n\nWhat you end up with is having your cake and eating it too.\n\n* Your Model and datasource are unchanged and work as before.\n * all your data is still where it has always been\n * you can still do joins\n * non-search conditions can still work on the normal fields\n* The searchy goodness of ElasticSearch / Lucene is available to you\n * The indexed string for each record is a customizable second copy of the data's text\n * It's available on ElasticSearch for any other usage as well\n\nNow you can search by\n\n* term: `foo`\n* multi-term: `foo bar`\n* partials: `fo*`\n* partials in the front: `*oo`\n* phrases: `\"foo bar\"`\n* fuzzy term: `~bars` *(prefix with `~`)*\n* ... and more ... (suggestions?)\n\n*Note: it is working great, but we could use more ElasticSearch special sauce if\nyou want to help improve it.*\n\n## Install\n\nTo get this plugin into place, you also need to get the [Icing](https://github.com/AudiologyHoldings/Icing) Plugin:\n\n```\ngit submodule add https://github.com/zeroasterisk/CakePHP-ElasticSearchIndex app/Plugin/ElasticSearchIndex\ngit submodule add https://github.com/AudiologyHoldings/Icing app/Plugin/Icing\n```\n\nor if you don't like submodules:\n\n```\ngit clone https://github.com/zeroasterisk/CakePHP-ElasticSearchIndex app/Plugin/ElasticSearchIndex\ngit clone https://github.com/AudiologyHoldings/Icing app/Plugin/Icing\n```\n\nCopy the default `ElasticSearchRequest` configuration into your app\nand edit it to suit your setup (*ElasticSearch url/port*).\n\n```\ncp app/Plugin/Icing/Config/elastic_search_request.php.default app/Config/elastic_search_request.php\n```\n\nNote that there's a `default` config and a `test` config which will override\nthe `default` config...  
But only if your tests set the following Configure variable:\n\n```\nConfigure::write('inUnitTest', true);\n```\n\nIn `app/Config/bootstrap.php`, load the plugins:\n\n```\nCakePlugin::load('Icing');\nCakePlugin::load('ElasticSearchIndex');\n```\n\n### Now set up any Models you want to search / index\n\nIn your `Model`, add this behavior:\n\n```\n\tpublic $actsAs = array(\n\t\t'ElasticSearchIndex.ElasticSearchIndexable' =\u003e array(),\n\t);\n```\n\nAnd here are the behaviour config options, with default values:\n\n```\n\tpublic $actsAs = array(\n\t\t'ElasticSearchIndex.ElasticSearchIndexable' =\u003e array(\n\t\t\t// url to the elastic search index for this model/table\n\t\t\t'url' =\u003e null,\n\t\t\t// extra config for ElasticSearchRequest (parsed from URL)\n\t\t\t'index' =\u003e null,\n\t\t\t// extra config for ElasticSearchRequest (parsed from URL, or defaulted to $Model-\u003euseTable)\n\t\t\t'table' =\u003e null,\n\t\t\t// do we build the index automatically via the afterSave callback? 
(recommended)\n\t\t\t'rebuildOnUpdate' =\u003e true,\n\t\t\t// what fields to use when we build the index (ignored if custom method on model)\n\t\t\t//   eg: array('title', 'name', 'email', 'city', 'state',  'country'),\n\t\t\t//       or for all (text/varchar) fields: '*' (default)\n\t\t\t'fields' =\u003e '*',\n\t\t\t// when we build the index, do we find data first?\n\t\t\t//   false: we only have the data which was saved this time\n\t\t\t//   true: we do a find() afterSave() to get all fields\n\t\t\t'queryAfterSave' =\u003e true,\n\t\t\t// limit the search results to this many results\n\t\t\t'limit' =\u003e 200,\n\t\t\t// optional config for HttpSocket (better to configure ElasticSearchRequest)\n\t\t\t'request' =\u003e array(),\n\t\t\t// details needed to link to Model (edge cases)\n\t\t\t'foreignKey' =\u003e false, // primaryKey to save against\n\t\t\t// optional optimizing configuration, register_shutdown_function()\n\t\t\t//   if true, we don't actually save on ElasticSearch until\n\t\t\t//   this script is completed... via register_shutdown_function()\n\t\t\t//   NOTE: this will \"stack\" multiple saves if they happen, in order\n\t\t\t'register_shutdown_function' =\u003e false,\n\t\t),\n\t);\n```\n\n## How to Save Records\n\nIt's **automatic, after every save**: the behaviour will post that record to the ElasticSearch index.\n\nIf you want to manually index any model `$data` arrays (with the fields from\nthis model), in your `Model` you can do so with `saveToIndex($id, $data)`...\n\n```\n// in a Model\n$data = $this-\u003eread(null, '1234');\n$id = $data[$this-\u003ealias][$this-\u003eprimaryKey];\n$success = $this-\u003esaveToIndex($id, $data);\n```\n\nIf you have a simple string you want to index for a record on your `Model`,\nthen you can use `saveIndexDataToIndex($id, $indexString)`\n\n```\n// in a Model\n$success = $this-\u003esaveIndexDataToIndex(1234, 'This is a custom string, this will be indexed');\n```\n\n*(want to index all of your records?  
see below for `reIndexAll()`)*\n\n### Customize the data to save to the Index\n\nYou can specify a few methods on your model, which override the basic functionality.\n\nThere are 3 levels of customization available:\n\n* `getDataForIndex()` getting data, allows you to do a custom `find()` adding contains, etc (output = array)\n* `indexData()` parsing/transforming the data from array --\u003e index string\n* `cleanForIndex()` clean/replace the index string (final pass)\n\nNOTE: if you use `getDataForIndex()` to *get extra data*\nyou will probably also want to use `indexData()` to *convert into a string*\n\n#### Customize Getting Data for the Index: getDataForIndex()\n\nMake this method on your model to control or extend the data gathered for\nindexing...\n\nThis is useful if you need to contain other data.  A blog Post might need to be\nsearchable via the content of the Post as well as all of the comments.\n\n```\n/**\n * This method will customize the 'find' ElasticSearchIndex uses to get data for its index\n *\n * @param mixed $id\n * @return array $record find('first')\n */\npublic function getDataForIndex($id) {\n\treturn $this-\u003efind('first', array(\n\t\t'fields' =\u003e array('id', 'title', 'body', 'date'),\n\t\t'contain' =\u003e array('Comment' =\u003e array('subject', 'body')),\n\t\t'conditions' =\u003e array(\"{$this-\u003ealias}.{$this-\u003eprimaryKey}\" =\u003e $id),\n\t));\n}\n```\n\n#### Customize Parsing Data for the Index: indexData()\n\nMake this method on your Model to process a data array into a string for indexing.\n\n```\n/**\n * This method will customize the parsing of\n * (array)data-\u003e(string)index for ElasticSearchIndex\n *\n * NOTE: the data must be set on the Model-\u003edata already\n *\n * @return string $index\n */\npublic function indexData() {\n\tif (empty($this-\u003edata)) {\n\t\treturn '';\n\t}\n\t// you'd want to customize this a bit more\n\t$data = Hash::filter(Hash::flatten($this-\u003edata));\n\treturn implode(' ', 
$data);\n}\n```\n\nIt should return a string (the text which will be stored in the index).\n\n#### Customize Cleaning Data for the Index: cleanForIndex()\n\nMake this method on your Model to clean or post-process the index text.\nYou can replace terms, characters or whatever you like.\n\n```\n/**\n * This method will customize the cleaning of (string)index data\n *\n * @param string $index\n * @return string $index\n */\npublic function cleanForIndex($index) {\n\t// str_replace() and preg_replace() return the new string, so assign it back\n\tif (!empty($this-\u003ecursewords)) {\n\t\t$index = str_replace($this-\u003ecursewords, '#\u0026$@!', $index);\n\t}\n\t// replace anything which is not alphanumeric or whitespace with a space\n\t$index = preg_replace('#[^0-9a-zA-Z\\s]#', ' ', $index);\n\t// collapse runs of whitespace into a single space\n\t$index = preg_replace('#[\\s]+#', ' ', $index);\n\treturn $index;\n}\n```\n\n## How to re-index all Records\n\nIn any Model you can run `reIndexAll($conditions)` and it will walk through\nyour data and re-index every record... it can be really slow...\n\n```\n// in Controller\n// this is really slow, but it will re-index everything (create/update indexes)\n$statusString = $this-\u003eMyModel-\u003ereIndexAll();\n// or you can pass in any conditions you like to limit the scope of the reIndex\n$statusString = $this-\u003eMyModel-\u003ereIndexAll(array(\n    'modified \u003e' =\u003e date('Y-m-d 00:00:00', strtotime('-2 months')),\n));\n```\n\n**Pro Tip:** the Icing Plugin has a `DoShell` allowing you to easily run any\nmethod on any Model... 
so from a command line you can do:\n\n```\n./cake Icing.do MyModel reIndexAll\n```\n\n## How to Search\n\nThe core search method for this behavior is `esSearchGetKeys`\nwhich returns just the `id`s of the matching `Model` records.\n\n```\n$primaryKeys = $this-\u003eesSearchGetKeys($term);\n```\n\nAnd with `$optionsForElasticSearchRequest` (`limit`, `page`):\n\n```\n$primaryKeys = $this-\u003eesSearchGetKeys($term, $optionsForElasticSearchRequest);\n```\n\nThis is a really useful method; it can easily be added to any `conditions` array.\n\n```\n$conditions = array(\n\t\"{$this-\u003ealias}.{$this-\u003eprimaryKey}\" =\u003e $this-\u003eesSearchGetKeys('Search Term'),\n\t// the rest of my conditions which also have to match\n);\n```\n\nIf you are using the [CakeDC/search](https://github.com/CakeDC/search) plugin,\nyou can use this to make subquery or query filters... **(which is sweet!)**\n\n## How to Search with results Sorted by best match\n\nSearch results are usually sorted by how well each result matches the\nsearch term.  To get that order, pass the `find()` results and the sorted ids\nto `searchResultsResort()`.\n\n```\n$sortedIds = $this-\u003eesSearchGetKeys('Search Term');\n$results = $this-\u003efind('all', array(\n\t'conditions' =\u003e array(\n\t\t\"{$this-\u003ealias}.{$this-\u003eprimaryKey}\" =\u003e $sortedIds\n\t)\n));\n$results = $this-\u003esearchResultsResort($results, $sortedIds);\n```\n\n\n## Convenience Search, Resort, and Return Data\n\nIf you want to just get search results, without any other conditions, it's\nreally simple:\n\n```\n$findAllResults = $this-\u003esearch($term);\n```\n\nAnd here are all the possible parameters...\n\n```\n$findAllResults = $this-\u003esearch($term, $optionsForFindAll, $optionsForElasticSearchRequest);\n```\n\n\n\n\n## Background\n\nThis project is based in large part on the\n[Searchable/SearchIndex](https://github.com/zeroasterisk/Searchable-Behaviour-for-CakePHP)\nPlugin/Behavior and my former fork of it.  The original version stored all of\nthe index data into a MySQL table with a full-text-index.  
That worked pretty\nwell, but it only worked with the MyISAM table engine and it doesn't offer all\nthe sweet search syntax/features.\n\nInitially, this was using the\n[Elastic](https://github.com/dkullmann/CakePHP-Elastic-Search-DataSource)\nPlugin/Datasource and it worked ok... but there were unnecessary complications\ndue to the data storage pattern (as CakePHP nested models) and because all of\nthe data for all of the models was stored in the same \"table\" on ElasticSearch.\nAlso the Elastic model required curl, not bad but not needed.\n\nNow ElasticSearchIndex is using\n[Icing.Lib/ElasticSearch](https://github.com/AudiologyHoldings/Icing/#elasticsearchrequest)\nfor interactions with ElasticSearch.\n\nIt's a little odd to interact with a \"database\" not through a \"datasource\" but\nthe Lib is really an extension of the HttpSocket utility, and it's intended to\nfacilitate both raw interactions (where you manually create whatever data you\nwant to send) and simple cases, with tools to help automate the data you pass.\n\n## Attribution\n\nThis project is an extension of Searchable/SearchIndex and informed by the Elastic DataSource...\nThe base of the work is theirs.  Big thanks!\n\n* https://github.com/dkullmann/CakePHP-Elastic-Search-DataSource\n* https://github.com/connrs/Searchable-Behaviour-for-CakePHP\n  [my fork](https://github.com/zeroasterisk/Searchable-Behaviour-for-CakePHP)\n* https://github.com/AudiologyHoldings/Icing\n\nand of course, you... 
pull requests welcome!\n\n## License\n\nThis code is licensed under the MIT License\n\n\nCopyright (C) 2013--2014 Alan Blount \u003calan@zeroasterisk.com\u003e https://github.com/zeroasterisk/\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies\nof the Software, and to permit persons to whom the Software is furnished to do\nso, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzeroasterisk%2Fcakephp-elasticsearchindex","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzeroasterisk%2Fcakephp-elasticsearchindex","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzeroasterisk%2Fcakephp-elasticsearchindex/lists"}