{"id":17284644,"url":"https://github.com/rangoo94/object-regexp","last_synced_at":"2025-03-26T16:25:59.343Z","repository":{"id":57312736,"uuid":"119565531","full_name":"rangoo94/object-regexp","owner":"rangoo94","description":"Match regular expressions on list of objects","archived":false,"fork":false,"pushed_at":"2018-02-10T11:01:44.000Z","size":164,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-18T10:49:23.734Z","etag":null,"topics":["formula","js-engine","objects","parser","quantifiers","regexp","regular-expression"],"latest_commit_sha":null,"homepage":null,"language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rangoo94.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-30T16:49:44.000Z","updated_at":"2019-01-03T12:24:18.000Z","dependencies_parsed_at":"2022-09-20T23:02:41.888Z","dependency_job_id":null,"html_url":"https://github.com/rangoo94/object-regexp","commit_stats":null,"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rangoo94%2Fobject-regexp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rangoo94%2Fobject-regexp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rangoo94%2Fobject-regexp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rangoo94%2Fobject-regexp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rangoo94","download_url":"https://codeload.github.com/rangoo94/object-regexp/tar.gz/
refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245691038,"owners_count":20656693,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["formula","js-engine","objects","parser","quantifiers","regexp","regular-expression"],"created_at":"2024-10-15T09:54:37.204Z","updated_at":"2025-03-26T16:25:59.318Z","avatar_url":"https://github.com/rangoo94.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Regular Expressions for Objects\n\n[![Travis](https://travis-ci.org/rangoo94/object-regexp.svg)](https://travis-ci.org/rangoo94/object-regexp)\n[![Code Climate](https://codeclimate.com/github/rangoo94/object-regexp/badges/gpa.svg)](https://codeclimate.com/github/rangoo94/object-regexp)\n[![Coverage Status](https://coveralls.io/repos/github/rangoo94/object-regexp/badge.svg?branch=master)](https://coveralls.io/github/rangoo94/object-regexp?branch=master)\n[![NPM Downloads](https://img.shields.io/npm/dm/object-regexp.svg)](https://www.npmjs.com/package/object-regexp)\n\nIt matches expressions with a regexp-like syntax against an array of objects.\n\n## What is it useful for?\n\nThe most important use case for this package is parsers,\nas it can help build a syntax tree out of tokens with a simple and familiar syntax.\n\nBecause of this use case it is **very efficient; optimizations here are made even at the `0.000001ms` level (`1e-6ms` or `1e-3μs`)**.\nIt works best on the newest V8 engines.\n\n## Optimizations put into\n\nAs this library works at such speed, there are a lot of 
optimizations put inside, i.e.:\n\n- **Hidden classes optimizations**\n  keeping proper order (and types) of both initialized and mutated properties\n- **Only fast constructions**\n  `else`/`else if` are too slow on this level of optimization\n- **Better typing**\n  Some instructions are built using `eval` to allow the JS engine to work faster on them\n- **Less context switch**\n  Accessing different contexts is always slow, calling functions most of the time, causing a lot of `ContextifyScript::New` events\n- **Optimizing inline caches**\n  Give the JS engine fewer cases to cover, by using the same types and separating those that differ\n- ...many, many others.\n\n## How to install\n\nThe package is available as `object-regexp` in NPM, so you can add it to your project using\n`npm install object-regexp` or `yarn add object-regexp`\n\n## What are requirements?\n\nThe code itself is written in ES6 and should work in a Node.js 6+ environment (best in Node.js 9+).\nIf you would like to use it in a browser or an older environment, there is also a transpiled and bundled (UMD) version included.\nYou can use `object-regexp/browser` in your requires or `ObjectRegexp` in the global environment (in a browser):\n\n```js\n// Load library\nconst ObjectRegexp = require('object-regexp/browser')\n\nconst expression = '[ReservedWord][Space]+[Variable]'\nconst match = ObjectRegexp.compile(expression)\n\nconst objects = [ { type: 'ReservedWord' } ]\n\nconsole.log(match(objects))\n```\n\n## What is proper input for library?\n\nMost importantly, it should be an array of objects. If you would like to use syntax like `[ObjectType]`,\nthese objects should have a `type` property. Additionally, if you would like to match by value (`[ObjectType=value]`),\nthe library matches that value against `object.data.value`. 
Example input:\n\n```js\nconst input = [\n  { type: 'ReservedWord', data: { value: 'declare' } },\n  { type: 'Space' },\n  { type: 'Space' },\n  { type: 'NewLine' },\n  { type: 'Variable' }\n]\n```\n\n## How does syntax look?\n\nSyntax is very similar to regular expressions, i.e.:\n\n```js\n[ReservedWord=declare][Space|NewLine]+([Variable]|[Number][Unit])\n```\n\nwhich could match the previous input.\n\n### Possible syntax options\n\n#### Object types\n\nYou can match against the `type` property in objects using `[Type]` syntax, for example:\n\n```js\nconst input = [ { type: 'Rule' } ]\nconst matchingExpression = '[Rule]'\n```\n\nAlso, using the `[Type=value]` format you can match against the `data.value` property:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'abrakadabra' } } ]\nconst matchingExpression = '[Rule=abrakadabra]'\n```\n\nThere is also a way to simplify alternatives, using the `|` character inside:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'abrakadabra' } } ]\nconst matchingExpression = '[Rule=abrakadabra|Anything|Else=xyz]'\n```\n\nThe rule above will match any object which is either `Rule=abrakadabra`, `Anything` or `Else=xyz`.\n\n#### Negated object types\n\nYou can also match objects which are NOT as specified, in a similar way to regular expressions:\n\n```js\nconst input = [ { type: 'Rule' } ]\nconst matchingExpression = '[^Something]'\n```\n\nIt will match any object whose type is different from `Something`. 
Same as in basic object types,\nyou can check the value:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'regexp' } } ]\nconst matchingExpression = '[^Rule=abrakadabra]'\n```\n\nThis expression will match the input, as it has a different value.\n\nAs with basic object types (and even more usefully), you can combine a few types:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'regexp' } } ]\nconst matchingExpression = '[^Rule=abrakadabra|Anything|Else=xyz]'\n```\n\nThis rule will match everything that is NOT any of these rules (instead of using OR it uses AND).\n\n#### Any object\n\nUsing `.` you can match any object:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'regexp' } }, { type: 'Rule', data: { value: 'regexp' } } ]\nconst matchingExpression = '..' // expects 2 objects, no matter what is inside\n```\n\n#### Alternatives\n\nYou can match with simple alternatives using the `|` sign.\nEverything in the current block on the left side of `|` is the first option, everything on the right is the second.\n\nExamples:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'regexp' } } ]\n\nconst expression1 = '[Rule]|[Rule2]' // will match object which is `Rule` or `Rule2`\nconst expression2 = '[Rule]|[Rule2][Rule3]' // will match either `[Rule]` or `[Rule2][Rule3]`\nconst expression3 = '[Rule]|[^Rule4]' // will first try to match `[Rule]`, otherwise `[^Rule4]`\nconst expression4 = '[Rule]|[Rule2]|[^Rule4]' // you can chain them as well\n```\n\n#### Optionals\n\nOptionals are simple alternatives: the element is either found or not. Use the `?` sign for that:\n\n```js\nconst input = [ { type: 'Rule', data: { value: 'regexp' } } ]\nconst matchingExpression = '[OtherRule]?' // it will match, as this `OtherRule` may or may not be present.\n```\n\n#### Groups\n\nYou can define groups. 
Simple groups are not captured, but can be used to apply a rule such as the optional above to several objects at once:\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\nconst matchingExpression = '([Rule][Rule][Rule])?' // optional will check for 3 objects\n```\n\nAlso, if you would like to get the content from inside, you can use named groups:\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// Optional will check for 3 objects, and as a result you will get information about them\nconst matchingExpression = '(?\u003cname\u003e[Rule][Rule][Rule])?'\n```\n\n#### Expected number of occurrences\n\nSimilar to regexp, we've got four ways to describe the expected number of occurrences:\n\n##### At least N objects\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match at least 2 objects, but catch as many as it can (this time it's 3)\nconst matchingExpression = '[Rule]{2,}'\n```\n\n##### Maximum N objects\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match at most 5 objects, but will allow a smaller number of objects\nconst matchingExpression = '[Rule]{,5}'\n```\n\n##### Amount between\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match between 2 and 5 objects, trying to catch as many as it can\nconst matchingExpression = '[Rule]{2,5}'\n```\n\n##### Exact amount\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match only 3 objects, it's the equivalent of [Rule][Rule][Rule]\nconst matchingExpression = '[Rule]{3}'\n```\n\n#### \"Any\" quantifier\n\nAs in regular expressions, there is an \"Any\" quantifier, represented by `*`.\nIt searches for as many objects as it can, but it will accept no objects as well.\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match as many objects as it can, this time 3\nconst matchingExpression = 
'[Rule]*'\n\n// It will match as many objects as it can, this time 0\nconst emptyMatchingExpression = '[UnknownRule]*'\n```\n\nThis is a greedy quantifier; if you would like a lazy quantifier, use `*?`.\nThe difference is that the **lazy (`*?`)** quantifier will try to gather as few objects as it can,\nwhile the **greedy (`*`)** one will try to get as many as it can:\n\n```js\nconst input = [ { type: 'Rule' } ]\nconst matchingExpression = '[Rule]*?' // It will catch nothing\n\nconst input2 = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'AnotherRule' } ]\n\n// This time it will catch two `[Rule]` objects, to satisfy root expression (finding `AnotherRule` later).\nconst matchingExpression2 = '[Rule]*?[AnotherRule]'\n```\n\n#### \"Many\" quantifier\n\nThere is also a \"Many\" quantifier (`+`), which is very similar to \"Any\".\nThe only difference is that it will fail if no objects are found.\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will match as many objects as it can, this time 3\nconst matchingExpression = '[Rule]+'\n\n// It will not match :(\nconst notMatchingExpression = '[UnknownRule]+'\n```\n\nThere is also a lazy version:\n\n```js\nconst input = [ { type: 'Rule' }, { type: 'Rule' } ]\n\n// It will catch a single `Rule`, as it's the smallest amount it can accept\nconst matchingExpression = '[Rule]+?'\n\nconst input2 = [ { type: 'Rule' }, { type: 'Rule' }, { type: 'AnotherRule' } ]\n\n// It will catch two `[Rule]` objects, to satisfy root expression (finding `AnotherRule` later).\nconst matchingExpression2 = '[Rule]+?[AnotherRule]'\n```\n\n#### Atomic groups\n\nIf you would like an expression to work faster, consider atomic groups (and possessive quantifiers).\nOnce such a group has finished, it removes its save points, so the engine cannot backtrack into it.\n\nSee example:\n\n```js\nconst rules = [ { type: 'A' }, { type: 'A' }, { type: 'A' } ]\n\n// It will get all A's to first quantifier ([A]+), but when it will try to get ending 
A,\n// it will recover to [A]+ with 2 elements (as no other A's are left).\n// So, because of recovering, this expression will MATCH rules above.\nconst matchingExpression = '[A]+[A]'\n\n// It will get all A's to first quantifier. It will satisfy atomic group,\n// but nothing will be left for ending [A], so expression will fail.\nconst failingExpression = '(?\u003e[A]+)[A]'\n```\n\nYou can pass any number of sub-instructions to atomic groups.\n\n#### Possessive quantifiers\n\nThere is a simpler notation for constructs like `(?\u003e[A]+)`: possessive quantifiers.\nYou can make `?`, `+`, `*` possessive by appending a `+` sign, giving respectively `?+`, `++`, `*+`.\nThese are equivalent to `(?\u003e[A]?)`, `(?\u003e[A]+)` and `(?\u003e[A]*)`.\n\nAlso, you can do the same for the `Amount at least`, `Amount at most` and `Amount between` quantifiers,\njust by adding a `+` sign after them.\n\n#### Start and end index\n\nIn regular expressions the `^` and `$` signs are commonly used to match the beginning and end of a string.\n\nFor performance reasons we don't have (yet?) 
starting index (`^`), but we have the `$` sign.\n\nIf you would like to search from a different index (than the beginning),\nlook at the **\"Searching from different index than beginning\"** chapter.\n\nExample:\n\n```js\nconst rules = [ { type: 'A' }, { type: 'A' }, { type: 'A' } ]\nconst matchingExpression = '[A][A]'\nconst matchingExpression2 = '[A][A][A]$'\nconst notMatchingExpression = '[A][A]$'\n```\n\n#### Missing parts of syntax\n\nThe most important features for parsing are there, but some regular expression features are still missing:\n\n- Beginning index (`^`)\n- Negative and positive lookaheads (`?!` and `?=`)\n\n### How to use it\n\nThe `object-regexp` package exports the `compile` and `toCode` methods.\n\n```js\nconst compile = require('object-regexp').compile\nconst toCode = require('object-regexp').toCode\n\nconst expression = '[Space]+'\nconst match = compile(expression)\n\nconst objects = [ { type: 'Space' }, { type: 'Space' }, { type: 'Space' } ]\n\n// Match dynamic expressions\nconsole.log(match(objects))\n\n// Save standalone code of expression\nrequire('fs').writeFileSync('expression.js', 'module.exports = ' + toCode(expression))\n```\n\nFormat of success result:\n\n```js\nconst result = {\n  finished: true, // This expression is fully finished\n  index: 0, // beginning index for searching\n  length: 10, // number of objects which are matching this expression\n  expectations: [\n    // even a successful expression can be continued,\n    // so sometimes you may want expectations to extend it\n    { type: 'oneOf', step: 1, options: [ { type: 'Space' }, { type: 'NewLine' } ] },\n    { type: 'notOneOf', step: 4, options: [ { type: 'Space' }, { type: 'NewLine' } ] },\n    { type: 'any', step: 10 }\n  ],\n  groups: {\n    group1: { from: 0, to: 3 } // objects for named group `group1` found between indexes 0 and 3\n  }\n}\n```\n\nWhen there is no way to continue this expression, `expectations` will be `null`.\n\nFormat of failed result which 
**CAN'T** be continued:\n\n```js\nnull\n```\n\nIt is always just `null`.\n\nFormat of failed result which **CAN** be continued with some objects:\n\n```js\nconst failedResult = {\n  finished: false,\n  expectations: [\n    { type: 'oneOf', step: 1, options: [ { type: 'Space' }, { type: 'NewLine' } ] },\n    { type: 'notOneOf', step: 5, options: [ { type: 'Space' }, { type: 'NewLine' } ] },\n    { type: 'any', step: 3 }\n  ]\n}\n```\n\nAs you can see, a failed result can return some expectations.\nIf you passed all the objects you have, it means that matching failed.\nOtherwise, if you are adding them one by one,\nit says what the next object should be to allow continuing this expression.\n\nExample:\n\n```js\nconst compile = require('object-regexp').compile\n\nconst expression = '[Space]+[Literal]'\nconst match = compile(expression)\n\nconst objects = [ { type: 'Space' }, { type: 'Space' } ]\n\nconsole.log(match(objects))\n\n/*\n{\n  expectations: [\n    { type: 'oneOf', step: 3, options: [ { type: 'Space' } ] },\n    { type: 'oneOf', step: 4, options: [ { type: 'Literal' } ] }\n  ]\n}\n*/\n```\n\nAs you can see, this expression couldn't be finished because of the lack of a `Literal`.\nThe engine assumes that you may send something more to satisfy the matcher.\nIn this case, you can either send a `Space` object (and later again `Space` or `Literal`)\nor a `Literal` object to finish the expression.\n\nThis matters mostly if you parse objects one by one, with rules about which objects are allowed in a given place.\n\n- `oneOf` is the equivalent of a missing `[Type]` rule\n- `notOneOf` is the equivalent of a missing `[^Type]` rule\n- `any` means that it can be any object\n\nSumming up, to check whether an expression has succeeded, you have to check:\n\n```js\n// ...\n\nconst result = match(objects)\nconst succeeded = result \u0026\u0026 result.finished\n```\n\n### Macros\n\nAdditionally, to make life simpler there is a way to pass macros for an expression.\nIt's very useful when you are making a lot of rules, 
and you would prefer, e.g.\n\n```\n@for (var:$var) #from (from:$value) (how:#to|#through) (end:$value)\n```\n\nover\n\n```\n[AtRule=for][Space|NewLine]+(?\u003cvar\u003e[Variable])[Space|NewLine]+[Literal=from][Space|NewLine]+(?\u003cfrom\u003e[String|Variable|Number])[Space|NewLine]+(?\u003chow\u003e[Literal=to|Literal=through])[Space|NewLine]+(?\u003cend\u003e[String|Variable|Number])\n```\n\nsyntax.\n\nMacros have a pretty simple format:\n\n```js\nconst macros = [\n  {\n    // You can use named groups, and apply them to result\n    // Also, $0, $1, $2... work for captured groups\n    from: '@(?\u003cname\u003e\\\\w+)',\n    to: '[AtRule=$name]'\n  },\n  {\n    // 'from' is a regular expression,\n    // so you have to escape every reserved sign you want to use literally\n    from: '\\\\$var',\n    to: '[Variable]'\n  },\n  {\n    from: '\\\\$value',\n    to: '[String|Variable|Number]'\n  },\n  {\n    from: ' ',\n    to: '[Space|NewLine]+'\n  },\n  {\n    // It replaces the expression before parsing, so you can even change the syntax a little:\n    from: '\\\\((?\u003cname\u003e[a-zA-Z]+):',\n    to: '(?\u003c$name\u003e'\n  },\n  {\n    from: '#(?\u003cname\u003e[a-zA-Z]+)',\n    to: '[Literal=$name]'\n  }\n]\n```\n\nTo apply macros to an expression, use the second parameter of the `compile` method:\n\n```js\nconst compile = require('object-regexp').compile\n\nconst expression = '(?\u003cspacing\u003e[Space]+)[Literal]'\nconst macros = [ /* ... */ ]\nconst match = compile(expression, macros)\n\nconst objects = [ { type: 'Space' }, { type: 'Space' } ]\n\nconsole.log(match(objects))\n```\n\n### Searching from different index than beginning\n\nYou don't have to search from the beginning of the list; there is also a `startIndex` parameter:\n\n```js\nconst compile = require('object-regexp').compile\n\nconst expression = '(?\u003cspacing\u003e[Space]+)[Literal]'\nconst macros = [ /* ... 
*/ ]\nconst match = compile(expression, macros)\n\nconst objects = [ { type: 'X' }, { type: 'Space' }, { type: 'Space' } ]\n\nconsole.log(match(objects, 1))\n```\n\n### Beautify code parameter\n\nBy default there is no indentation preserved when you are generating your matching code.\n\nIf you would like to beautify this code,\nyou can pass third parameter to `toCode` function:\n\n```js\nconst toCode = require('object-regexp').toCode\n\nconst expression = '(?\u003cspacing\u003e[Space]+)[Literal]'\nconst code = toCode(expression, null, true)\n```\n\n\u003e **Remember:**\n\u003e\n\u003e FOR BETTER PERFORMANCE DO NOT EDIT THESE FILES (UGLIFY AT MOST).\n\u003e EVEN REDUNDANT `ok = true` IS THERE TO MAKE IT FASTER.\n\n## Changelog\n\n### Version 2\n\n- **2.0.0** - inline all instructions, add optimizations (2-100x faster than v1), write tests\n\n### Version 1\n\n- **1.3.6** - optimize a lot, mostly groups and possessive instructions\n- **1.3.5** - fix `walkBackEnd` traverser to always go from end\n- **1.3.4** - added `startIndex` parameter for processing functions\n- **1.3.3** - fix serialization function (add missing node types)\n- **1.3.2** - optimize last atomic instructions (on the end of `Root` or `AtomicGroup`)\n- **1.3.1** - added end index sign (`$`)\n- **1.3.0** - add possessive quantifiers and atomic groups\n- **1.2.1** - add missing `universal-lexer` dependency\n- **1.2.0** - return `expectations` even if expression succeed\n- **1.1.6** - fix critical problem with `Many Lazy` and `Any Lazy` formulas\n- **1.1.5** - fix critical problem with `Any Object` formula\n- **1.1.4** - just rebuild broken NPM package\n- **1.1.3** - optimize going between nodes while processing instruction (works up to 2x faster)\n- **1.1.2** - fix problems with `Nothing` rule\n- **1.1.1** - optimize simple instructions (works 2-3x faster)\n- **1.1.0** - fix problem with `Many Lazy` and `Any Lazy`, add documentation to processing function\n- **1.0.2** - add information about `Exact 
amount` quantifier\n- **1.0.1** - small fixes for README file\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frangoo94%2Fobject-regexp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frangoo94%2Fobject-regexp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frangoo94%2Fobject-regexp/lists"}