{"id":19500582,"url":"https://github.com/naturalintelligence/grapes","last_synced_at":"2025-08-31T08:44:05.201Z","repository":{"id":66140674,"uuid":"48052485","full_name":"NaturalIntelligence/Grapes","owner":"NaturalIntelligence","description":"Flexible Regular Expression","archived":false,"fork":false,"pushed_at":"2018-04-10T23:38:36.000Z","size":267,"stargazers_count":18,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-25T23:33:49.066Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NaturalIntelligence.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2015-12-15T15:43:06.000Z","updated_at":"2024-08-12T06:51:28.000Z","dependencies_parsed_at":"2024-01-23T21:35:18.248Z","dependency_job_id":null,"html_url":"https://github.com/NaturalIntelligence/Grapes","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/NaturalIntelligence/Grapes","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NaturalIntelligence%2FGrapes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NaturalIntelligence%2FGrapes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NaturalIntelligence%2FGrapes/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NaturalIntelligence%2FGrapes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NaturalIntelligence","download_url":
"https://codeload.github.com/NaturalIntelligence/Grapes/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NaturalIntelligence%2FGrapes/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272959416,"owners_count":25022056,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T22:08:58.603Z","updated_at":"2025-08-31T08:44:05.170Z","avatar_url":"https://github.com/NaturalIntelligence.png","language":"Java","funding_links":["https://liberapay.com/amitgupta/donate","https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=KQJAX48SPUKNC"],"categories":[],"sub_categories":[],"readme":"# Grapes\n\n![Logo](assets/grapes_logo.svg)\n\n\u003ca href=\"https://liberapay.com/amitgupta/donate\"\u003e\u003cimg alt=\"Donate using Liberapay\" src=\"https://liberapay.com/assets/widgets/donate.svg\"\u003e\u003c/a\u003e\n\n\n## Maintainers are needed\nThough it needs just a few weeks of work, I'm quite engaged in the maintenance and development of other open source projects. I would appreciate it if someone would raise a PR to get this done.\n\n## History\nI started this project by mistake... yeah, you heard me right. I was developing a fast NLP tokenizer. 
One day I needed a regular expression (RE) feature that could help me validate some dynamic patterns. Since I was not aware that the feature already existed in RE, I decided to code it myself. Initially I thought of modifying an existing RE engine to support the new feature. But since I am a bit lazy about reading books, I couldn't read a whole book on automata theory to learn how an RE engine works (I don't remember how I passed the automata theory exam in college), so I decided to develop my own RE engine. And that's how this project started. Funny but true.\n\nLater on I realized that it was 3 times faster than Java RE, so I decided to use Grapes instead of RE in my project. Then I introduced new features that are not even present in current RE, which are discussed later in this README.\n\nCurrently Grapes is 1.5-2 times faster than current Java RE.\n\n**Contribute** : You can contribute to this project by adding the new features mentioned at the end of this README, by testing and finding bugs, by writing more unit tests to cover scenarios that I may have missed, or by donating to buy me some time. [![Donate to author](https://www.paypalobjects.com/webstatic/en_US/btn/btn_donate_92x26.png)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick\u0026hosted_button_id=KQJAX48SPUKNC)\n\n## Description\nGrapes is a kind of regular expression engine that makes string comparison faster.\n\nLet's understand it with some examples:\n\n1. Create the Grapes sequence as you would create an RE (not all RE features are supported currently)\n\n  ```java\n\tBooleanSequence seq = new BooleanSequence(\"ab(cde|c)?mn\");\n  ```\n2. Compile it and minimize it. It is better to do this at the start of your application.\n\n  ```java\n\tseq.compile().minimize();\n  ```\n  \n3. Use an appropriate Matcher\n\n  ```java\n\tCoreMatcher matcher = seq.getCoreMatcher();\n  ```\n4. 
Assert\n\n  ```java\n    assertTrue(matcher.match(\"abmn\".toCharArray()));\n    assertTrue(matcher.match(\"abcdemn\".toCharArray()));\n    assertTrue(matcher.match(\"abcmn\".toCharArray()));\n  ```\n\nCurrently 3 types of matcher are supported.\n\n1. **Core Matcher** (as explained above)\n2. **Progressive Matcher** : You need not pass the complete string for comparison in one go. This is useful for streams.\n  \n  ```java\n\tBooleanSequence seq = new BooleanSequence(\"a([bc])d(mn|o)\\\\1a\\\\2\");\n\tseq.capture = true;\n\tseq.compile().minimize();\n\tProgressiveMatcher matcher = new ProgressiveMatcher(seq);\n\n\tAssert.assertEquals(FAILED,matcher.match());\n\tAssert.assertEquals(MATCHED,matcher.match(\"ab\".toCharArray()));\n\tAssert.assertEquals(MATCHED,matcher.match(\"abdob\".toCharArray()));\n\tAssert.assertEquals(PASSED,matcher.match(\"abdobao\".toCharArray()));\n  ```\n  \n3. **Lazy Matcher** : Same as the Progressive Matcher, but on each call you pass only the newly added part of the string.\n  \n  ```java\n\tBooleanSequence seq = new BooleanSequence(\"a([bc])d(mn|o)\\\\1a\\\\2\");\n\tseq.capture = true;\n\tseq.compile().minimize();\n\tLazyMatcher matcher = new LazyMatcher(seq);\n\n\tAssert.assertEquals(FAILED,matcher.match());\n\tAssert.assertEquals(MATCHED,matcher.match(\"ab\".toCharArray()));\n\tAssert.assertEquals(MATCHED,matcher.match(\"dob\".toCharArray()));\n\tAssert.assertEquals(PASSED,matcher.match(\"ao\".toCharArray()));\n  ```\n  \nThese are the 3 matchers I initially created, but you can create your own matchers with different features. Currently they accept char[] as input, but you can write matchers that accept byte[], a list, etc. to make them faster. I have created one sample matcher under the matcher package.\n\n## Features\nIn addition to this, there are many other features:\n\n* **JSON view** : You can convert a Grapes sequence to JSON. 
I have created a temporary visualization tool to understand how a sequence is evaluated.\n  \n  ```java\n\tRESequenceUtil.toJson(reSeq);\n  ```\n* **Merge Sequences** : You can merge any number of sequences.\n  \n  ```java\n\tBooleanSequence rootSeq = new BooleanSequence(\"\");rootSeq.compile().minimize();\n\tBooleanSequence angleSeq = new BooleanSequence(\"-?0(.0)?°\",TokenType.ANGLE); angleSeq.compile().minimize();\n\tBooleanSequence cordinateSeq = new BooleanSequence(\"-?0(.0)?° -?0(.0)?['′] -?0(.0)?[\\\"″] a\",TokenType.G_Cordinate);cordinateSeq.compile().minimize();\n\tBooleanSequence tempratureSeq = new BooleanSequence(\"-?0(.0)?° ?a\",TokenType.TEMPRATURE);tempratureSeq.compile().minimize();\n\n\trootSeq.merge(angleSeq).merge(cordinateSeq).merge(tempratureSeq);\n\t \n\tProgressiveMatcher matcher = rootSeq.getProgressiveMatcher();\n\t \n\tAssert.assertEquals(TokenType.ANGLE,matcher.match(\"0°\".toCharArray()));\n\tAssert.assertEquals(TokenType.G_Cordinate,matcher.match(\"0° 0′ 0″ a\".toCharArray()));\n  ```\n  \n* **Custom return type** : Instead of just returning whether your string matches a sequence, a matcher can also return a custom type. This is mainly helpful when you merge multiple types of expressions; see the Merge Sequences example above. By default it returns PASSED, MATCHED, or FAILED.\n\t\n* **Optimized expressions** : Whether you write \"bc|bcd|bcde\", \"bc(d|de)?\", or \"bc(de?)?\", the Grapes parser produces the same sequence. Hence all 3 expressions are evaluated with the same performance.\n\nand more ...\n\n### Supported RE symbols\nCurrently Grapes supports the following RE symbols:\n\n* Range Selector : '[', ']' e.g. [a-zA-Z0-9], [abc]\n* Grouping, Capture, Sub Expression using '(', ')'\n* Optional '?'\n* Or '|'\n* Any '.'\n* Dynamic selection \\\\1 up to 9 *forgive my laziness in supporting only up to 9*\n\n\n### Next plan\nMy immediate plans:\n* Support frequency {min,max}, {,max}, {min,}, '+', '*'\n* Convert into a Gradle project. 
This lets you use the generated jar directly instead of compiling the project yourself [done]\n\n\n### Future Plan\n* Improving \"JSON to Graph\" view tool\n* Laziness in case of dynamic selection\n* Making it thread safe\n* Supporting more RE symbols\n\n\n### Worth mentioning\n\n- **[निम्न (NIMN)](https://github.com/nimndata/spec)** : Schema aware object compression. 60% more compressed than JSON. 40% more compressed than msgpack.\n- **[imglab](https://github.com/NaturalIntelligence/imglab)** : Web-based tool to label images for objects, so that they can be used to train dlib or other object detectors. You can integrate 3rd party libraries for fast labeling.\n- [fast-lorem-ipsum](https://github.com/amitguptagwl/fast-lorem-ipsum) : Generate lorem ipsum words, sentences, and paragraphs very quickly.\n- [stubmatic](https://github.com/NaturalIntelligence/Stubmatic) : A stub server to mock behaviour of HTTP(s) / REST / SOAP services.\n- [अनुमार्गक (anumargak)](https://github.com/NaturalIntelligence/anumargak) : Amazingly fast router for node web servers.\n- [fastify-xml-body-parser](https://github.com/NaturalIntelligence/fastify-xml-body-parser/) : Fastify plugin / module to parse XML payload / body into JS object using fast-xml-parser.\n- [Grapes](https://github.com/amitguptagwl/grapes) : Flexible regular expression engine (for Java) which can be applied to a char stream. (under development)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnaturalintelligence%2Fgrapes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnaturalintelligence%2Fgrapes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnaturalintelligence%2Fgrapes/lists"}