{"id":50680395,"url":"https://github.com/yagizerdem/needle","last_synced_at":"2026-06-08T18:04:22.376Z","repository":{"id":361658930,"uuid":"1065442030","full_name":"yagizerdem/needle","owner":"yagizerdem","description":"A regular expression engine built from scratch.","archived":false,"fork":false,"pushed_at":"2026-05-31T20:16:43.000Z","size":101,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-31T22:05:40.091Z","etag":null,"topics":["nfa","parser","regex"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yagizerdem.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-27T18:28:50.000Z","updated_at":"2026-05-31T20:16:46.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/yagizerdem/needle","commit_stats":null,"previous_names":["yagizerdem/regex_engine","yagizerdem/needle"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/yagizerdem/needle","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yagizerdem%2Fneedle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yagizerdem%2Fneedle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yagizerdem%2Fneedle/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yagizerdem%2Fneedle/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yagizerdem","download_url":"https://codeload.github.com/yagizerdem/needle/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yagizerdem%2Fneedle/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34073838,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["nfa","parser","regex"],"created_at":"2026-06-08T18:04:20.394Z","updated_at":"2026-06-08T18:04:22.371Z","avatar_url":"https://github.com/yagizerdem.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Needle — Regex Engine\n\nA regex engine built from scratch in Java, based on Thompson NFA construction. It compiles a regex pattern into an AST, builds an NFA from that AST, and matches input via NFA simulation.\n\n## Features\n\n- Concatenation, alternation (`|`), and grouping (`(...)`)\n- Quantifiers: `*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`\n- Character classes: `[abc]`, `[a-z]`, negation `[^...]`\n- Wildcard: `.`\n- Escapes: `\\` for matching special characters literally\n- Three execution modes:\n  - `exact` \u0026rarr; the entire input must match the pattern\n  - `contains` \u0026rarr; any substring of the input must match\n  - `grep` \u0026rarr; line-by-line search across the given files\n\n## Project Structure\n\n- [src/main/java/Needle/Preprocessor.java](src/main/java/Needle/Preprocessor.java) — Resolves escapes and tags characters\n- [src/main/java/Needle/Lexer.java](src/main/java/Needle/Lexer.java) — Token generation\n- [src/main/java/Needle/Parser.java](src/main/java/Needle/Parser.java) — Recursive-descent parser, produces the AST\n- [src/main/java/Needle/AstNode.java](src/main/java/Needle/AstNode.java) — AST node definitions\n- [src/main/java/Needle/NfaBuilder.java](src/main/java/Needle/NfaBuilder.java) — Thompson NFA construction from the AST\n- [src/main/java/Needle/NfaSimulation.java](src/main/java/Needle/NfaSimulation.java) — NFA simulation (the matching engine)\n- [src/main/java/Needle/Core.java](src/main/java/Needle/Core.java) — High-level API for compile + match\n- [src/main/java/Needle/Main.java](src/main/java/Needle/Main.java) — CLI entry point\n- [src/main/java/Needle/cfg.md](src/main/java/Needle/cfg.md) — Supported grammar (BNF)\n\n## Grammar\n\nFor the full supported grammar, see [src/main/java/Needle/cfg.md](src/main/java/Needle/cfg.md).\n\n## Build\n\n```powershell\n.\\gradlew.bat build\n```\n\n## Test\n\n```powershell\n.\\gradlew.bat test\n```\n\nTest classes:\n\n- [src/test/java/LexerTest.java](src/test/java/LexerTest.java)\n- [src/test/java/AstPrinterTest.java](src/test/java/AstPrinterTest.java)\n- [src/test/java/NfaSimulationTest.java](src/test/java/NfaSimulationTest.java)\n- [src/test/java/SubstringSearchTest.java](src/test/java/SubstringSearchTest.java)\n\n## Usage\n\nCLI arguments:\n\n| Argument                         | Description                                             |\n|----------------------------------|---------------------------------------------------------|\n| `--regex=\u003cpattern\u003e`              | Regex pattern                                           |\n| `--input=\u003ctext\u003e`                 | Input to match against (for `exact` / `contains` modes) |\n| `--mode=\u003cexact\\|contains\\|grep\u003e` | Execution mode (default: `exact`)                       |\n| `--files=f1,f2,...`              | Files to scan in `grep` mode                            |\n\n### Examples\n\nExact match:\n\n```powershell\njava -jar needle.jar --regex=\"a(b|c)+\" --input=\"abccb\" --mode=exact\n```\n\nSubstring search:\n\n```powershell\njava -jar needle.jar --regex=\"[0-9]{3}\" --input=\"code: 421 ok\" --mode=contains\n```\n\nGrep across files:\n\n```powershell\njava -jar needle.jar --regex=\"error\" --mode=grep --files=log1.txt,log2.txt\n```\n\n## Architecture\n\n```\nregex string\n    │\n    ▼\nPreprocessor  ── resolves escapes, produces a Pchar list\n    │\n    ▼\nLexer         ── token stream\n    │\n    ▼\nParser        ── AST\n    │\n    ▼\nNfaBuilder    ── NFA via Thompson's construction\n    │\n    ▼\nNfaSimulation ── simulates the NFA over the input\n    │\n    ▼\nresult (boolean / matching lines)\n```\n## Author\nYagiz Erdem \\\nyagizerdem819@gmail.com\n\n## License\nReleased under MIT License","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyagizerdem%2Fneedle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyagizerdem%2Fneedle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyagizerdem%2Fneedle/lists"}