{"id":19610565,"url":"https://github.com/kaboc/dart_text_parser","last_synced_at":"2025-07-06T16:32:16.820Z","repository":{"id":54170516,"uuid":"315317637","full_name":"kaboc/dart_text_parser","owner":"kaboc","description":"A Dart package for flexibly parsing text into easy-to-handle format according to multiple regular expression patterns.","archived":false,"fork":false,"pushed_at":"2024-12-14T07:53:24.000Z","size":177,"stargazers_count":9,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-27T21:35:09.414Z","etag":null,"topics":["dart","text-parser"],"latest_commit_sha":null,"homepage":"https://pub.dev/packages/text_parser","language":"Dart","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kaboc.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-11-23T13:04:34.000Z","updated_at":"2025-03-02T15:53:18.000Z","dependencies_parsed_at":"2024-11-11T10:31:26.226Z","dependency_job_id":"7af93c74-9528-4eb2-b62c-f4d655f54f04","html_url":"https://github.com/kaboc/dart_text_parser","commit_stats":{"total_commits":93,"total_committers":2,"mean_commits":46.5,"dds":"0.34408602150537637","last_synced_commit":"9045fbd68677d8825476d176ad1c20ad49b42359"},"previous_names":[],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/kaboc/dart_text_parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaboc%2Fdart_text_parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaboc%2Fdart_text_parser
/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaboc%2Fdart_text_parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaboc%2Fdart_text_parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kaboc","download_url":"https://codeload.github.com/kaboc/dart_text_parser/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaboc%2Fdart_text_parser/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263934570,"owners_count":23532167,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dart","text-parser"],"created_at":"2024-11-11T10:30:08.228Z","updated_at":"2025-07-06T16:32:16.745Z","avatar_url":"https://github.com/kaboc.png","language":"Dart","readme":"[![Pub Version](https://img.shields.io/pub/v/text_parser)](https://pub.dev/packages/text_parser)\n[![Dart CI](https://github.com/kaboc/dart_text_parser/workflows/Dart%20CI/badge.svg)](https://github.com/kaboc/dart_text_parser/actions)\n[![codecov](https://codecov.io/gh/kaboc/dart_text_parser/branch/main/graph/badge.svg?token=YTDF6ZVV3N)](https://codecov.io/gh/kaboc/dart_text_parser)\n\nA Dart package for parsing text flexibly according to preset or custom regular expression patterns.\n\n## Usage\n\n### Using the preset matchers (URL / email address / phone number)\n\nThe package has the following preset matchers.\n\n- [EmailMatcher]\n- [UrlMatcher]\n- [UrlLikeMatcher]\n- [TelMatcher]\n\nBelow is an example of using three of the preset matchers except 
for `UrlLikeMatcher`.\n\n```dart\nimport 'package:text_parser/text_parser.dart';\n\nvoid main() {\n  const text = 'abc https://example.com/sample.jpg. def\\n'\n      'john.doe@example.com +1-012-3456-7890';\n\n  final parser = TextParser(\n    matchers: const [\n      EmailMatcher(),\n      UrlMatcher(),\n      TelMatcher(),\n    ],\n  );\n  final elements = parser.parseSync(text);\n  elements.forEach(print);\n}\n```\n\nOutput:\n\n```\nTextElement(matcherType: TextMatcher, matcherIndex null, offset: 0, text: abc , groups: [])\nTextElement(matcherType: UrlMatcher, matcherIndex 1, offset: 4, text: https://example.com/sample.jpg, groups: [])\nTextElement(matcherType: TextMatcher, matcherIndex null, offset: 34, text: . def\\n, groups: [])\nTextElement(matcherType: EmailMatcher, matcherIndex 0, offset: 40, text: john.doe@example.com, groups: [])\nTextElement(matcherType: TextMatcher, matcherIndex null, offset: 60, text:  , groups: [])\nTextElement(matcherType: TelMatcher, matcherIndex 2, offset: 61, text: +1-012-3456-7890, groups: [])\n```\n\nThe regular expression patterns of the preset matchers are not very strict. If one does not\nsuit your use case, overwrite the pattern yourself to make it stricter.\n\n#### parse() vs parseSync()\n\n[parseSync()][parseSync] executes parsing synchronously. If you want\nto keep parsing from blocking the UI in Flutter or pausing other tasks\nin pure Dart, use [parse()][parse] instead.\n\n- `useIsolate: false`\n    - Parsing is scheduled as a microtask.\n- `useIsolate: true` (default)\n    - Parsing is executed in an [isolate][isolate].\n    - On Flutter Web, this is treated the same as `useIsolate: false` since\n      dart:isolate is not supported on the platform.\n\n#### UrlMatcher vs UrlLikeMatcher\n\n[UrlMatcher] does not match URLs that do not start with \"http\" (e.g. `example.com`, `//example.com`,\netc). 
If you want them to be matched too, use [UrlLikeMatcher] instead.\n\n#### matcherType and matcherIndex\n\n`matcherType` contained in a [TextElement] object is the type of the matcher that\nwas used to extract the element. `matcherIndex` is the index of the matcher in\nthe matcher list passed to the `matchers` argument of [TextParser].\n\n#### Extracting only matching text elements\n\nBy default, the result of [parse()][parse] or [parseSync()][parseSync] contains\nall elements, including the ones that have [TextMatcher][TextMatcher] as `matcherType`,\nwhich are the parts of the string that did not match any pattern. If you want\nto exclude them, pass `onlyMatches: true` when calling `parse()` or `parseSync()`.\n\n```dart\nfinal elements = parser.parseSync(text, onlyMatches: true);\nelements.forEach(print);\n```\n\nOutput:\n\n```\nTextElement(matcherType: UrlMatcher, matcherIndex 1, offset: 4, text: https://example.com/sample.jpg, groups: [])\nTextElement(matcherType: EmailMatcher, matcherIndex 0, offset: 40, text: john.doe@example.com, groups: [])\nTextElement(matcherType: TelMatcher, matcherIndex 2, offset: 61, text: +1-012-3456-7890, groups: [])\n```\n\n#### Extracting text elements of a particular matcher type\n\n```dart\nfinal telElements = elements.whereMatcherType\u003cTelMatcher\u003e().toList();\n```\n\nOr use a classic way:\n\n```dart\nfinal telElements = elements.where((elm) =\u003e elm.matcherType == TelMatcher).toList();\n```\n\n#### Conflict between matchers\n\nIf multiple matchers match the string at the same position in text, the one that\ncomes first in the matchers list takes precedence.\n\n```dart\nfinal parser = TextParser(matchers: const [UrlLikeMatcher(), EmailMatcher()]);\nfinal elements = parser.parseSync('foo.bar@example.com');\n```\n\nIn this example, `UrlLikeMatcher` matches `foo.bar` and `EmailMatcher` matches\n`foo.bar@example.com`, but `UrlLikeMatcher` is used because it is written before\n`EmailMatcher` in the matchers list.\n\n### Overwriting the pattern 
of a preset matcher\n\nIf you want to parse a sequence of eleven numbers after \"tel:\" as a phone number:\n\n```dart\nTelMatcher(r'(?\u003c=tel:)\\d{11}')\n```\n\n### Using a custom pattern\n\nYou can create a matcher with a custom pattern either with [PatternMatcher][PatternMatcher]\nor by extending [TextMatcher][TextMatcher].\n\n#### PatternMatcher\n\n```dart\nconst boldMatcher = PatternMatcher(r'\\*\\*(.+?)\\*\\*');\nfinal parser = TextParser(matchers: [boldMatcher]);\n```\n\n#### Custom matcher class\n\nIt is also possible to create a matcher class by extending [TextMatcher][TextMatcher].\n\nBelow is an example of a matcher that parses the HTML `\u003ca\u003e` tags into a set of the href\nvalue and the link text.\n\n```dart\nclass ATagMatcher extends TextMatcher {\n  const ATagMatcher()\n      : super(\n          r'\\\u003ca\\s(?:.+?\\s)*?href=\"(.+?)\".*?\\\u003e'\n          r'\\s*(.+?)\\s*'\n          r'\\\u003c/a\\\u003e',\n        );\n}\n```\n\n```dart\nconst text = '''\n\u003ca class=\"foo\" href=\"https://example.com/\"\u003e\n  Content inside tags\n\u003c/a\u003e\n''';\n\nfinal parser = TextParser(\n  matchers: const [ATagMatcher()],\n  dotAll: true,\n);\nfinal elements = parser.parseSync(text, onlyMatches: true);\nprint(elements.first.groups);\n```\n\nOutput:\n\n```\n[https://example.com/, Content inside tags]\n```\n\n### ExactMatcher\n\n`ExactMatcher` escapes reserved characters of RegExp so that those are used\nas regular characters. The parser extracts the substrings that exactly match\nany of the strings in the list passed as the argument.\n\n```dart\nTextParser(\n  matchers: [\n    // 'e.g.' matches only 'e.g.', not 'edge' nor 'eggs'.\n    ExactMatcher(['e.g.', 'i.e.']),\n  ],\n)\n```\n\n### Groups\n\nEach [TextElement][TextElement] in a parse result has the property of\n[groups][TextElement_groups]. 
It is a list of the strings that matched the sub-patterns\ninside each set of parentheses `( )`.\n\nBelow is an example of a pattern that matches a Markdown style link.\n\n```dart\nr'\\[(.+?)\\]\\((.*?)\\)'\n```\n\nThis pattern has two sets of parentheses: `(.+?)` in `\\[(.+?)\\]` and `(.*?)` in `\\((.*?)\\)`.\nWhen this matches `[foo](bar)`, the first set of parentheses captures \"foo\" and the second\nset captures \"bar\", so `groups` results in `['foo', 'bar']`.\n\nTip:\n\nIf you do not want certain parentheses to be captured as a group, add `?:` after the opening\nparenthesis, like `(?:pattern)` instead of `(pattern)`.\n\n#### Named groups\n\nNamed groups are captured too, but their names are lost in the resulting `groups` list.\nBelow is an example where a single pattern contains both unnamed and\nnamed capturing groups.\n\n```dart\nfinal parser = TextParser(\n  matchers: const [PatternMatcher(r'(?\u003cyear\u003e\\d{4})-(\\d{2})-(?\u003cday\u003e\\d{2})')],\n);\nfinal elements = parser.parseSync('2020-01-23');\nprint(elements.first);\n```\n\nOutput:\n\n```\nTextElement(matcherType: PatternMatcher, matcherIndex 0, offset: 0, text: 2020-01-23, groups: [2020, 01, 23])\n```\n\n### RegExp options\n\nHow a regular expression is treated can be configured in the `TextParser` constructor.\n\n- multiLine\n- caseSensitive\n- unicode\n- dotAll\n\nThese options are passed to the constructor of [RegExp][RegExp] internally, so\nrefer to its [documentation][RegExp_constructor] for details.\n\n[TextParser]: https://pub.dev/documentation/text_parser/latest/text_parser/TextParser-class.html\n[TextParser_matchers]: https://pub.dev/documentation/text_parser/latest/text_parser/TextParser/matchers.html\n[TextMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/TextMatcher-class.html\n[UrlMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/UrlMatcher-class.html\n[UrlLikeMatcher]: 
https://pub.dev/documentation/text_parser/latest/text_parser/UrlLikeMatcher-class.html\n[EmailMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/EmailMatcher-class.html\n[TelMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/TelMatcher-class.html\n[ExactMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/ExactMatcher-class.html\n[PatternMatcher]: https://pub.dev/documentation/text_parser/latest/text_parser/PatternMatcher-class.html\n[parse]: https://pub.dev/documentation/text_parser/latest/text_parser/TextParser/parse.html\n[parseSync]: https://pub.dev/documentation/text_parser/latest/text_parser/TextParser/parseSync.html\n[TextElement]: https://pub.dev/documentation/text_parser/latest/text_parser/TextElement-class.html\n[TextElement_groups]: https://pub.dev/documentation/text_parser/latest/text_parser/TextElement/groups.html\n[isolate]: https://api.dartlang.org/stable/dart-isolate/dart-isolate-library.html\n[RegExp]: https://api.dart.dev/stable/dart-core/RegExp-class.html\n[RegExp_constructor]: https://api.dart.dev/stable/dart-core/RegExp/RegExp.html","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaboc%2Fdart_text_parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkaboc%2Fdart_text_parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaboc%2Fdart_text_parser/lists"}