{"id":19204394,"url":"https://github.com/vxern/robots_txt","last_synced_at":"2026-01-30T02:44:46.901Z","repository":{"id":56838293,"uuid":"398784878","full_name":"vxern/robots_txt","owner":"vxern","description":"⚙️ A quality `robots.txt` ruleset parser to ensure your application follows the standard specification for the file.","archived":false,"fork":false,"pushed_at":"2024-12-02T18:52:08.000Z","size":48,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-04-03T12:04:33.664Z","etag":null,"topics":["complete","dart","documented","fast","parser","robots","robots-txt","robots-txt-parser","robotstxt","simple","tiny"],"latest_commit_sha":null,"homepage":"https://pub.dev/packages/robots_txt","language":"Dart","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vxern.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-22T11:39:09.000Z","updated_at":"2024-12-02T18:52:12.000Z","dependencies_parsed_at":"2023-02-08T13:16:57.345Z","dependency_job_id":"b8b7f633-5b5e-446f-9581-9fb4148830c5","html_url":"https://github.com/vxern/robots_txt","commit_stats":null,"previous_names":["vxern/robots_txt","wordcollector/robots_txt"],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxern%2Frobots_txt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxern%2Frobots_txt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxern%2Frobots_tx
t/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vxern%2Frobots_txt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vxern","download_url":"https://codeload.github.com/vxern/robots_txt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248225658,"owners_count":21068078,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["complete","dart","documented","fast","parser","robots","robots-txt","robots-txt-parser","robotstxt","simple","tiny"],"created_at":"2024-11-09T13:07:42.765Z","updated_at":"2026-01-30T02:44:46.843Z","avatar_url":"https://github.com/vxern.png","language":"Dart","funding_links":[],"categories":[],"sub_categories":[],"readme":"## A complete, dependency-less and fully documented `robots.txt` ruleset parser.\n\n### Usage\n\nYou can obtain the robot exclusion rulesets for a particular website as follows:\n\n```dart\n// Get the contents of the `robots.txt` file.\nfinal contents = /* Your method of obtaining the contents of a `robots.txt` file. */;\n// Parse the contents.\nfinal robots = Robots.parse(contents);\n```\n\nNow that you have parsed the `robots.txt` file, you can perform checks to\nestablish whether or not a user-agent is allowed to visit a particular path:\n\n```dart\nfinal userAgent = /* Your user-agent. 
*/;\nprint(robots.verifyCanAccess('/gist/', userAgent: userAgent)); // False\nprint(robots.verifyCanAccess('/government/robots_txt/', userAgent: userAgent)); // True\n```\n\nIf you are only concerned about directives pertaining to your own user-agent,\nyou may instruct the parser to ignore other user-agents as follows:\n\n```dart\n// Parse the contents, disregarding user-agents other than 'government'.\nfinal robots = Robots.parse(contents, onlyApplicableTo: const {'government'});\n```\n\nThe `Robots.parse()` function does not have any built-in structure validation.\nIt will not throw exceptions, and will fail silently wherever appropriate. If\nthe file contents passed into it are not a valid `robots.txt` file, there is no\nguarantee that it will produce useful data or disallow a bot wherever\npossible.\n\nIf you wish to ensure before parsing that a particular file is valid, use the\n`Robots.validate()` function. Unlike `Robots.parse()`, this one **will throw** a\n`FormatException` if the file is not valid:\n\n```dart\n// Validating an invalid file will throw a `FormatException`.\ntry {\n  Robots.validate('This is an obviously invalid robots.txt file.');\n} on FormatException {\n  print('As expected, this file is flagged as invalid.');\n}\n\n// Validating a valid file will not throw anything.\ntry {\n  Robots.validate('''\nUser-agent: *\nCrawl-delay: 10\nDisallow: /\nAllow: /file.txt\n\nHost: https://hosting.example.com/\nSitemap: https://example.com/sitemap.xml\n''');\n  print('As expected also, this file is not flagged as invalid.');\n} on FormatException {\n  // Code to handle an invalid file.\n}\n```\n\nBy default, the validator will only accept the following fields:\n\n- User-agent\n- Allow\n- Disallow\n- Sitemap\n- Crawl-delay\n- Host\n\nIf you want to accept files that feature any other fields, you will have to\nspecify them as follows:\n\n```dart\ntry {\n  Robots.validate(\n    '''\nUser-agent: *\nCustom-field: value\n''',\n    allowedFieldNames: 
{'Custom-field'},\n  );\n} on FormatException {\n  // Code to handle an invalid file.\n}\n```\n\nBy default, the parser treats the `Allow` field as having precedence.\nThis is the standard approach to both writing and reading `robots.txt` files.\nHowever, you can instruct the parser to follow another approach:\n\n```dart\nrobots.verifyCanAccess(\n  '/path',\n  userAgent: userAgent,\n  typePrecedence: RuleTypePrecedence.disallow,\n);\n```\n\nSimilarly, fields defined **later** in the file are considered to have\nprecedence too. This is also the standard approach. You can instruct\nthe parser to rule otherwise:\n\n```dart\nrobots.verifyCanAccess(\n  '/path',\n  userAgent: userAgent,\n  comparisonMethod: PrecedenceStrategy.lowerTakesPrecedence,\n);\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvxern%2Frobots_txt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvxern%2Frobots_txt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvxern%2Frobots_txt/lists"}