{"id":18682858,"url":"https://github.com/yandex/pire","last_synced_at":"2025-04-06T16:13:08.297Z","repository":{"id":1117731,"uuid":"988682","full_name":"yandex/pire","owner":"yandex","description":"Perl Incompatible Regular Expressions library","archived":false,"fork":false,"pushed_at":"2020-09-08T21:23:28.000Z","size":756,"stargazers_count":333,"open_issues_count":19,"forks_count":30,"subscribers_count":22,"default_branch":"master","last_synced_at":"2025-03-30T15:08:00.882Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://github.com/dprokoptsev/pire/wiki","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yandex.png","metadata":{"files":{"readme":"README","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2010-10-15T00:37:08.000Z","updated_at":"2025-03-16T07:51:01.000Z","dependencies_parsed_at":"2022-08-16T12:05:17.084Z","dependency_job_id":null,"html_url":"https://github.com/yandex/pire","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex%2Fpire","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex%2Fpire/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex%2Fpire/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yandex%2Fpire/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yandex","download_url":"https://codeload.github.com/yandex/pire/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247509236,"owners_count":20950232,"
icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T10:13:01.445Z","updated_at":"2025-04-06T16:13:08.259Z","avatar_url":"https://github.com/yandex.png","language":"C++","readme":"This is PIRE, the Perl Incompatible Regular Expressions library.\n\nThis library is designed for checking a huge amount of text against\nrelatively many regular expressions. Roughly speaking, it can only\ncheck whether a given text matches a certain regexp, but it does so\nreally fast (more than 400 MB/s on our hardware is common). Moreover,\nmultiple regexps can be combined, making it possible to\ncheck the text against approximately 10 regexps in a single pass (while maintaining\nthe same speed).\n\nSince Pire examines each character only once, without any lookaheads\nor rollbacks, spending about five machine instructions per character,\nit can be used even in realtime tasks.\n\nOn the other hand, Pire has very limited functionality (compared to\nother regexp libraries). 
Pire does not support Perl-style conditional\nregexps, lookaheads \u0026 backtracking, or greedy/nongreedy matches; nor\ndoes it have any capturing facilities.\n\nPire was developed at Yandex (http://company.yandex.ru/) as a part of its\nweb crawler.\n\nMore information can be found in README.ru (in Russian), which is\nyet to be translated.\n\nPlease report bugs to dprokoptsev@yandex-team.ru or davenger@yandex-team.ru.\n\nQuick Start\n=============\n\n#include \u003cstdio.h\u003e\n#include \u003cstring.h\u003e\n#include \u003citerator\u003e\n#include \u003cvector\u003e\n#include \u003cpire/pire.h\u003e\n\nPire::NonrelocScanner CompileRegexp(const char* pattern)\n{\n\t// Transform the pattern from UTF-8 into UCS4\n\tstd::vector\u003cPire::wchar32\u003e ucs4;\n\tPire::Encodings::Utf8().FromLocal(pattern, pattern + strlen(pattern), std::back_inserter(ucs4));\n\n\treturn Pire::Lexer(ucs4.begin(), ucs4.end())\n\t\t.AddFeature(Pire::Features::CaseInsensitive())\t// enable case insensitivity\n\t\t.SetEncoding(Pire::Encodings::Utf8())\t\t// set input text encoding\n\t\t.Parse() \t\t\t\t\t// create an FSM \n\t\t.Surround()\t\t\t\t\t// PCRE_ANCHORED behavior\n\t\t.Compile\u003cPire::NonrelocScanner\u003e();\t\t// compile the FSM\n}\n\nbool Matches(const Pire::NonrelocScanner\u0026 scanner, const char* ptr, size_t len)\n{\n\treturn Pire::Runner(scanner)\n\t\t.Begin()\t// '^'\n\t\t.Run(ptr, len)\t// the text \n\t\t.End();\t\t// '$'\n\t\t// implicitly cast to bool\n}\n\nint main()\n{\n\tchar re[] = \"hello\\\\s+w.+d$\";\n\tchar str[] = \"Hello world\";\n\n\tPire::NonrelocScanner sc = CompileRegexp(re);\n\n\tbool res = Matches(sc, str, strlen(str));\n\n\tprintf(\"String \\\"%s\\\" %s \\\"%s\\\"\\n\", str, (res ? 
\"matches\" : \"doesn't match\"), re);\n\t\t\n\treturn 0;\n}\n\n","funding_links":[],"categories":["Regular Expression","正则表达式"],"sub_categories":["物理学"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyandex%2Fpire","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyandex%2Fpire","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyandex%2Fpire/lists"}