{"id":44171042,"url":"https://github.com/modelcloud/pypcre","last_synced_at":"2026-04-13T09:01:19.136Z","repository":{"id":318626058,"uuid":"1072000229","full_name":"ModelCloud/PyPcre","owner":"ModelCloud","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-13T07:32:01.000Z","size":422,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-13T08:25:42.654Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ModelCloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-08T05:54:27.000Z","updated_at":"2026-04-13T07:32:06.000Z","dependencies_parsed_at":"2025-10-08T10:14:05.534Z","dependency_job_id":"171cce4c-d98e-46e2-aede-1e248e6f6747","html_url":"https://github.com/ModelCloud/PyPcre","commit_stats":null,"previous_names":["modelcloud/python-pcre2","modelcloud/pypcre","modelcloud/pcre"],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/ModelCloud/PyPcre","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelCloud%2FPyPcre","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelCloud%2FPyPcre/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelCloud%2FPyPcre/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/M
odelCloud%2FPyPcre/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ModelCloud","download_url":"https://codeload.github.com/ModelCloud/PyPcre/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ModelCloud%2FPyPcre/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31746113,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T06:26:45.479Z","status":"ssl_error","status_checked_at":"2026-04-13T06:26:44.645Z","response_time":93,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-09T10:16:05.265Z","updated_at":"2026-04-13T09:01:19.130Z","avatar_url":"https://github.com/ModelCloud.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!--\n# SPDX-FileCopyrightText: 2025 ModelCloud.ai\n# SPDX-FileCopyrightText: 2025 qubitium@modelcloud.ai\n# SPDX-License-Identifier: Apache-2.0\n# Contact: qubitium@modelcloud.ai, x.com/qubitium\n--\u003e\n\u003cdiv align=center\u003e\n\u003cimg width=\"500\" alt=\"image\" src=\"https://github.com/user-attachments/assets/92964c3a-f82e-4949-bd27-278f57c62d9f\" /\u003e\n\u003c/div\u003e\n\u003ch1 align=\"center\"\u003ePyPcre (Python PCRE2 Binding) 🧬\u003c/h1\u003e\n\n\u003cp align=center\u003e\nFast, free-threaded Python bindings for `PCRE2` with a stable `stdlib.re`-compatible API. 
⚡\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/ModelCloud/PyPcre/releases\" style=\"text-decoration:none;\"\u003e\u003cimg alt=\"GitHub release\" src=\"https://img.shields.io/github/release/ModelCloud/Pcre.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pypi.org/project/PyPcre/\" style=\"text-decoration:none;\"\u003e\u003cimg alt=\"PyPI - Version\" src=\"https://img.shields.io/pypi/v/PyPcre\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://pepy.tech/projects/PyPcre\" style=\"text-decoration:none;\"\u003e\u003cimg src=\"https://static.pepy.tech/badge/PyPcre\" alt=\"PyPI Downloads\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/ModelCloud/PyPcre/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/pypi/l/PyPcre\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://huggingface.co/modelcloud/\"\u003e\u003cimg src=\"https://img.shields.io/badge/🤗%20Hugging%20Face-ModelCloud-%23ff8811.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\n\n## Latest News 🚀\n* 04/13/2026 [0.3.0](https://github.com/ModelCloud/PyPcre/releases/tag/v0.3.0): Lower-overhead public `Match` objects, faster hot-path `match()` / `search()` / `fullmatch()` / `findall()`, and tighter free-threaded execution. ⚡\n* 03/22/2026 [0.2.15](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.15): Python 3.15 `re` compatibility (`prefixmatch`, `NOFLAG`) ✅\n* 03/21/2026 [0.2.14](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.14): Python 3.14 compatibility 🐍\n* 03/02/2026 [0.2.11](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.11): Auto-detect `Visual Studio` in Windows environments during install and compile. 🪟\n* 02/24/2026 [0.2.10](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.10): Allow a `Visual Studio` (VS) compiler version check override via an environment variable. 
🧰\n* 12/15/2025 [0.2.8](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.8): Fixed multi-arch Linux OS compatibility when both x86_64 and i386 `pcre2` libraries are installed. 🐧\n* 10/20/2025 [0.2.4](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.4): Removed the dependency on a system `python3-dev` package. `Python.h` will be downloaded optimistically from python.org when needed. 📦\n* 10/12/2025 [0.2.3](https://github.com/ModelCloud/PyPcre/releases/tag/v0.2.3): 🤗 Full `GIL=0` compliance for Python \u003e= 3.13T. Reduced cache thread contention. Improved performance across all APIs. Expanded CI test coverage. FreeBSD, Solaris, and Windows compatibility validated.\n* 10/09/2025 [0.1.0](https://github.com/ModelCloud/PyPcre/releases/tag/v0.1.0): 🎉 First release. Thread-safe, with auto JIT, auto pattern caching, and optimistic linking to the system library for fast installs.\n\n## Why PyPcre ⚡\n\nPyPcre pairs Python's familiar `re`-compatible API with the real `PCRE2` engine. You keep the ergonomics of the standard library while gaining a more capable regex engine, optional JIT, explicit threading support, and a binding designed and tested for free-threaded Python. 
🧠⚡\n\n### Big Wins 🏆\n\n- 🧬 **Full power of PCRE2**: PyPcre uses the real `PCRE2` engine, so you get native compile options, semantics, JIT, and upstream tuning.\n- 🔥 **More expressive regex syntax**: `PCRE2` supports constructs beyond stdlib `re`, including atomic groups `(?\u003e...)`, possessive quantifiers `++`, branch-reset groups `(?|...)`, richer lookarounds, and backtracking control verbs like `(*SKIP)(*FAIL)`.\n- 🧵 **Thread-safe into `nogil`**: PyPcre is built for `PYTHON_GIL=0`, with CI coverage, lock-aware caches, reusable match/JIT resources, and `parallel_map()` for multi-subject fan-out.\n- ⚡ **Fast on real workloads**: `PCRE2` JIT plus cached compiled patterns lets PyPcre match or beat `re` and `regex` on many common scans, especially multiline searches, lookaround-heavy patterns, and free-threaded execution.\n- 🛡️ **Safer operational story**: PyPcre prefers the system `libpcre2-8` shared library so normal OS package updates can bring security and bug-fix benefits without a bundled fork.\n- ✅ **Validated thoroughly**: the project runs API tests, fuzz tests, memory-safety checks, local `valgrind` leak checks, and `massif` heap profiles. 
Recent local profiling found `0` definite leaks and `0` possible leaks in both the public API and raw binding paths.\n\n### Quick Comparison 🥊\n\n| Area | PyPcre | `stdlib.re` | `regex` |\n| --- | --- | --- | --- |\n| Engine | Full `PCRE2` ✅ | CPython stdlib engine | Separate engine, not `PCRE2` |\n| `PCRE2` syntax and flags | Full access ✅ | No | No |\n| Syntax power | Very rich ✅ | More limited | Rich, but different from `PCRE2` |\n| JIT execution | `PCRE2` JIT ✅ | No | No |\n| `re`-compatible API surface | Stable and familiar ✅ | Native | Similar, but not the main goal |\n| Free-threaded support | Built and tested for `PYTHON_GIL=0` ✅ | No explicit PyPcre-style layer | Not a project focus here |\n| Built-in threaded subject fan-out | `parallel_map()` ✅ | No | No |\n| System library updates | Uses system `libpcre2-8` by default ✅ | N/A | N/A |\n\n### Benchmark Highlights 🏁\n\nMeasured on a `Python 3.14.3` free-threaded build on x86_64 Linux with compiled-pattern reuse. Times are best-of-5; lower is better.\n\n| Workload | Operation | PyPcre | `re` | `regex` | PyPcre edge |\n| --- | --- | ---: | ---: | ---: | --- |\n| First `ERROR` line in a multiline log buffer | `search` | `3.68 ms` | `51.72 ms` | `5.67 ms` | `14.0x` vs `re`, `1.54x` vs `regex` |\n| Extract only `WARN` / `ERROR` lines | `findall` | `6.41 ms` | `91.84 ms` | `91.14 ms` | `14.3x` vs `re`, `14.2x` vs `regex` |\n| Per-line full-name extraction | `findall` | `22.28 ms` | `172.38 ms` | `218.29 ms` | `7.74x` vs `re`, `9.80x` vs `regex` |\n| Lookbehind + negative-lookahead extraction | `findall` | `50.23 ms` | `53.35 ms` | `57.03 ms` | `1.06x` vs `re`, `1.14x` vs `regex` |\n| UUID extraction | `findall` | `77.49 ms` | `83.19 ms` | `134.87 ms` | `1.07x` vs `re`, `1.74x` vs `regex` |\n| Boundary-aware token scan | `findall` | `127.76 ms` | `128.03 ms` | `146.37 ms` | effectively tied with `re`, `1.15x` vs `regex` |\n\n### Free-Threaded Benchmark Highlights 🧵\n\nMeasured in the same environment with `8` 
threads sharing one compiled pattern. Times are best-of-3; lower is better.\n\n| Workload | Threads | PyPcre | `re` | `regex` | PyPcre edge |\n| --- | ---: | ---: | ---: | ---: | --- |\n| First `ERROR` line in a multiline log buffer | `8` | `25.34 ms` | `38.83 ms` | `40.34 ms` | `1.53x` vs `re`, `1.59x` vs `regex` |\n| Extract only `WARN` / `ERROR` lines | `8` | `28.58 ms` | `65.54 ms` | `73.55 ms` | `2.29x` vs `re`, `2.57x` vs `regex` |\n| Per-line full-name extraction | `8` | `31.68 ms` | `123.44 ms` | `164.80 ms` | `3.90x` vs `re`, `5.20x` vs `regex` |\n\nPyPcre is the stronger all-around choice when you want more than the baseline: full `PCRE2` features, more expressive syntax, JIT, explicit free-threaded support, and a stable `re`-compatible API surface. It keeps Python ergonomics while giving you a substantially more capable engine. 🚀\n\n## Installation 📦\n\n```bash\npip install PyPcre\n```\n\nBy default, the package links against the system `libpcre2-8` shared library for fast installs and to inherit OS security updates. 
See [Building](#building) for manual build details.\n\n## Platform Support (Validated) ✅\n\n`Linux`, `macOS`, `Windows`, `WSL`, `FreeBSD`\n\n\n## Usage 🛠️\n\nIf you already use the standard library `re`, migration is often just an import swap:\n\n```python\nimport pcre as re\n```\n\nThe high-level API stays close to the standard library, so most existing `re` code can move over with little or no rewriting.\n\n### Quick start 🚀\n\n```python\nfrom pcre import compile, findall, match, search, Flag\n\nif match(r\"(?P\u003cword\u003e\\w+)\", \"hello world\"):\n    print(\"found word\")\n\npattern = compile(rb\"\\d+\", flags=Flag.MULTILINE)\nnumbers = pattern.findall(b\"line 1\\nline 22\")\n```\n\n### API Overview 🧭\n\n- Module helpers: `prefixmatch`, `match`, `search`, `fullmatch`, `finditer`,\n  `findall`, `split`, `sub`, `subn`, `compile`, `escape`, `purge`, and\n  `parallel_map`.\n- `compile()` returns a `Pattern` object with the familiar matching helpers\n  plus `split()`, `sub()`, and `subn()`.\n- `Pattern` exposes `.pattern`, `.flags`, `.jit`, `.groupindex`, and `.groups`\n  for introspection.\n- `Match` objects expose the usual `group()`, `groups()`, `groupdict()`,\n  `start()`, `end()`, `span()`, and `expand()` methods, along with `.re`,\n  `.string`, `.pos`, `.endpos`, `.lastindex`, `.lastgroup`, and `.regs`.\n- Flags are available through `pcre.Flag` and familiar aliases such as\n  `IGNORECASE`, `MULTILINE`, `DOTALL`, `VERBOSE`, `ASCII`, `UNICODE`, and\n  `NOFLAG`.\n- Errors are raised as `pcre.PcreError`; `error` and `PatternError` are kept as\n  compatibility aliases.\n\n### Common examples 🧪\n\nCompiled patterns:\n\n```python\nfrom pcre import compile, Flag\n\npattern = compile(r\"user:\\s*(?P\u003cname\u003e[a-z]+)\", flags=Flag.CASELESS)\nmatch = pattern.search(\"User: alice\")\nprint(match.group(\"name\"))  # alice\n```\n\nSubstitution:\n\n```python\nfrom pcre import sub\n\nresult = sub(r\"\\d+\", \"#\", \"room 101\")\nprint(result)  # room 
#\n```\n\nBytes:\n\n```python\nfrom pcre import compile\n\npattern = compile(br\"\\w+\")\nprint(pattern.findall(b\"ab cd\"))  # [b'ab', b'cd']\n```\n\n### Stdlib `re` compatibility 🔁\n\n- Module-level helpers and the `Pattern` class follow the same call shapes as\n  the standard library `re` module, including `pos`, `endpos`, and `flags`\n  behavior.\n- Python 3.15's `prefixmatch()` alias is available at both the module level\n  and on compiled `Pattern` objects, and `re.NOFLAG` is re-exported as the\n  zero-value compatibility alias.\n- `Pattern` mirrors `re.Pattern` attributes like `.pattern`, `.groupindex`,\n  and `.groups`, while `Match` objects surface the familiar `.re`, `.string`,\n  `.pos`, `.endpos`, `.lastindex`, `.lastgroup`, `.regs`, and `.expand()` API.\n- Substitution helpers enforce the same type rules as the standard library\n  `re` module: string patterns require string replacements, byte patterns\n  require bytes-like replacements, and callable replacements receive the\n  wrapped `Match`.\n- `compile()` accepts native `Flag` values as well as compatible\n  `re.RegexFlag` members from the standard library. Supported stdlib flags\n  map 1:1 to PCRE2 options (`IGNORECASE→CASELESS`, `MULTILINE→MULTILINE`,\n  `DOTALL→DOTALL`, `VERBOSE→EXTENDED`); passing unsupported stdlib flags\n  raises a compatibility `ValueError` to prevent silent divergences.\n- `pcre.escape()` delegates directly to `re.escape` for byte and text\n  patterns so escaping semantics remain identical.\n- String patterns enable Unicode behavior by default. Byte patterns do not.\n\n### `regex` package compatibility 🔄\n\nThe [`regex`](https://pypi.org/project/regex/) package interprets\n`\\uXXXX` and `\\UXXXXXXXX` escapes as Unicode code points, while PCRE2 expects\nhexadecimal escapes to use the `\\x{...}` form. 
Enable `Flag.COMPAT_UNICODE_ESCAPE` to\ntranslate those escapes automatically when compiling patterns:\n\n```python\nfrom pcre import compile, Flag\n\npattern = compile(r\"\\\\U0001F600\", flags=Flag.COMPAT_UNICODE_ESCAPE)\nassert pattern.pattern == r\"\\\\x{0001F600}\"\n```\n\nSet the default behavior globally with `pcre.configure(compat_regex=True)`\nso that subsequent calls to `compile()` and the module-level helpers apply\nthe conversion without repeating the flag.\n\n### Common issues ⚠️\n\n- Unsupported stdlib flags such as `re.DEBUG`, `re.LOCALE`, and `re.ASCII`\n  raise `ValueError`. If you want ASCII-style behavior, use `pcre.ASCII` or\n  `Flag.NO_UTF | Flag.NO_UCP`.\n- Replacement types must match the subject type: text patterns use `str`\n  replacements, while byte patterns use bytes-like replacements.\n- If you are porting patterns from the third-party `regex` package, check\n  `\\u` and `\\U` escapes first. That is the most common compatibility gap.\n- Most users do not need to tune caching, JIT, or threading. The defaults are\n  intended to work well out of the box.\n\n### Optional runtime controls 🎛️\n\n- `pcre.configure(jit=False)` disables JIT globally. `Flag.JIT` and\n  `Flag.NO_JIT` let you override that per pattern.\n- `pcre.set_cache_limit()`, `pcre.get_cache_limit()`, and `pcre.clear_cache()`\n  control the high-level compile cache.\n- `pcre.configure_threads()`, `pcre.configure_thread_pool()`,\n  `shutdown_thread_pool()`, `Flag.THREADS`, and `Flag.NO_THREADS` are available\n  if you want to opt into or restrict threaded execution.\n\n## Building 🏗️\n\nThe extension links against an existing `libpcre2-8` installation. 
Install the development headers for your platform before building,\nfor example `apt install libpcre2-dev` on Debian/Ubuntu, `dnf install pcre2-devel`\non Fedora/RHEL derivatives, or `brew install pcre2` on macOS.\n\nIf the headers or library live in a non-standard location, you can export one\nor more of the following environment variables prior to invoking the build\n(`pip install .`, `python -m build`, etc.):\n\n- `PYPCRE_ROOT`\n- `PYPCRE_INCLUDE_DIR`\n- `PYPCRE_LIBRARY_DIR`\n- `PYPCRE_LIBRARY_PATH` *(pathsep-separated directories or explicit library files to\n  prioritize when resolving `libpcre2-8`)*\n- `PYPCRE_LIBRARIES`\n- `PYPCRE_CFLAGS`\n- `PYPCRE_LDFLAGS`\n\nIf you would rather force a source build, set `PYPCRE_BUILD_FROM_SOURCE=1`\nbefore installing.\n\nWhen `pkg-config` is available, the build automatically picks up the\nrequired include and link flags via `pkg-config --cflags/--libs libpcre2-8`.\nWithout `pkg-config`, the build script scans common installation prefixes for\nLinux distributions (Debian, Ubuntu, Fedora/RHEL/CentOS, openSUSE, Alpine),\nFreeBSD, and macOS (including Homebrew) to locate the headers and\nlibraries.\n\nIf your system ships `libpcre2-8` under `/usr` but you also maintain a\nmanually built copy under `/usr/local`, export `PYPCRE_LIBRARY_PATH` (and, if\nneeded, a matching `PYPCRE_INCLUDE_DIR`) so the build links against the desired\nlocation.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodelcloud%2Fpypcre","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmodelcloud%2Fpypcre","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodelcloud%2Fpypcre/lists"}