{"id":13511457,"url":"https://github.com/PCRE2Project/pcre2","last_synced_at":"2025-03-30T20:33:13.333Z","repository":{"id":40247607,"uuid":"398251321","full_name":"PCRE2Project/pcre2","owner":"PCRE2Project","description":"PCRE2 development is now based here.","archived":false,"fork":false,"pushed_at":"2024-10-18T08:23:47.000Z","size":14052,"stargazers_count":899,"open_issues_count":49,"forks_count":189,"subscribers_count":38,"default_branch":"master","last_synced_at":"2024-10-18T09:46:08.669Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PCRE2Project.png","metadata":{"files":{"readme":"README","changelog":"ChangeLog","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-20T11:18:29.000Z","updated_at":"2024-10-18T08:23:52.000Z","dependencies_parsed_at":"2023-02-19T10:16:00.900Z","dependency_job_id":"4daafd02-4767-4add-8ef7-299e524ac724","html_url":"https://github.com/PCRE2Project/pcre2","commit_stats":null,"previous_names":[],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PCRE2Project%2Fpcre2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PCRE2Project%2Fpcre2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PCRE2Project%2Fpcre2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PCRE2Project%2Fpcre2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PCRE2Project","download_url":"https://codeload.github.c
om/PCRE2Project/pcre2/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246379379,"owners_count":20767694,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T03:00:50.451Z","updated_at":"2025-03-30T20:33:13.321Z","avatar_url":"https://github.com/PCRE2Project.png","language":"C","funding_links":[],"categories":["C","others","Regular Expression","Regex engines"],"sub_categories":["Source code"],"readme":"README file for PCRE2 (Perl-compatible regular expression library)\n==================================================================\n\nPCRE2 is a re-working of the original PCRE1 library to provide an entirely new\nAPI. Since its initial release in 2015, there has been further development of\nthe code and it now differs from PCRE1 in more than just the API. There are new\nfeatures, and the internals have been improved. The original PCRE1 library is\nnow obsolete and no longer maintained. The latest release of PCRE2 is available\nin .tar.gz, tar.bz2, or .zip form from this GitHub repository:\n\nhttps://github.com/PCRE2Project/pcre2/releases\n\nThere is a mailing list for discussion about the development of PCRE2 at\npcre2-dev@googlegroups.com. You can subscribe by sending an email to\npcre2-dev+subscribe@googlegroups.com.\n\nYou can access the archives and also subscribe or manage your subscription\nhere:\n\nhttps://groups.google.com/g/pcre2-dev\n\nPlease read the NEWS file if you are upgrading from a previous release. 
The\ncontents of this README file are:\n\n  The PCRE2 APIs\n  Documentation for PCRE2\n  Building PCRE2 on non-Unix-like systems\n  Building PCRE2 without using autotools\n  Building PCRE2 using autotools\n  Retrieving configuration information\n  Shared libraries\n  Cross-compiling using autotools\n  Making new tarballs\n  Testing PCRE2\n  Character tables\n  File manifest\n\n\nThe PCRE2 APIs\n--------------\n\nPCRE2 is written in C, and it has its own API. There are three sets of\nfunctions, one for the 8-bit library, which processes strings of bytes, one for\nthe 16-bit library, which processes strings of 16-bit values, and one for the\n32-bit library, which processes strings of 32-bit values. Unlike PCRE1, there\nare no C++ wrappers.\n\nThe distribution does contain a set of C wrapper functions for the 8-bit\nlibrary that are based on the POSIX regular expression API (see the pcre2posix\nman page). These are built into a library called libpcre2-posix. Note that this\njust provides a POSIX calling interface to PCRE2; the regular expressions\nthemselves still follow Perl syntax and semantics. The POSIX API is restricted,\nand does not give full access to all of PCRE2's facilities.\n\nThe header file for the POSIX-style functions is called pcre2posix.h. The\nofficial POSIX name is regex.h, but I did not want to risk possible problems\nwith existing files of that name by distributing it that way. To use PCRE2 with\nan existing program that uses the POSIX API, pcre2posix.h will have to be\nrenamed or pointed at by a link (or the program modified, of course). See the\npcre2posix documentation for more details.\n\n\nDocumentation for PCRE2\n-----------------------\n\nIf you install PCRE2 in the normal way on a Unix-like system, you will end up\nwith a set of man pages whose names all start with \"pcre2\". The one that is\njust called \"pcre2\" lists all the others. In addition to these man pages, the\nPCRE2 documentation is supplied in two other forms:\n\n  1. 
There are files called doc/pcre2.txt, doc/pcre2grep.txt, and\n     doc/pcre2test.txt in the source distribution. The first of these is a\n     concatenation of the text forms of all the section 3 man pages except the\n     listing of pcre2demo.c and those that summarize individual functions. The\n     other two are the text forms of the section 1 man pages for the pcre2grep\n     and pcre2test commands. These text forms are provided for ease of scanning\n     with text editors or similar tools. They are installed in\n     \u003cprefix\u003e/share/doc/pcre2, where \u003cprefix\u003e is the installation prefix\n     (defaulting to /usr/local).\n\n  2. A set of files containing all the documentation in HTML form, hyperlinked\n     in various ways, and rooted in a file called index.html, is distributed in\n     doc/html and installed in \u003cprefix\u003e/share/doc/pcre2/html.\n\n\nBuilding PCRE2 on non-Unix-like systems\n---------------------------------------\n\nFor a non-Unix-like system, please read the file NON-AUTOTOOLS-BUILD, though if\nyour system supports the use of \"configure\" and \"make\" you may be able to build\nPCRE2 using autotools in the same way as for many Unix-like systems.\n\nPCRE2 can also be configured using CMake, which can be run in various ways\n(command line, GUI, etc). This creates Makefiles, solution files, etc. The file\nNON-AUTOTOOLS-BUILD has information about CMake.\n\nPCRE2 has been compiled on many different operating systems. It should be\nstraightforward to build PCRE2 on any system that has a Standard C compiler and\nlibrary, because it uses only Standard C functions.\n\n\nBuilding PCRE2 without using autotools\n--------------------------------------\n\nThe use of autotools (in particular, libtool) is problematic in some\nenvironments, even some that are Unix or Unix-like. 
See the NON-AUTOTOOLS-BUILD\nfile for ways of building PCRE2 without using autotools.\n\n\nBuilding PCRE2 using autotools\n------------------------------\n\nThe following instructions assume the use of the widely used \"configure; make;\nmake install\" (autotools) process.\n\nIf you have downloaded and unpacked a PCRE2 release tarball, run the\n\"configure\" command from the PCRE2 directory, with your current directory set\nto the directory where you want the files to be created. This command is a\nstandard GNU \"autoconf\" configuration script, for which generic instructions\nare supplied in the file INSTALL.\n\nThe files in the GitHub repository do not contain \"configure\". If you have\ndownloaded the PCRE2 source files from GitHub, before you can run \"configure\"\nyou must run the shell script called autogen.sh. This runs a number of\nautotools to create a \"configure\" script (you must of course have the autotools\ncommands installed in order to do this).\n\nMost commonly, people build PCRE2 within its own distribution directory, and in\nthis case, on many systems, just running \"./configure\" is sufficient. However,\nthe usual methods of changing standard defaults are available. For example:\n\nCFLAGS='-O2 -Wall' ./configure --prefix=/opt/local\n\nThis command specifies that the C compiler should be run with the flags '-O2\n-Wall' instead of the default, and that \"make install\" should install PCRE2\nunder /opt/local instead of the default /usr/local.\n\nIf you want to build in a different directory, just run \"configure\" with that\ndirectory as current. For example, suppose you have unpacked the PCRE2 source\ninto /source/pcre2/pcre2-xxx, but you want to build it in\n/build/pcre2/pcre2-xxx:\n\ncd /build/pcre2/pcre2-xxx\n/source/pcre2/pcre2-xxx/configure\n\nPCRE2 is written in C and is normally compiled as a C library. 
However, it is\npossible to build it as a C++ library, though the provided building apparatus\ndoes not have any features to support this.\n\nThere are some optional features that can be included or omitted from the PCRE2\nlibrary. They are also documented in the pcre2build man page.\n\n. By default, both shared and static libraries are built. You can change this\n  by adding one of these options to the \"configure\" command:\n\n  --disable-shared\n  --disable-static\n\n  Setting --disable-shared ensures that PCRE2 libraries are built as static\n  libraries. The binaries that are then created as part of the build process\n  (for example, pcre2test and pcre2grep) are linked statically with one or more\n  PCRE2 libraries, but may also be dynamically linked with other libraries such\n  as libc. If you want these binaries to be fully statically linked, you can\n  set LDFLAGS like this:\n\n  LDFLAGS=--static ./configure --disable-shared\n\n  Note the two hyphens in --static. Of course, this works only if static\n  versions of all the relevant libraries are available for linking. See also\n  \"Shared libraries\" below.\n\n  Shared libraries are compiled with symbol versioning enabled on platforms that\n  support this, but this can be disabled by adding --disable-symvers.\n\n. By default, only the 8-bit library is built. If you add --enable-pcre2-16 to\n  the \"configure\" command, the 16-bit library is also built. If you add\n  --enable-pcre2-32 to the \"configure\" command, the 32-bit library is also\n  built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8\n  to disable building the 8-bit library.\n\n. If you want to include support for just-in-time (JIT) compiling, which can\n  give large performance improvements on certain platforms, add --enable-jit to\n  the \"configure\" command. This support is available only for certain hardware\n  architectures. If you try to enable it on an unsupported architecture, there\n  will be a compile time error. 
If in doubt, use --enable-jit=auto, which\n  enables JIT only if the current hardware is supported.\n\n. If you are enabling JIT under an SELinux environment, you may also want to\n  add --enable-jit-sealloc, which enables the use of an executable memory\n  allocator that is compatible with SELinux. Warning: this allocator is\n  experimental! It does not support the fork() operation and may crash when no\n  disk space is available. This option has no effect if JIT is disabled.\n\n. If you do not want to make use of the default support for UTF-8 Unicode\n  character strings in the 8-bit library, UTF-16 Unicode character strings in\n  the 16-bit library, or UTF-32 Unicode character strings in the 32-bit\n  library, you can add --disable-unicode to the \"configure\" command. This\n  reduces the size of the libraries. It is not possible to configure one\n  library with Unicode support, and another without, in the same configuration.\n  It is also not possible to use --enable-ebcdic (see below) with Unicode\n  support, so if this option is set, you must also use --disable-unicode.\n\n  When Unicode support is available, the use of a UTF encoding still has to be\n  enabled by setting the PCRE2_UTF option at run time or starting a pattern\n  with (*UTF). When PCRE2 is compiled with Unicode support, its input can\n  only be either ASCII or UTF-8/16/32, even when running on EBCDIC platforms.\n\n  As well as supporting UTF strings, Unicode support includes support for the\n  \\P, \\p, and \\X sequences that recognize Unicode character properties.\n  However, only a subset of Unicode properties are supported; see the\n  pcre2pattern man page for details. Escape sequences such as \\d and \\w in\n  patterns do not by default make use of Unicode properties, but can be made to\n  do so by setting the PCRE2_UCP option or starting a pattern with (*UCP).\n\n. 
You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any\n  of the preceding, or any of the Unicode newline sequences, or the NUL (zero)\n  character as indicating the end of a line. Whatever you specify at build time\n  is the default; the caller of PCRE2 can change the selection at run time. The\n  default newline indicator is a single LF character (the Unix standard). You\n  can specify the default newline indicator by adding --enable-newline-is-cr,\n  --enable-newline-is-lf, --enable-newline-is-crlf,\n  --enable-newline-is-anycrlf, --enable-newline-is-any, or\n  --enable-newline-is-nul to the \"configure\" command, respectively.\n\n. By default, the sequence \\R in a pattern matches any Unicode line ending\n  sequence. This is independent of the option specifying what PCRE2 considers\n  to be the end of a line (see above). However, the caller of PCRE2 can\n  restrict \\R to match only CR, LF, or CRLF. You can make this the default by\n  adding --enable-bsr-anycrlf to the \"configure\" command (bsr = \"backslash R\").\n\n. In a pattern, the escape sequence \\C matches a single code unit, even in a\n  UTF mode. This can be dangerous because it breaks up multi-code-unit\n  characters. You can build PCRE2 with the use of \\C permanently locked out by\n  adding --enable-never-backslash-C (note the upper case C) to the \"configure\"\n  command. When \\C is allowed by the library, individual applications can lock\n  it out by calling pcre2_compile() with the PCRE2_NEVER_BACKSLASH_C option.\n\n. PCRE2 has a counter that limits the depth of nesting of parentheses in a\n  pattern. This limits the amount of system stack that a pattern uses when it\n  is compiled. The default is 250, but you can change it by setting, for\n  example,\n\n  --with-parens-nest-limit=500\n\n. PCRE2 has a counter that can be set to limit the amount of computing resource\n  it uses when matching a pattern. If the limit is exceeded during a match, the\n  match fails. 
The default is ten million. You can change the default by\n  setting, for example,\n\n  --with-match-limit=500000\n\n  on the \"configure\" command. This is just the default; individual calls to\n  pcre2_match() or pcre2_dfa_match() can supply their own value. There is more\n  discussion in the pcre2api man page (search for pcre2_set_match_limit).\n\n. There is a separate counter that limits the depth of nested backtracking\n  (pcre2_match()) or nested function calls (pcre2_dfa_match()) during a\n  matching process, which indirectly limits the amount of heap memory that is\n  used, and in the case of pcre2_dfa_match() the amount of stack as well. This\n  counter also has a default of ten million, which is essentially \"unlimited\".\n  You can change the default by setting, for example,\n\n  --with-match-limit-depth=5000\n\n  There is more discussion in the pcre2api man page (search for\n  pcre2_set_depth_limit).\n\n. You can also set an explicit limit on the amount of heap memory used by\n  the pcre2_match() and pcre2_dfa_match() interpreters:\n\n  --with-heap-limit=500\n\n  The units are kibibytes (units of 1024 bytes). This limit does not apply when\n  the JIT optimization (which has its own memory control features) is used.\n  There is more discussion on the pcre2api man page (search for\n  pcre2_set_heap_limit).\n\n. In the 8-bit library, the default maximum compiled pattern size is around\n  64 kibibytes. You can increase this by adding --with-link-size=3 to the\n  \"configure\" command. PCRE2 then uses three bytes instead of two for offsets\n  to different parts of the compiled pattern. In the 16-bit library,\n  --with-link-size=3 is the same as --with-link-size=4, which (in both\n  libraries) uses four-byte offsets. Increasing the internal link size reduces\n  performance in the 8-bit and 16-bit libraries. In the 32-bit library, the\n  link size setting is ignored, as 4-byte offsets are always used.\n\n. 
Lookbehind assertions in which one or more branches can match a variable\n  number of characters are supported only if there is a maximum matching length\n  for each top-level branch. There is a limit to this maximum that defaults to\n  255 characters. You can alter this default by a setting such as\n\n  --with-max-varlookbehind=100\n\n  The limit can be changed at runtime by calling pcre2_set_max_varlookbehind().\n  Lookbehind assertions in which every branch matches a fixed number of\n  characters (not necessarily all the same) are not constrained by this limit.\n\n. For speed, PCRE2 uses four tables for manipulating and identifying characters\n  whose code point values are less than 256. By default, it uses a set of\n  tables for ASCII encoding that is part of the distribution. If you specify\n\n  --enable-rebuild-chartables\n\n  a program called pcre2_dftables is compiled and run in the default C locale\n  when you obey \"make\". It builds a source file called pcre2_chartables.c. If\n  you do not specify this option, pcre2_chartables.c is created as a copy of\n  pcre2_chartables.c.dist. See \"Character tables\" below for further\n  information.\n\n. It is possible to compile PCRE2 for use on systems that use EBCDIC as their\n  character code (as opposed to ASCII/Unicode) by specifying\n\n  --enable-ebcdic --disable-unicode\n\n  This automatically implies --enable-rebuild-chartables (see above), in order\n  to ensure that you have the correct default character tables for your system's\n  codepage. There is an exception when you set --enable-ebcdic-ignoring-compiler\n  (see below), which allows using a default set of EBCDIC 1047 character tables\n  rather than forcing use of --enable-rebuild-chartables.\n\n  When PCRE2 is built with EBCDIC support, it always operates in EBCDIC. 
It\n  cannot support both EBCDIC and ASCII or UTF-8/16/32.\n\n  There is a second option, --enable-ebcdic-nl25, which specifies that the code\n  value for the EBCDIC NL character is 0x25 instead of the default 0x15.\n\n  There is a third option, --enable-ebcdic-ignoring-compiler, which disregards\n  the compiler's codepage for determining the numeric value of C character\n  constants such as 'z', and instead forces PCRE2 to use numeric constants for\n  the EBCDIC 1047 codepage instead.\n\n. If you specify --enable-debug, additional debugging code is included in the\n  build. This option is intended for use by the PCRE2 maintainers.\n\n. In environments where valgrind is installed, if you specify\n\n  --enable-valgrind\n\n  PCRE2 will use valgrind annotations to mark certain memory regions as\n  unaddressable. This allows it to detect invalid memory accesses, and is\n  mostly useful for debugging PCRE2 itself.\n\n. In environments where the gcc compiler is used and lcov is installed, if you\n  specify\n\n  --enable-coverage\n\n  the build process implements a code coverage report for the test suite. The\n  report is generated by running \"make coverage\". If ccache is installed on\n  your system, it must be disabled when building PCRE2 for coverage reporting.\n  You can do this by setting the environment variable CCACHE_DISABLE=1 before\n  running \"make\" to build PCRE2. There is more information about coverage\n  reporting in the \"pcre2build\" documentation.\n\n. When JIT support is enabled, pcre2grep automatically makes use of it, unless\n  you add --disable-pcre2grep-jit to the \"configure\" command.\n\n. There is support for calling external programs during matching in the\n  pcre2grep command, using PCRE2's callout facility with string arguments. This\n  support can be disabled by adding --disable-pcre2grep-callout to the\n  \"configure\" command. 
There are two kinds of callout: one that generates\n  output from inbuilt code, and another that calls an external program. The\n  latter has special support for Windows and VMS; otherwise it assumes the\n  existence of the fork() function. This facility can be disabled by adding\n  --disable-pcre2grep-callout-fork to the \"configure\" command.\n\n. The pcre2grep program currently supports only 8-bit data files, and so\n  requires the 8-bit PCRE2 library. It is possible to compile pcre2grep to use\n  libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by\n  specifying one or both of\n\n  --enable-pcre2grep-libz\n  --enable-pcre2grep-libbz2\n\n  Of course, the relevant libraries must be installed on your system.\n\n. The default starting size (in bytes) of the internal buffer used by pcre2grep\n  can be set by, for example:\n\n  --with-pcre2grep-bufsize=51200\n\n  The value must be a plain integer. The default is 20480. The amount of memory\n  used by pcre2grep is actually three times this number, to allow for \"before\"\n  and \"after\" lines. If very long lines are encountered, the buffer is\n  automatically enlarged, up to a fixed maximum size.\n\n. The default maximum size of pcre2grep's internal buffer can be set by, for\n  example:\n\n  --with-pcre2grep-max-bufsize=2097152\n\n  The default is either 1048576 or the value of --with-pcre2grep-bufsize,\n  whichever is the larger.\n\n. It is possible to compile pcre2test so that it links with the libreadline\n  or libedit libraries, by specifying, respectively,\n\n  --enable-pcre2test-libreadline or --enable-pcre2test-libedit\n\n  If this is done, when pcre2test's input is from a terminal, it reads it using\n  the readline() function. This provides line-editing and history facilities.\n  Note that libreadline is GPL-licensed, so if you distribute a binary of\n  pcre2test linked in this way, there may be licensing issues. 
These can be\n  avoided by linking with libedit (which has a BSD licence) instead.\n\n  Enabling libreadline causes the -lreadline option to be added to the\n  pcre2test build. In many operating environments with a system-installed\n  readline library this is sufficient. However, in some environments (e.g. if\n  an unmodified distribution version of readline is in use), it may be\n  necessary to specify something like LIBS=\"-lncurses\" as well. This is\n  because, to quote the readline INSTALL, \"Readline uses the termcap functions,\n  but does not link with the termcap or curses library itself, allowing\n  applications which link with readline the option to choose an appropriate\n  library.\" If you get error messages about missing functions tgetstr, tgetent,\n  tputs, tgetflag, or tgoto, this is the problem, and linking with the ncurses\n  library should fix it.\n\n. The C99 standard defines formatting modifiers z and t for size_t and\n  ptrdiff_t values, respectively. By default, PCRE2 uses these modifiers in\n  environments other than Microsoft Visual Studio versions earlier than 2013\n  when __STDC_VERSION__ is defined and has a value greater than or equal to\n  199901L (indicating C99). However, there is at least one environment that\n  claims to be C99 but does not support these modifiers. If\n  --disable-percent-zt is specified, no use is made of the z or t modifiers.\n  Instead of %td or %zu, %lu is used, with a cast for size_t values.\n\n. There is a special option called --enable-fuzz-support for use by people who\n  want to run fuzzing tests on PCRE2. If set, it causes an extra library\n  called libpcre2-fuzzsupport.a to be built, but not installed. This contains\n  a single function called LLVMFuzzerTestOneInput() whose arguments are a\n  pointer to a string and the length of the string. When called, this function\n  tries to compile the string as a pattern, and if that succeeds, to match\n  it. 
This is done both with no options and with some random options bits that\n  are generated from the string. Setting --enable-fuzz-support also causes an\n  executable called pcre2fuzzcheck-{8,16,32} to be created. This is normally\n  run under valgrind or used when PCRE2 is compiled with address sanitizing\n  enabled. It calls the fuzzing function and outputs information about what it\n  is doing. The input strings are specified by arguments: if an argument\n  starts with \"=\" the rest of it is a literal input string. Otherwise, it is\n  assumed to be a file name, and the contents of the file are the test string.\n\n. Releases before 10.30 could be compiled with --disable-stack-for-recursion,\n  which caused pcre2_match() to use individual blocks on the heap for\n  backtracking instead of recursive function calls (which use the stack). This\n  is now obsolete because pcre2_match() was refactored always to use the heap\n  (in a much more efficient way than before). This option is retained for\n  backwards compatibility, but has no effect other than to output a warning.\n\nThe \"configure\" script builds the following files for the basic C library:\n\n. Makefile             the makefile that builds the library\n. src/config.h         build-time configuration options for the library\n. src/pcre2.h          the public PCRE2 header file\n. pcre2-config         script that shows the building settings such as CFLAGS\n                         that were set for \"configure\"\n. libpcre2-8.pc        )\n. libpcre2-16.pc       ) data for the pkg-config command\n. libpcre2-32.pc       )\n. libpcre2-posix.pc    )\n. libtool              script that builds shared and/or static libraries\n\nVersions of config.h and pcre2.h are distributed in the src directory of PCRE2\ntarballs under the names config.h.generic and pcre2.h.generic. 
These are\nprovided for those who have to build PCRE2 without using \"configure\" or CMake.\nIf you use \"configure\" or CMake, the .generic versions are not used.\n\nThe \"configure\" script also creates config.status, which is an executable\nscript that can be run to recreate the configuration, and config.log, which\ncontains compiler output from tests that \"configure\" runs.\n\nOnce \"configure\" has run, you can run \"make\". This builds whichever of the\nlibraries libpcre2-8, libpcre2-16 and libpcre2-32 are configured, and a test\nprogram called pcre2test. If you enabled JIT support with --enable-jit, another\ntest program called pcre2_jit_test is built as well. If the 8-bit library is\nbuilt, libpcre2-posix, pcre2posix_test, and the pcre2grep command are also\nbuilt. Running \"make\" with the -j option may speed up compilation on\nmultiprocessor systems.\n\nThe command \"make check\" runs all the appropriate tests. Details of the PCRE2\ntests are given below in a separate section of this document. The -j option of\n\"make\" can also be used when running the tests.\n\nYou can use \"make install\" to install PCRE2 into live directories on your\nsystem. 
The following are installed (file names are all relative to the\n\u003cprefix\u003e that is set when \"configure\" is run):\n\n  Commands (bin):\n    pcre2test\n    pcre2grep (if 8-bit support is enabled)\n    pcre2-config\n\n  Libraries (lib):\n    libpcre2-8      (if 8-bit support is enabled)\n    libpcre2-16     (if 16-bit support is enabled)\n    libpcre2-32     (if 32-bit support is enabled)\n    libpcre2-posix  (if 8-bit support is enabled)\n\n  Configuration information (lib/pkgconfig):\n    libpcre2-8.pc\n    libpcre2-16.pc\n    libpcre2-32.pc\n    libpcre2-posix.pc\n\n  Header files (include):\n    pcre2.h\n    pcre2posix.h\n\n  Man pages (share/man/man{1,3}):\n    pcre2grep.1\n    pcre2test.1\n    pcre2-config.1\n    pcre2.3\n    pcre2*.3 (lots more pages, all starting \"pcre2\")\n\n  HTML documentation (share/doc/pcre2/html):\n    index.html\n    *.html (lots more pages, hyperlinked from index.html)\n\n  Text file documentation (share/doc/pcre2):\n    AUTHORS\n    COPYING\n    ChangeLog\n    LICENCE\n    NEWS\n    README\n    SECURITY\n    pcre2.txt         (a concatenation of the man(3) pages)\n    pcre2test.txt     the pcre2test man page\n    pcre2grep.txt     the pcre2grep man page\n    pcre2-config.txt  the pcre2-config man page\n\nIf you want to remove PCRE2 from your system, you can run \"make uninstall\".\nThis removes all the files that \"make install\" installed. However, it does not\nremove any directories, because these are often shared with other programs.\n\n\nRetrieving configuration information\n------------------------------------\n\nRunning \"make install\" installs the command pcre2-config, which can be used to\nrecall information about the PCRE2 configuration and installation. For example:\n\n  pcre2-config --version\n\nprints the version number, and\n\n  pcre2-config --libs8\n\noutputs information about where the 8-bit library is installed. 
This command\ncan be included in makefiles for programs that use PCRE2, saving the programmer\nfrom having to remember too many details. Run pcre2-config with no arguments to\nobtain a list of possible arguments.\n\nThe pkg-config command is another system for saving and retrieving information\nabout installed libraries. Instead of separate commands for each library, a\nsingle command is used. For example:\n\n  pkg-config --libs libpcre2-16\n\nThe data is held in *.pc files that are installed in a directory called\n\u003cprefix\u003e/lib/pkgconfig.\n\n\nShared libraries\n----------------\n\nThe default distribution builds PCRE2 as shared libraries and static libraries,\nas long as the operating system supports shared libraries. Shared library\nsupport relies on the \"libtool\" script which is built as part of the\n\"configure\" process.\n\nThe libtool script is used to compile and link both shared and static\nlibraries. They are placed in a subdirectory called .libs when they are newly\nbuilt. The programs pcre2test and pcre2grep are built to use these uninstalled\nlibraries (by means of wrapper scripts in the case of shared libraries). When\nyou use \"make install\" to install shared libraries, pcre2grep and pcre2test are\nautomatically re-built to use the newly installed shared libraries before being\ninstalled themselves. However, the versions left in the build directory still\nuse the uninstalled libraries.\n\nTo build PCRE2 using static libraries only you must use --disable-shared when\nconfiguring it. For example:\n\n./configure --prefix=/usr/gnu --disable-shared\n\nThen run \"make\" in the usual way. Similarly, you can use --disable-static to\nbuild only shared libraries. 
Note, however, that when you build only static\nlibraries, binary programs such as pcre2test and pcre2grep may still be\ndynamically linked with other libraries (for example, libc) unless you set\nLDFLAGS to --static when running \"configure\".\n\n\nCross-compiling using autotools\n-------------------------------\n\nYou can specify CC and CFLAGS in the normal way to the \"configure\" command, in\norder to cross-compile PCRE2 for some other host. However, you should NOT\nspecify --enable-rebuild-chartables, because if you do, the pcre2_dftables.c\nsource file is compiled and run on the local host, in order to generate the\ninbuilt character tables (the pcre2_chartables.c file). This will probably not\nwork, because pcre2_dftables.c needs to be compiled with the local compiler,\nnot the cross compiler.\n\nWhen --enable-rebuild-chartables is not specified, pcre2_chartables.c is\ncreated by making a copy of pcre2_chartables.c.dist, which is a default set of\ntables that assumes ASCII code. Cross-compiling with the default tables should\nnot be a problem.\n\nIf you need to modify the character tables when cross-compiling, you should\nmove pcre2_chartables.c.dist out of the way, then compile pcre2_dftables.c by\nhand and run it on the local host to make a new version of\npcre2_chartables.c.dist. See the pcre2build section \"Creating character tables\nat build time\" for more details.\n\n\nMaking new tarballs\n-------------------\n\nThe command \"make dist\" creates three PCRE2 tarballs, in tar.gz, tar.bz2, and\nzip formats. 
The command \"make distcheck\" does the same, but then does a trial\nbuild of the new distribution to ensure that it works.\n\nIf you have modified any of the man page sources in the doc directory, you\nshould first run the maint/UpdateAlways script before making a distribution.\nThis script creates the .txt and HTML forms of the documentation from the man\npages.\n\n\nTesting PCRE2\n-------------\n\nTo test the basic PCRE2 library on a Unix-like system, run the RunTest script.\nThere is another script called RunGrepTest that tests the pcre2grep command.\nWhen the 8-bit library is built, a test program for the POSIX wrapper, called\npcre2posix_test, is compiled, and when JIT support is enabled, a test program\ncalled pcre2_jit_test is built. The scripts and the program tests are all run\nwhen you obey \"make check\". For other environments, see the instructions in\nNON-AUTOTOOLS-BUILD.\n\nThe RunTest script runs the pcre2test test program (which is documented in its\nown man page) on each of the relevant testinput files in the testdata\ndirectory, and compares the output with the contents of the corresponding\ntestoutput files. RunTest places its output in directories\ntestoutput{8,16,32}{,-jit,-dfa}. Other files whose names begin with \"test\" are\nused as working files in some tests.\n\nSome tests are relevant only when certain build-time options were selected. For\nexample, the tests for UTF-8/16/32 features are run only when Unicode support\nis available. RunTest outputs a comment when it skips a test.\n\nMany (but not all) of the tests that are not skipped are run twice if JIT\nsupport is available. On the second run, JIT compilation is forced. This\ntesting can be suppressed by putting \"-nojit\" on the RunTest command line.\n\nThe entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit\nlibraries that are enabled. 
If you want to run just one set of tests, call\nRunTest with either the -8, -16 or -32 option.\n\nIf valgrind is installed, you can run the tests under it by putting \"-valgrind\"\non the RunTest command line. To run pcre2test on just one or more specific test\nfiles, give their numbers as arguments to RunTest, for example:\n\n  RunTest 2 7 11\n\nYou can also specify ranges of tests such as 3-6 or 3- (meaning 3 to the\nend), or a number preceded by ~ to exclude a test. For example:\n\n  RunTest 3-15 ~10\n\nThis runs tests 3 to 15, excluding test 10, and just ~13 runs all the tests\nexcept test 13. Whatever order the arguments are in, the tests are always run\nin numerical order.\n\nYou can also call RunTest with the single argument \"list\" to cause it to output\na list of tests.\n\nThe test sequence starts with \"test 0\", which is a special test that has no\ninput file, and whose output is not checked. This is because it will be\ndifferent on different hardware and with different configurations. The test\nexists in order to exercise some of pcre2test's code that would not otherwise\nbe run.\n\nTests 1 and 2 can always be run, as they expect only plain text strings (not\nUTF) and make no use of Unicode properties. The first test file can be fed\ndirectly into the perltest.sh script to check that Perl gives the same results.\nThe only difference you should see is in the first few lines, where the Perl\nversion is given instead of the PCRE2 version. The second set of tests checks\nauxiliary functions, error detection, and run-time flags that are specific to\nPCRE2. It also uses the debugging flags to check some of the internals of\npcre2_compile().\n\nIf you build PCRE2 with a locale setting that is not the standard C locale, the\ncharacter tables may be different (see next paragraph). In some cases, this may\ncause failures in the second set of tests. 
For example, in a locale where the\nisprint() function yields TRUE for characters in the range 128-255, the use of\n[:isascii:] inside a character class defines a different set of characters, and\nthis shows up in this test as a difference in the compiled code, which is being\nlisted for checking. For example, where the comparison test output contains\n[\\x00-\\x7f] the test might contain [\\x00-\\xff], and similarly in some other\ncases. This is not a bug in PCRE2.\n\nTest 3 checks pcre2_maketables(), the facility for building a set of character\ntables for a specific locale and using them instead of the default tables. The\nscript uses the \"locale\" command to check for the availability of the \"fr_FR\",\n\"french\", or \"fr\" locale, and uses the first one that it finds. If the \"locale\"\ncommand fails, or if its output doesn't include \"fr_FR\", \"french\", or \"fr\" in\nthe list of available locales, the third test cannot be run, and a comment is\noutput to say why. If running this test produces an error like this:\n\n  ** Failed to set locale \"fr_FR\"\n\nit means that the given locale is not available on your system, despite being\nlisted by \"locale\". This does not mean that PCRE2 is broken. There are three\nalternative output files for the third test, because three different versions\nof the French locale have been encountered. The test passes if its output\nmatches any one of them.\n\nTests 4 and 5 check UTF and Unicode property support, test 4 being compatible\nwith the perltest.sh script, and test 5 checking PCRE2-specific things.\n\nTests 6 and 7 check the pcre2_dfa_match() alternative matching function, in\nnon-UTF mode and UTF-mode with Unicode property support, respectively.\n\nTest 8 checks some internal offsets and code size features, but it is run only\nwhen Unicode support is enabled. 
The output is different in 8-bit, 16-bit, and\n32-bit modes and for different link sizes, so there are different output files\nfor each mode and link size.\n\nTests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in\n16-bit and 32-bit modes. These are tests that generate different output in\n8-bit mode. Each pair are for general cases and Unicode support, respectively.\n\nTest 13 checks the handling of non-UTF characters greater than 255 by\npcre2_dfa_match() in 16-bit and 32-bit modes.\n\nTest 14 contains some special UTF and UCP tests that give different output for\ndifferent code unit widths.\n\nTest 15 contains a number of tests that must not be run with JIT. They check,\namong other non-JIT things, the match-limiting features of the interpretive\nmatcher.\n\nTest 16 is run only when JIT support is not available. It checks that an\nattempt to use JIT has the expected behaviour.\n\nTest 17 is run only when JIT support is available. It checks JIT complete and\npartial modes, match-limiting under JIT, and other JIT-specific features.\n\nTests 18 and 19 are run only in 8-bit mode. They check the POSIX interface to\nthe 8-bit library, without and with Unicode support, respectively.\n\nTest 20 checks the serialization functions by writing a set of compiled\npatterns to a file, and then reloading and checking them.\n\nTests 21 and 22 test \\C support when the use of \\C is not locked out, without\nand with UTF support, respectively. Test 23 tests \\C when it is locked out.\n\nTests 24 and 25 test the experimental pattern conversion functions, without and\nwith UTF support, respectively.\n\nTest 26 checks Unicode property support using tests that were generated\nautomatically from the Unicode data tables. 
These are the archived version of\nthe tests from Unicode 15.\n\nTest 27 checks Unicode property support using tests that are generated\nautomatically from the currently-used Unicode data tables.\n\nTest 28 tests EBCDIC support, and is only run when PCRE2 is specifically\ncompiled for EBCDIC. Test 29 tests EBCDIC when NL has been configured to be\n0x25.\n\n\nCharacter tables\n----------------\n\nFor speed, PCRE2 uses four tables for manipulating and identifying characters\nwhose code point values are less than 256. By default, a set of tables that is\nbuilt into the library is used. The pcre2_maketables() function can be called\nby an application to create a new set of tables in the current locale. These are\npassed to PCRE2 by calling pcre2_set_character_tables() to put a pointer into a\ncompile context.\n\nThe source file called pcre2_chartables.c contains the default set of tables.\nBy default, this is created as a copy of pcre2_chartables.c.dist, which\ncontains tables for ASCII coding. However, if --enable-rebuild-chartables is\nspecified for ./configure, a new version of pcre2_chartables.c is built by the\nprogram pcre2_dftables (compiled from pcre2_dftables.c), which uses the ANSI C\ncharacter handling functions such as isalnum(), isalpha(), isupper(),\nislower(), etc. to build the table sources. This means that the default C\nlocale that is set for your system will control the contents of these default\ntables. You can change the default tables by editing pcre2_chartables.c and\nthen re-building PCRE2. If you do this, you should take care to ensure that the\nfile does not get automatically re-generated. The best way to do this is to\nmove pcre2_chartables.c.dist out of the way and replace it with your customized\ntables.\n\nWhen the pcre2_dftables program is run as a result of specifying\n--enable-rebuild-chartables, it uses the default C locale that is set on your\nsystem. It does not pay attention to the LC_xxx environment variables. 
In other\nwords, it uses the system's default locale rather than whatever the compiling\nuser happens to have set. If you really do want to build a source set of\ncharacter tables in a locale that is specified by the LC_xxx variables, you can\nrun the pcre2_dftables program by hand with the -L option. For example:\n\n  ./pcre2_dftables -L pcre2_chartables.c.special\n\nThe second argument names the file where the source code for the tables is\nwritten. The first two 256-byte tables provide lower casing and case flipping\nfunctions, respectively. The next table consists of a number of 32-byte bit\nmaps which identify certain character classes such as digits, \"word\"\ncharacters, white space, etc. These are used when building 32-byte bit maps\nthat represent character classes for code points less than 256. The final\n256-byte table has bits indicating various character types, as follows:\n\n    1   white space character\n    2   letter\n    4   lower case letter\n    8   decimal digit\n   16   alphanumeric or '_'\n\nYou can also specify -b (with or without -L) when running pcre2_dftables. This\ncauses the tables to be written in binary instead of as source code. A set of\nbinary tables can be loaded into memory by an application and passed to\npcre2_compile() in the same way as tables created dynamically by calling\npcre2_maketables(). The tables are just a string of bytes, independent of\nhardware characteristics such as endianness. 
This means they can be bundled\nwith an application that runs in different environments, to ensure consistent\nbehaviour.\n\nSee also the pcre2build section \"Creating character tables at build time\".\n\n\nFile manifest\n-------------\n\nThe distribution should contain the files listed below.\n\n(A) Source files for the PCRE2 library functions and their headers are found in\n    the src directory:\n\n  src/pcre2_dftables.c     auxiliary program for building pcre2_chartables.c\n                           when --enable-rebuild-chartables is specified\n\n  src/pcre2_chartables.c.dist  a default set of character tables that assume\n                           ASCII coding; unless --enable-rebuild-chartables is\n                           specified, used by copying to pcre2_chartables.c\n  src/pcre2_chartables.c.ebcdic-1047-{nl15,nl25}  a default set of character\n                           tables for EBCDIC 1047; used if\n                           --enable-ebcdic-ignoring-compiler is specified\n                           without --enable-rebuild-chartables\n\n  src/pcre2posix.c           )\n  src/pcre2_auto_possess.c   )\n  src/pcre2_chkdint.c        )\n  src/pcre2_compile.c        )\n  src/pcre2_compile_cgroup.c )\n  src/pcre2_compile_class.c  )\n  src/pcre2_config.c         )\n  src/pcre2_context.c        )\n  src/pcre2_convert.c        )\n  src/pcre2_dfa_match.c      )\n  src/pcre2_error.c          )\n  src/pcre2_extuni.c         )\n  src/pcre2_find_bracket.c   )\n  src/pcre2_jit_compile.c    )\n  src/pcre2_maketables.c     ) sources for the functions in the library,\n  src/pcre2_match.c          )   and some internal functions that they use\n  src/pcre2_match_data.c     )\n  src/pcre2_match_next.c     )\n  src/pcre2_newline.c        )\n  src/pcre2_ord2utf.c        )\n  src/pcre2_pattern_info.c   )\n  src/pcre2_script_run.c     )\n  src/pcre2_serialize.c      )\n  src/pcre2_string_utils.c   )\n  src/pcre2_study.c          )\n  src/pcre2_substitute.c     )\n  
src/pcre2_substring.c      )\n  src/pcre2_tables.c         )\n  src/pcre2_ucd.c            )\n  src/pcre2_valid_utf.c      )\n  src/pcre2_xclass.c         )\n\n  src/pcre2_fuzzsupport.c  function for (optional) fuzzing support\n\n  src/config.h.in          template for config.h, when built by \"configure\"\n  src/pcre2.h.in           template for pcre2.h when built by \"configure\"\n  src/pcre2posix.h         header for the external POSIX wrapper API\n  src/pcre2_compile.h      header for internal use\n  src/pcre2_internal.h     header for internal use\n  src/pcre2_intmodedep.h   a mode-specific internal header\n  src/pcre2_jit_char_inc.h header used by JIT\n  src/pcre2_jit_match_inc.h header used by JIT\n  src/pcre2_jit_misc_inc.h header used by JIT\n  src/pcre2_jit_neon_inc.h header used by JIT\n  src/pcre2_jit_simd_inc.h header used by JIT\n  src/pcre2_printint_inc.h debugging function that is used by pcre2test\n  src/pcre2_ucp.h          header for Unicode property handling\n  src/pcre2_ucptables_inc.h header with Unicode data tables\n  src/pcre2_util.h         header for internal utils\n\n  deps/sljit/sljit_src/*   source files for the JIT compiler\n\n(B) Source files for programs that use PCRE2:\n\n  src/pcre2demo.c          simple demonstration of coding calls to PCRE2\n  src/pcre2grep.c          source of a grep utility that uses PCRE2\n  src/pcre2test.c          comprehensive test program\n  src/pcre2_jit_test.c     JIT test program\n  src/pcre2posix_test.c    POSIX wrapper API test program\n\n(C) Auxiliary files:\n\n  AUTHORS.md               information about the authors of PCRE2\n  ChangeLog                log of changes to the code\n  HACKING                  some notes about the internals of PCRE2\n  INSTALL                  generic installation instructions\n  LICENCE.md               conditions for the use of PCRE2\n  COPYING                  the same, using GNU's standard name\n  SECURITY.md              information on reporting vulnerabilities\n  
Makefile.in              ) template for Unix Makefile, which is built by\n                           )   \"configure\"\n  Makefile.am              ) the automake input that was used to create\n                           )   Makefile.in\n  NEWS                     important changes in this release\n  NON-AUTOTOOLS-BUILD      notes on building PCRE2 without using autotools\n  README                   this file\n  RunTest                  a Unix shell script for running tests\n  RunGrepTest              a Unix shell script for pcre2grep tests\n  RunTest.bat              a Windows batch file for running tests\n  RunGrepTest.bat          a Windows batch file for pcre2grep tests\n  aclocal.m4               m4 macros (generated by \"aclocal\")\n  m4/*                     m4 macros (used by autoconf)\n  configure                a configuring shell script (built by autoconf)\n  configure.ac             ) the autoconf input that was used to build\n                           )   \"configure\" and config.h\n  doc/*.3                  man page sources for PCRE2\n  doc/*.1                  man page sources for pcre2grep and pcre2test\n  doc/html/*               HTML documentation\n  doc/pcre2.txt            plain text version of the man pages\n  doc/pcre2-config.txt     plain text documentation of pcre2-config script\n  doc/pcre2grep.txt        plain text documentation of grep utility program\n  doc/pcre2test.txt        plain text documentation of test program\n  libpcre2-8.pc.in         template for libpcre2-8.pc for pkg-config\n  libpcre2-16.pc.in        template for libpcre2-16.pc for pkg-config\n  libpcre2-32.pc.in        template for libpcre2-32.pc for pkg-config\n  libpcre2-posix.pc.in     template for libpcre2-posix.pc for pkg-config\n  ar-lib                   )\n  config.guess             )\n  config.sub               )\n  depcomp                  ) helper tools generated by libtool and\n  compile                  )   automake, used internally by ./configure\n  
install-sh               )\n  ltmain.sh                )\n  missing                  )\n  test-driver              )\n  perltest.sh              Script for running a Perl test program\n  pcre2-config.in          source of script which retains PCRE2 information\n  testdata/testinput*      test data for main library tests\n  testdata/testoutput*     expected test results\n  testdata/grep*           input and output for pcre2grep tests\n  testdata/*               other supporting test files\n  src/libpcre2-8.sym       )\n  src/libpcre2-16.sym      ) symbol version scripts for the GNU and Sun linkers\n  src/libpcre2-32.sym      )\n  src/libpcre2-posix.sym   )\n\n(D) Auxiliary files for CMake support\n\n  cmake/COPYING-CMAKE-SCRIPTS\n  cmake/FindEditline.cmake\n  cmake/FindReadline.cmake\n  cmake/pcre2-config-version.cmake.in\n  cmake/pcre2-config.cmake.in\n  cmake/PCRE2CheckLinkerFlag.cmake\n  src/config-cmake.h.in\n  CMakeLists.txt\n\n(E) Auxiliary files for building PCRE2 \"by hand\"\n\n  src/pcre2.h.generic     ) a version of the public PCRE2 header file\n                          )   for use in non-\"configure\" environments\n  src/config.h.generic    ) a version of config.h for use in non-\"configure\"\n                          )   environments\n\n(F) Auxiliary files for building PCRE2 using other build systems\n\n  BUILD.bazel             ) files used by the Bazel\n  MODULE.bazel            )   build system\n  build.zig               file used by zig's build system\n\n(G) Auxiliary files for building PCRE2 under OpenVMS\n\n  vms/configure.com       )\n  vms/openvms_readme.txt  ) These files were contributed by a PCRE2 user.\n  vms/pcre2.h_patch       )\n  vms/stdint.h            )\n\n==============================\nLast updated: 18 December 
2024\n==============================\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPCRE2Project%2Fpcre2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPCRE2Project%2Fpcre2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPCRE2Project%2Fpcre2/lists"}