https://github.com/petere/pgpcre
PCRE functions for PostgreSQL
pcre perl postgresql postgresql-extension regular-expression
- Host: GitHub
- URL: https://github.com/petere/pgpcre
- Owner: petere
- License: other
- Created: 2013-02-09T21:20:22.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2024-03-21T15:46:38.000Z (9 months ago)
- Last Synced: 2024-04-16T01:25:30.636Z (8 months ago)
- Topics: pcre, perl, postgresql, postgresql-extension, regular-expression
- Language: C
- Size: 23.4 KB
- Stars: 21
- Watchers: 4
- Forks: 4
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# pgpcre
[![Build Status](https://secure.travis-ci.org/petere/pgpcre.png)](http://travis-ci.org/petere/pgpcre)
This is a module for PostgreSQL that exposes Perl-compatible regular expressions (PCRE) functionality as functions and operators. It is based on the popular [PCRE library](http://www.pcre.org/).
## Installation
You need to have libpcre installed. pkg-config will be used to find it.
To build and install this module:

    make
    make install

or selecting a specific PostgreSQL installation:

    make PG_CONFIG=/some/where/bin/pg_config
    make PG_CONFIG=/some/where/bin/pg_config install

And finally inside the database:

    CREATE EXTENSION pgpcre;
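
Once the extension has been created, a quick sanity check can confirm that it is registered and that the `pcre` type is usable. This is only a sketch: `pg_extension` is the standard PostgreSQL catalog, and the version string it reports depends on your installation.

```sql
-- List the installed extension and its version
-- (pg_extension is a standard PostgreSQL system catalog).
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pgpcre';

-- Smoke test using the example from this README:
-- it should return true if the pcre type and the ~ operator are available.
SELECT 'foo' ~ pcre 'fo+';
```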
## Using
A regular expression is a separate data type, named `pcre`. (This is different from how the built-in regular expressions in PostgreSQL work, which are simply values of type `text`.)
The supported regular expressions are documented on the [pcrepattern(3)](http://linux.die.net/man/3/pcrepattern) man page.
### Basic matching
Boolean operators are available for checking whether a pattern matches a string: `~` tests for a match and `!~` for a non-match. Each returns true or false, and returns null only when one of the operands is null.
Examples:

    SELECT 'foo' ~ pcre 'fo+';
    SELECT 'bar' !~ pcre 'fo+';

You can also write it the other way around:

    SELECT pcre 'fo+' ~ 'foo';
    SELECT pcre 'fo+' !~ 'bar';

This can be handy for writing things like

    SELECT pcre 'fo+' ~ ANY(ARRAY['foo', 'bar']);

For Perl nostalgia, you can also use this operator:

    SELECT 'foo' =~ pcre 'fo+';

And if this operator is unique (which it should be, unless you have
something else installed that uses it), you can also write:

    SELECT 'foo' =~ 'fo+';

(The `~` operator, by contrast, is not unique, of course, because it is used by the built-in regular expressions.)
To get case-insensitive matching, set the appropriate option in the pattern, for example:

    SELECT 'FOO' ~ pcre '(?i)fo+';
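
The same operators can be used in a `WHERE` clause to filter rows. The following is a minimal sketch; the table `words` and its contents are hypothetical and exist only for illustration.

```sql
-- Hypothetical table, for illustration only.
CREATE TEMPORARY TABLE words (word text);
INSERT INTO words VALUES ('foo'), ('FOOO'), ('bar'), (NULL);

-- Rows whose text matches the pattern, ignoring case
-- (here 'foo' and 'FOOO').
SELECT word FROM words WHERE word ~ pcre '(?i)^fo+$';

-- Rows that do not match (here 'bar'). The NULL row appears in
-- neither result, because the operators return null for a null operand
-- and WHERE keeps only rows where the condition is true.
SELECT word FROM words WHERE word !~ pcre '(?i)^fo+$';
```

Writing the pattern as `pcre '...'` keeps the query on the pgpcre operator; a bare string on the right-hand side of `~` would be resolved to the built-in regular expression operator instead.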
### Extracting the matched string
To extract the substring that was matched by the pattern, use the
function `pcre_match`. It returns either a value of type text, or
null if the pattern did not match. Examples:

    SELECT pcre_match('fo+', 'foobar'); --> 'foo'
    SELECT pcre_match('fo+', 'barbar'); --> NULL

There is no support for extracting multiple matches of a pattern in a
string, because PCRE does not (easily) support that.

### Extracting captured substrings
Captured substrings (parenthesized subexpressions) are extracted using
the function `pcre_captured_substrings`. It returns either an array
of text, or null if the pattern did not match. Examples:

    SELECT pcre_captured_substrings('(fo+)(b..)', 'foobar'); --> ARRAY['foo','bar']
    SELECT pcre_captured_substrings('(fo+)(b..)', 'abcdef'); --> NULL

Note that elements of the array can be null if a substring was not used, for example:

    SELECT pcre_captured_substrings('(a|(z))(bc)', 'abc'); --> ARRAY['a',NULL,'bc']
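
Because the result is an ordinary `text[]`, individual captures can be picked out with array subscripts. The sketch below assumes the array uses PostgreSQL's default lower bound of 1 and that `standard_conforming_strings` is on (the PostgreSQL default), so backslashes in the pattern are taken literally. The table `releases` is hypothetical.

```sql
-- Hypothetical table of version tags, for illustration only.
CREATE TEMPORARY TABLE releases (tag text);
INSERT INTO releases VALUES ('v1.2'), ('v10.0'), ('nightly');

-- Subscript the returned text[] to get each captured group.
-- For 'nightly' the pattern does not match, the function returns NULL,
-- and subscripting NULL yields NULL in both columns.
SELECT tag,
       (pcre_captured_substrings(pcre '^v(\d+)\.(\d+)$', tag))[1] AS major,
       (pcre_captured_substrings(pcre '^v(\d+)\.(\d+)$', tag))[2] AS minor
FROM releases;
```

The captures are of type `text`; cast them (for example `::int`) if numeric comparison or sorting is needed.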
### Storing regular expressions
You can store regular expression values of type `pcre` in tables, like
any other data. Note, however, that the binary representation of the
`pcre` values contains the compiled regular expression, which is tied
to the version of the PCRE library. If you upgrade the PCRE library
and use a compiled value created by a different version, things might
not work or even crash (according to the PCRE documentation; I don't
know how likely that is). pgpcre will warn if you attempt to use a
value that was compiled by a different version of the library. If
that happens, it is advisable to recompile and rewrite all stored
`pcre` values by doing something like

    UPDATE ... SET pcre_col = pcre_col::text::pcre

(To be clear, storing regular expressions in tables is not a typical
use. Normally, you store text in tables and match it against regular
expressions provided by your application.)

## Discussion
Some possible advantages over the regular expression support built into PostgreSQL:
- richer pattern language, more familiar to Perl and Python programmers
- complete Unicode support
- saner operators and functions

Some disadvantages:
- no repeated matching (`'g'` flag)
- no index optimization

You can work around the lack of index optimization by manually augmenting queries like

    column =~ '^foo'

with

    AND column ~>=~ 'foo' AND column ~<~ 'fop'

and creating the appropriate `text_pattern_ops` index as you would for the built-in pattern matching.
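
Put together, the workaround might look like the following sketch. The table and index names are hypothetical. The upper bound `'fop'` is the anchored prefix `'foo'` with its last character incremented, so the two range conditions cover exactly the strings that start with `foo`.

```sql
-- Hypothetical table and index names, for illustration only.
CREATE TEMPORARY TABLE items (name text);
CREATE INDEX items_name_pattern_idx ON items (name text_pattern_ops);

-- The PCRE match does the real filtering. The two range conditions
-- use the built-in text pattern comparison operators, which the
-- text_pattern_ops index supports, so the planner can narrow the
-- scan to the 'foo' prefix range before the pattern is evaluated.
SELECT name
FROM items
WHERE name =~ pcre '^foo'
  AND name ~>=~ 'foo'
  AND name ~<~ 'fop';
```

This only helps for patterns anchored at the start with a literal prefix. Whether the planner actually chooses the index depends on table size and statistics, which `EXPLAIN` can show.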