https://github.com/postgrespro/tsexact
PostgreSQL fulltext search addon
https://github.com/postgrespro/tsexact
Last synced: 6 days ago
JSON representation
PostgreSQL fulltext search addon
- Host: GitHub
- URL: https://github.com/postgrespro/tsexact
- Owner: postgrespro
- Created: 2014-11-10T07:33:40.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2018-03-21T16:11:20.000Z (about 7 years ago)
- Last Synced: 2025-04-24T10:48:44.040Z (7 days ago)
- Language: C
- Size: 12.7 KB
- Stars: 12
- Watchers: 4
- Forks: 3
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
TSExact – PostgreSQL fulltext search addon
==========================================Introduction
------------TSExact – is a PostgreSQL extension with various helper function for fulltext
search. Basically TSExact contains functions which emulate phrase search on
PostgreSQL versions 9.5 and lower. If you're using PostgreSQL 9.6 and higher
you should consider using builtin phrase search rather than TSExact.Authors
-------* Alexander Korotkov , Postgres Professional, Moscow, Russia
Availability
------------TSExact is released as an extension and not available in default PostgreSQL
installation. It is available from
[github](https://github.com/postgrespro/tsexact)
under the same license as
[PostgreSQL](http://www.postgresql.org/about/licence/)
and supports PostgreSQL 9.0+.Installation
------------Before build and install TSExact you should ensure following:
* PostgreSQL version is 9.0 or higher.
* You have development package of PostgreSQL installed or you built
PostgreSQL from source.
* Your PATH variable is configured so that pg\_config command available.
Typical installation procedure may look like this:
$ git clone https://github.com/postgrespro/tsexact.git
$ cd tsexact
$ make USE_PGXS=1
$ sudo make USE_PGXS=1 install
$ make USE_PGXS=1 installcheck
$ psql DB -c "CREATE EXTENSION tsexact;"Usage
-----TSExact offers various helper functions which are listed in the table below. In particular these functions could be used for simple fulltext search.
| Function | Return type | Description |
| ----------------------------------------------------------------- | ----------- | ---------------------------------------------------------- |
| ts_exact_match(document tsvector, fragment tsvector) | bool | Check if given fragment is present in document |
| ts_exact_match(document tsvector, fragment tsvector, weight text) | bool | Check if given fragment is present in document with weight |
| ts_squeeze(document tsvector) | tsvector | Remove empty positions from document |
| setweight(query tsquery, weight text) | tsquery | Assign weight for each lexeme in tsquery |
| poslen(documents tsvector) | integer | Return total number of positions in document |`ts_exact_match(tsvector, tsvector)` function checks if given fragment appears in given document at some offset.
# SELECT ts_exact_match('cat:3 fat:2 sad:4'::tsvector, 'cat:2 fat:1 sad:4'::tsvector);
ts_exact_match
----------------
f
(1 row)# SELECT ts_exact_match('cat:3 fat:2 sad:5'::tsvector, 'cat:2 fat:1 sad:4'::tsvector);
ts_exact_match
----------------
t
(1 row)`ts_exact_match(tsvector, tsvector)` ignores lexemes weights. `ts_exact_match(tsvector, tsvector, text)` only finds fragments in given weight of document. Weights of fragment are always ignored.
# SELECT ts_exact_match('cat:3 fat:2 sad:5'::tsvector, 'cat:2 fat:1 sad:4'::tsvector, 'ABC');
ts_exact_match
----------------
f
(1 row)# SELECT ts_exact_match('cat:3A fat:2B sad:5C'::tsvector, 'cat:2 fat:1 sad:4'::tsvector, 'ABC');
ts_exact_match
----------------
t
(1 row)Since tsvectors could contain gaps in position numbering it's suitable to remove gaps using `ts_squeeze(tsvector)`.
# SELECT ts_squeeze('cat:2,9 fat:1,6 sad:4'::tsvector);
ts_squeeze
-----------------------------
'cat':2,5 'fat':1,4 'sad':3
(1 row)Fulltext search indexes doesn't support `ts_exact_match()` functions. Thus, it's useful to combine `ts_exact_match()` with `tsvector @@ tsquery` expression in order to use indexed search. Therefore, complete example of phrase search may be following.
-- Calculate tsvector using ts_squeeze() function in order to remove gaps in
-- lexemes offsets.
UPDATE tt SET ti =
ts_squeeze(
setweight(to_tsvector(coalesce(title,'')), 'A') ||
setweight(to_tsvector(coalesce(keyword,'')), 'B') ||
setweight(to_tsvector(coalesce(abstract,'')), 'C') ||
setweight(to_tsvector(coalesce(body,'')), 'D'));
-- Search for phrase. "tsvector @@ tsquery" operator is used for phrase search,
-- ts_exact_match() function is used to recheck an exact phrase match.
SELECT *
FROM tt
WHERE tt.ti @@ plainto_tsquery('fat rat') AND
ts_exact_match(tt.ti, ts_squeeze(to_tsvector('fat rat')));`setweight(tsquery, text)` assigns given weight to each lexeme of tsquery.
# SELECT setweight('fat:A & (cat:B | rat:C)'::tsquery, 'CD');
setweight
------------------------------------
'fat':CD & ( 'cat':CD | 'rat':CD )
(1 row)`poslen(tsvector)` returns total number of lexeme positions in tsvector.
# SELECT poslen('cat:3,4,5,9,10 fat:1,2,6,7,8'::tsvector);
poslen
--------
10
(1 row)