Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/asg017/sqlite-regex
A fast regular expression SQLite extension, written in Rust
https://github.com/asg017/sqlite-regex
sqlite sqlite-extension
Last synced: 5 days ago
JSON representation
A fast regular expression SQLite extension, written in Rust
- Host: GitHub
- URL: https://github.com/asg017/sqlite-regex
- Owner: asg017
- License: apache-2.0
- Created: 2022-11-04T18:31:01.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-22T07:46:31.000Z (8 months ago)
- Last Synced: 2024-10-18T06:05:01.801Z (2 months ago)
- Topics: sqlite, sqlite-extension
- Language: Rust
- Homepage:
- Size: 150 KB
- Stars: 171
- Watchers: 3
- Forks: 6
- Open Issues: 16
-
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
Awesome Lists containing this project
- awesome-sqlite - `sqlite-regex` - `REGEXP`support and other regular expression functions (People / As Main Database)
README
# sqlite-regex
A fast and performant SQLite extension for regular expressions. Based on [`sqlite-loadable-rs`](https://github.com/asg017/sqlite-loadable-rs), and the [regex crate](https://crates.io/crates/regex).
See [_Introducing sqlite-regex: The fastest Regular Expression Extension for SQLite_](https://observablehq.com/@asg017/introducing-sqlite-regex) (Jan 2023) for more details!
If your company or organization finds this library useful, consider [supporting my work](#supporting)!
![](./benchmarks/dates.png)
## Usage
```sql
.load ./regex0
select 'foo' regexp 'f';```
**Find all occurrences of a pattern in a string**
```sql
select regex_find(
'[0-9]{3}-[0-9]{3}-[0-9]{4}',
'phone: 111-222-3333'
);
-- '111-222-3333'select rowid, *
from regex_find_all(
'\b\w{13}\b',
'Retroactively relinquishing remunerations is reprehensible.'
);
/*
┌───────┬───────┬─────┬───────────────┐
│ rowid │ start │ end │ match │
├───────┼───────┼─────┼───────────────┤
│ 0 │ 0 │ 13 │ Retroactively │
│ 1 │ 14 │ 27 │ relinquishing │
│ 2 │ 28 │ 41 │ remunerations │
│ 3 │ 45 │ 58 │ reprehensible │
└───────┴───────┴─────┴───────────────┘
*/
```**Extract capture group values by index or name**
```sql
select
regex_capture(captures, 0) as entire_match,
regex_capture(captures, 'title') as title,
regex_capture(captures, 'year') as year
from regex_captures(
regex("'(?P[^']+)'\s+\((?P\d{4})\)"),
"'Citizen Kane' (1941), 'The Wizard of Oz' (1939), 'M' (1931)."
);
/*
┌───────────────────────────┬──────────────────┬──────┐
│ entire_match │ title │ year │
├───────────────────────────┼──────────────────┼──────┤
│ 'Citizen Kane' (1941) │ Citizen Kane │ 1941 │
│ 'The Wizard of Oz' (1939) │ The Wizard of Oz │ 1939 │
│ 'M' (1931) │ M │ 1931 │
└───────────────────────────┴──────────────────┴──────┘
*/
```**Use RegexSets to match a string on multiple patterns in linear time**
```sql
select regexset_is_match(
regexset(
"bar",
"foo",
"barfoo"
),
'foobar'
)
```**Split the string on the given pattern delimiter**
```sql
select rowid, *
from regex_split('[ \t]+', 'a b c d e');
/*
┌───────┬──────┐
│ rowid │ item │
├───────┼──────┤
│ 0 │ a │
│ 1 │ b │
│ 2 │ c │
│ 3 │ d │
│ 4 │ e │
└───────┴──────┘
*/
```**Replace occurrences of a pattern with another string**
```sql
select regex_replace(
'(?P[^,\s]+),\s+(?P\S+)',
'Springsteen, Bruce',
'$first $last'
);
-- 'Bruce Springsteen'select regex_replace_all('a', 'abc abc', '');
-- 'bc bc'
```## Documentation
See [`docs.md`](./docs.md) for a full API reference.
## Installing
| Language | Install | |
| -------------- | -------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Python | `pip install sqlite-regex` | [![PyPI](https://img.shields.io/pypi/v/sqlite-regex.svg?color=blue&logo=python&logoColor=white)](https://pypi.org/project/sqlite-regex/) |
| Datasette | `datasette install datasette-sqlite-regex` | [![Datasette](https://img.shields.io/pypi/v/datasette-sqlite-regex.svg?color=B6B6D9&label=Datasette+plugin&logoColor=white&logo=python)](https://datasette.io/plugins/datasette-sqlite-regex) |
| Node.js | `npm install sqlite-regex` | [![npm](https://img.shields.io/npm/v/sqlite-regex.svg?color=green&logo=nodedotjs&logoColor=white)](https://www.npmjs.com/package/sqlite-regex) |
| Deno | [`deno.land/x/sqlite_regex`](https://deno.land/x/sqlite_regex) | [![deno.land/x release](https://img.shields.io/github/v/release/asg017/sqlite-regex?color=fef8d2&include_prereleases&label=deno.land%2Fx&logo=deno)](https://deno.land/x/sqlite_regex) |
| Ruby | `gem install sqlite-regex` | ![Gem](https://img.shields.io/gem/v/sqlite-regex?color=red&logo=rubygems&logoColor=white) |
| Github Release | | ![GitHub tag (latest SemVer pre-release)](https://img.shields.io/github/v/tag/asg017/sqlite-regex?color=lightgrey&include_prereleases&label=Github+release&logo=github) |
| Rust | `cargo add sqlite-regex` | [![Crates.io](https://img.shields.io/crates/v/sqlite-regex?logo=rust)](https://crates.io/crates/sqlite-regex) |The [Releases page](https://github.com/asg017/sqlite-regex/releases) contains pre-built binaries for Linux x86_64, MacOS, and Windows.
### As a loadable extension
If you want to use `sqlite-regex` as a [Runtime-loadable extension](https://www.sqlite.org/loadext.html), Download the `regex0.dylib` (for MacOS), `regex0.so` (Linux), or `regex0.dll` (Windows) file from a release and load it into your SQLite environment.
> **Note:**
> The `0` in the filename (`regex0.dylib`/ `regex0.so`/`regex0.dll`) denotes the major version of `sqlite-regex`. Currently `sqlite-regex` is pre v1, so expect breaking changes in future versions.For example, if you are using the [SQLite CLI](https://www.sqlite.org/cli.html), you can load the library like so:
```sql
.load ./regex0
select regex_version();
-- v0.1.0
```Or in Python, using the builtin [sqlite3 module](https://docs.python.org/3/library/sqlite3.html):
```python
import sqlite3
con = sqlite3.connect(":memory:")
con.enable_load_extension(True)
con.load_extension("./regex0")
print(con.execute("select regex_version()").fetchone())
# ('v0.1.0',)
```Or in Node.js using [better-sqlite3](https://github.com/WiseLibs/better-sqlite3):
```javascript
const Database = require("better-sqlite3");
const db = new Database(":memory:");
db.loadExtension("./regex0");
console.log(db.prepare("select regex_version()").get());
// { 'regex_version()': 'v0.1.0' }
```Or with [Datasette](https://datasette.io/):
```
datasette data.db --load-extension ./regex0
```## Supporting
I (Alex 👋🏼) spent a lot of time and energy on this project and [many other open source projects](https://github.com/asg017?tab=repositories&q=&type=&language=&sort=stargazers). If your company or organization uses this library (or you're feeling generous), then please [consider supporting my work](https://alexgarcia.regex/work.html), or share this project with a friend!
## See also
- [sqlite-xsv](https://github.com/asg017/sqlite-xsv), A SQLite extension for working with CSVs
- [sqlite-loadable](https://github.com/asg017/sqlite-loadable-rs), A framework for writing SQLite extensions in Rust
- [sqlite-http](https://github.com/asg017/sqlite-http), A SQLite extension for making HTTP requests