{"id":34736666,"url":"https://github.com/grindlemire/go-lucene","last_synced_at":"2026-04-28T02:06:02.559Z","repository":{"id":64887715,"uuid":"569523408","full_name":"grindlemire/go-lucene","owner":"grindlemire","description":"A pure go lucene parser with no dependencies.","archived":false,"fork":false,"pushed_at":"2026-01-15T14:20:34.000Z","size":233,"stargazers_count":46,"open_issues_count":1,"forks_count":16,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-01-15T18:15:29.555Z","etag":null,"topics":["go","lucene","parser","search"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/grindlemire.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-11-23T02:37:55.000Z","updated_at":"2026-01-15T14:19:35.000Z","dependencies_parsed_at":"2023-12-05T16:34:04.976Z","dependency_job_id":"6465161c-9b5f-48ae-af8d-9c16a9e281c7","html_url":"https://github.com/grindlemire/go-lucene","commit_stats":null,"previous_names":["grindlemire/go-search"],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/grindlemire/go-lucene","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grindlemire%2Fgo-lucene","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grindlemire%2Fgo-lucene/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grindlemire%2Fgo-lucene/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grindlemire%2Fgo-lucene/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/grindlemire","download_url":"https://codeload.github.com/grindlemire/go-lucene/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grindlemire%2Fgo-lucene/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28480081,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["go","lucene","parser","search"],"created_at":"2025-12-25T03:42:39.445Z","updated_at":"2026-04-28T02:06:02.551Z","avatar_url":"https://github.com/grindlemire.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# go-lucene\n\n[![Go Reference](https://pkg.go.dev/badge/github.com/grindlemire/go-lucene.svg)](https://pkg.go.dev/github.com/grindlemire/go-lucene)\n\nParse [Lucene](https://lucene.apache.org/core/9_4_2/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package.description) queries and turn them into SQL. No dependencies. PostgreSQL, SQLite, MySQL, and MariaDB work out of the box, and you can plug in your own dialect for anything else.\n\n```go\nquery := `name:\"John Doe\" AND age:[25 TO 35]`\nsql, params, err := lucene.ToParameterizedPostgres(query)\n// sql:    ((\"name\" = $1) AND (\"age\" \u003e= $2 AND \"age\" \u003c= $3))\n// params: [\"John Doe\", 25, 35]\n```\n\n## Contents\n\n- [Install](#install)\n- [Usage](#usage)\n- [Operator reference](#operator-reference)\n- [SQLite](#sqlite)\n- [MySQL](#mysql)\n- [Custom drivers](#custom-drivers)\n\n## Install\n\n```bash\ngo get github.com/grindlemire/go-lucene\n```\n\n## Usage\n\n### Parameterized queries (recommended)\n\nUse the parameterized form for anything that touches user input. It returns a SQL string with placeholders and a separate slice of values that your driver will bind safely.\n\n```go\nsql, params, err := lucene.ToParameterizedPostgres(`color:red AND type:\"gala\"`)\n// sql:    (\"color\" = $1) AND (\"type\" = $2)\n// params: [\"red\", \"gala\"]\n\nrows, err := db.Query(sql, params...)\n```\n\nSQLite has an equivalent that uses `?` placeholders:\n\n```go\nsql, params, err := lucene.ToParameterizedSQLite(`color:red AND type:\"gala\"`)\n// sql:    (\"color\" = ?) AND (\"type\" = ?)\n// params: [\"red\", \"gala\"]\n```\n\nMySQL also uses `?` placeholders and backtick-quoted identifiers:\n\n```go\nsql, params, err := lucene.ToParameterizedMySQL(`color:red AND type:\"gala\"`)\n// sql:    (`color` = ?) AND (`type` = ?)\n// params: [\"red\", \"gala\"]\n```\n\n### Inline values\n\nIf you don't need parameter binding (for example, when generating SQL for inspection), `ToPostgres`, `ToSQLite`, and `ToMySQL` embed values directly into the string:\n\n```go\nsql, err := lucene.ToPostgres(`name:\"John Doe\" AND age:[25 TO 35]`)\n// ((\"name\" = 'John Doe') AND (\"age\" \u003e= 25 AND \"age\" \u003c= 35))\n```\n\n### Default field\n\nWhen a term has no field prefix, the parser can fall back to one you supply:\n\n```go\nsql, err := lucene.ToPostgres(`red OR green`, lucene.WithDefaultField(\"color\"))\n// (\"color\" = 'red') OR (\"color\" = 'green')\n```\n\n## Operator reference\n\nOutput below is Postgres. See [SQLite](#sqlite) and [MySQL](#mysql) for where those drivers differ.\n\n| Lucene | SQL | Notes |\n|---|---|---|\n| `field:value` | `\"field\" = 'value'` | Exact match |\n| `field:\"phrase with spaces\"` | `\"field\" = 'phrase with spaces'` | Quoted phrase |\n| `a:1 AND b:2` | `(\"a\" = 1) AND (\"b\" = 2)` | Boolean AND |\n| `a:1 OR b:2` | `(\"a\" = 1) OR (\"b\" = 2)` | Boolean OR |\n| `NOT field:value` | `NOT(\"field\" = 'value')` | Negation |\n| `+field:value` | `\"field\" = 'value'` | Required term (same as no prefix) |\n| `-field:value` | `NOT(\"field\" = 'value')` | Prohibited term |\n| `field:[min TO max]` | `\"field\" \u003e= min AND \"field\" \u003c= max` | Inclusive range |\n| `field:{min TO max}` | `\"field\" \u003e min AND \"field\" \u003c max` | Exclusive range |\n| `field:[min TO *]` | `\"field\" \u003e= min` | Open upper bound |\n| `field:[* TO max]` | `\"field\" \u003c= max` | Open lower bound |\n| `field:*` | `\"field\" SIMILAR TO '%'` | Match anything non-null |\n| `field:pat*` | `\"field\" SIMILAR TO 'pat%'` | Wildcard suffix |\n| `field:pat?` | `\"field\" SIMILAR TO 'pat_'` | Single-character wildcard |\n| `field:/regex/` | `\"field\" ~ 'regex'` | Regular expression |\n| `(a:1 OR b:2) AND c:3` | `((\"a\" = 1) OR (\"b\" = 2)) AND (\"c\" = 3)` | Grouping |\n\n## SQLite\n\nSQLite lacks direct equivalents for a few Postgres operators, so the SQLite driver renders them differently:\n\n| Lucene | Postgres | SQLite |\n|---|---|---|\n| `field:*` | `\"field\" SIMILAR TO '%'` | `\"field\" IS NOT NULL` |\n| `field:pat*` | `\"field\" SIMILAR TO 'pat%'` | `\"field\" GLOB 'pat*'` |\n| `field:pat?` | `\"field\" SIMILAR TO 'pat_'` | `\"field\" GLOB 'pat?'` |\n| `field:/regex/` | `\"field\" ~ 'regex'` | `\"field\" REGEXP 'regex'` |\n| parameters | `$1, $2, ...` | `?` |\n\n### Things to watch for\n\n**GLOB is case-sensitive** and uses Unix glob syntax. Lucene's `*` and `?` map cleanly onto GLOB's `*` and `?`.\n\n**GLOB has no escape character.** To match a literal `*` or `?`, use the regex form `field:/.../`.\n\n**GLOB has no alternation.** A pattern like `field:*(a|b)*` matches the literal characters `(a|b)`, not \"a or b\". Use `field:/.*(a|b).*/` if you need alternation in SQLite.\n\n**A bare `field:*` becomes `IS NOT NULL`**, matching any row where the field has a value regardless of storage class.\n\n### Registering REGEXP\n\nSQLite ships without a `regexp()` function, so regex queries will error at query time unless you register one on your connection.\n\nWith `modernc.org/sqlite`:\n\n```go\nimport (\n    \"database/sql/driver\"\n    \"regexp\"\n\n    \"modernc.org/sqlite\"\n)\n\nfunc init() {\n    sqlite.MustRegisterDeterministicScalarFunction(\n        \"regexp\",\n        2,\n        func(ctx *sqlite.FunctionContext, args []driver.Value) (driver.Value, error) {\n            pattern, ok := args[0].(string)\n            if !ok {\n                return false, nil\n            }\n            value, ok := args[1].(string)\n            if !ok {\n                return false, nil\n            }\n            matched, err := regexp.MatchString(pattern, value)\n            if err != nil {\n                return false, nil\n            }\n            return matched, nil\n        },\n    )\n}\n```\n\nWith `mattn/go-sqlite3`, build with the `sqlite_regex` tag.\n\n## MySQL\n\nMySQL uses backticks for identifiers and doesn't have `SIMILAR TO`, so the MySQL driver routes wildcards through `LIKE ... ESCAPE '#'` and falls back to `REGEXP` when a pattern uses SIMILAR-TO-only constructs (alternation, character classes, grouping):\n\n| Lucene | Postgres | MySQL |\n|---|---|---|\n| `field:value` | `\"field\" = 'value'` | `` `field` = 'value' `` |\n| `field:*` | `\"field\" SIMILAR TO '%'` | `` `field` IS NOT NULL `` |\n| `field:pat*` | `\"field\" SIMILAR TO 'pat%'` | `` `field` LIKE 'pat%' ESCAPE '#' `` |\n| `field:pat?` | `\"field\" SIMILAR TO 'pat_'` | `` `field` LIKE 'pat_' ESCAPE '#' `` |\n| `field:100%*` (literal `%`) | `\"field\" SIMILAR TO '100\\%%'` | `` `field` LIKE '100#%%' ESCAPE '#' `` |\n| `field:*(a\\|b)*` (passed via `expr.LIKE`, see note) | `\"field\" SIMILAR TO '%(a\\|b)%'` | `` `field` REGEXP '^(.*(a\\|b).*)$' `` |\n| `field:/regex/` | `\"field\" ~ 'regex'` | `` `field` REGEXP 'regex' `` |\n| bool literal `true` | `true` | `TRUE` |\n| parameters | `$1, $2, ...` | `?` |\n\n### Things to watch for\n\n**Identifiers always quote with backticks.** Column names containing a backtick are rejected at render time. The always-quote policy handles MySQL 8.0's expanded reserved-word list (`RANK`, `LEAD`, `WINDOW`, `ROWS`, etc.) automatically.\n\n**`LIKE` vs `REGEXP` is decided by pattern content.** Simple Lucene wildcards (`*`, `?`) stay on the index-friendly `LIKE` path. Patterns containing `|`, `()`, `[]`, `{}`, or `+` fall back to `REGEXP` with an anchored `^(...)$` translation so the match semantics line up with Postgres `SIMILAR TO`. A plain capturing group is used rather than `(?:...)` because the non-capturing form is a Perl extension that POSIX ERE (MySQL 5.7 / MariaDB 10.0-10.4) does not guarantee.\n\n**The `ESCAPE '#'` clause is intentional.** Using `#` instead of the default backslash keeps the rendered SQL portable across `sql_mode` settings. Under `NO_BACKSLASH_ESCAPES`, a `\\` escape clause would be reinterpreted and break the LIKE pattern.\n\n**Non-parameterized output is best-effort under `NO_BACKSLASH_ESCAPES`.** `ToMySQL` doubles backslashes in string literals (correct under the default `sql_mode`, portable under `NO_BACKSLASH_ESCAPES` in a harmless-but-literal way). If your server runs with `NO_BACKSLASH_ESCAPES`, use `ToParameterizedMySQL` instead: parameters travel over the wire protocol and bypass string-literal parsing entirely.\n\n**Case sensitivity follows the column collation.** `LIKE` and `REGEXP` honor the operand collation: a `_ci` collation matches case-insensitively, `_bin` matches case-sensitively. The driver can't fix this without parsing column metadata; if you need explicit casing, attach `BINARY` or `COLLATE` in your query.\n\n**Regex engine varies by database and version.** MySQL 5.7 and MariaDB 10.0-10.4 use Henry Spencer POSIX regex. MySQL 8.0+ uses ICU. MariaDB 10.5+ uses PCRE2. Perl-style escapes (`\\d`, `\\w`, `\\s`) work on ICU and PCRE2 but not on Henry Spencer. For portability across all four, use POSIX bracket classes (`[[:digit:]]`, `[[:space:]]`, `[[:alnum:]_]`).\n\n**Booleans render as `TRUE`/`FALSE` in SQL and pass through as Go `bool` as a parameter.** Both forms evaluate to `1`/`0` in MySQL. Against a `TINYINT(1)` column storing values other than 0 or 1, `col = TRUE` won't match those rows because `TRUE` is exactly `1`. That's a MySQL data-modeling quirk, not a driver bug.\n\n### MariaDB\n\nMariaDB uses the same driver. `lucene.ToMariaDB` and `lucene.ToParameterizedMariaDB` are aliases over the MySQL renderers.\n\n```go\nsql, params, err := lucene.ToParameterizedMariaDB(`color:red AND type:\"gala\"`)\n// sql:    (`color` = ?) AND (`type` = ?)\n// params: [\"red\", \"gala\"]\n```\n\nEvery construct the driver emits (backtick quoting, `LIKE ... ESCAPE`, `REGEXP`, `BETWEEN`, `?` placeholders, bool literals) is identical on both databases, and the regex fallback avoids Perl extensions so it runs on every regex engine either database has shipped. No new dialect is needed; the MySQL test suite covers MariaDB by swapping `MYSQL_IMAGE=mariadb:10.x`.\n\n## Custom drivers\n\nTo target a database other than Postgres, SQLite, or MySQL, embed `driver.Base` and supply a `Dialect` that matches your database's semantics. The dialect covers the operators that actually vary between databases (wildcards, regex, standalone `*`, bool literals, string-literal escaping, identifier quoting); the simple operators (`AND`, `OR`, `=`, comparisons, `IN`, `NOT`) are handled by `driver.Base` through the shared `RenderFNs` map.\n\nThe `Dialect` interface has seven methods:\n\n```go\ntype Dialect interface {\n    RenderLike(left, right string, isRegex bool) (string, error)\n    RenderStandaloneWild(left string) (string, error)\n    PrepareLikePattern(pattern string) (transformed string, useRegex bool)\n    EscapeStringLiteral(s string) string\n    SerializeBool(b bool) string\n    BoolParam(b bool) any\n    QuoteColumn(name string) (string, error)\n}\n```\n\nHere's a sketch of a SQL Server dialect. SQL Server uses `[...]` for identifiers, `LIKE` with `%`/`_` for wildcards, no built-in regex:\n\n```go\nimport (\n    \"fmt\"\n    \"strings\"\n\n    \"github.com/grindlemire/go-lucene/pkg/driver\"\n    \"github.com/grindlemire/go-lucene/pkg/lucene/expr\"\n)\n\ntype SQLServerDriver struct {\n    driver.Base\n}\n\nfunc NewSQLServerDriver() SQLServerDriver {\n    fns := map[expr.Operator]driver.RenderFN{}\n    for op, sharedFN := range driver.Shared {\n        fns[op] = sharedFN\n    }\n    return SQLServerDriver{\n        Base: driver.Base{RenderFNs: fns, Dialect: sqlServerDialect{}},\n    }\n}\n\ntype sqlServerDialect struct{}\n\nfunc (sqlServerDialect) RenderLike(left, right string, isRegex bool) (string, error) {\n    if isRegex {\n        return \"\", fmt.Errorf(\"SQL Server has no built-in regex operator\")\n    }\n    return fmt.Sprintf(\"%s LIKE %s\", left, right), nil\n}\n\nfunc (sqlServerDialect) RenderStandaloneWild(left string) (string, error) {\n    return fmt.Sprintf(\"%s IS NOT NULL\", left), nil\n}\n\nfunc (sqlServerDialect) PrepareLikePattern(pattern string) (string, bool) {\n    // SQL Server LIKE uses % and _ like SIMILAR TO; escape literals with [].\n    pattern = strings.ReplaceAll(pattern, \"%\", \"[%]\")\n    pattern = strings.ReplaceAll(pattern, \"_\", \"[_]\")\n    pattern = strings.ReplaceAll(pattern, \"*\", \"%\")\n    pattern = strings.ReplaceAll(pattern, \"?\", \"_\")\n    return pattern, false\n}\n\nfunc (sqlServerDialect) EscapeStringLiteral(s string) string {\n    return \"'\" + strings.ReplaceAll(s, \"'\", \"''\") + \"'\"\n}\n\nfunc (sqlServerDialect) SerializeBool(b bool) string {\n    if b {\n        return \"1\"\n    }\n    return \"0\"\n}\n\nfunc (sqlServerDialect) BoolParam(b bool) any {\n    if b {\n        return 1\n    }\n    return 0\n}\n\nfunc (sqlServerDialect) QuoteColumn(name string) (string, error) {\n    if strings.ContainsRune(name, ']') {\n        return \"\", fmt.Errorf(\"column name contains a right bracket: %q\", name)\n    }\n    return \"[\" + name + \"]\", nil\n}\n```\n\nThe three built-in drivers are the best reference implementations: [`pkg/driver/postgresql.go`](pkg/driver/postgresql.go), [`pkg/driver/sqlite.go`](pkg/driver/sqlite.go), and [`pkg/driver/mysql.go`](pkg/driver/mysql.go).\n\n### Dialect defaults\n\nA driver that leaves `driver.Base.Dialect` unset inherits Postgres-flavored rendering: `SIMILAR TO` for wildcards, `~` for regex, and `true`/`false` for bool literals. Set a `Dialect` on the embedded `Base` whenever your target database diverges from that.\n\n### Migrating from pre-Dialect drivers\n\n`expr.Like`, `expr.Range`, and `expr.Regexp` are no longer dispatched through `RenderFNs`. If a custom driver used to override any of those three through the map, move the logic into a `driver.Dialect` implementation (`RenderLike`, `RenderStandaloneWild`, `PrepareLikePattern`) and set it on `Base.Dialect`. Map entries for those operators are silently ignored.\n\nThe `Dialect` interface evolved across releases: `EscapeLikePattern` was replaced by `PrepareLikePattern` (which additionally returns a `useRegex` flag so dialects can route alternation/grouping patterns to a regex path), and `EscapeStringLiteral` was added so dialects can control string-literal quoting (MySQL needs this to double backslashes under default `sql_mode`).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrindlemire%2Fgo-lucene","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgrindlemire%2Fgo-lucene","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrindlemire%2Fgo-lucene/lists"}