{"id":47678135,"url":"https://github.com/louis77/mulldb","last_synced_at":"2026-04-02T13:39:07.642Z","repository":{"id":340826290,"uuid":"1165513692","full_name":"louis77/mulldb","owner":"louis77","description":"A lightweight SQL database written from scratch in Go as a learning/research project. Speaks the PostgreSQL wire protocol — connect with psql or any PG driver. Supports basic CRUD, persistent WAL storage, and concurrent access.","archived":false,"fork":false,"pushed_at":"2026-02-26T18:03:57.000Z","size":60,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-26T23:46:11.754Z","etag":null,"topics":["database","postgres","postgresql","sql"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/louis77.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-24T08:46:35.000Z","updated_at":"2026-02-26T18:04:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/louis77/mulldb","commit_stats":null,"previous_names":["louis77/mulldb"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/louis77/mulldb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louis77%2Fmulldb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louis77%2Fmulldb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/louis77%2Fmulldb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louis77%2Fmulldb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/louis77","download_url":"https://codeload.github.com/louis77/mulldb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/louis77%2Fmulldb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31307187,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","postgres","postgresql","sql"],"created_at":"2026-04-02T13:39:07.513Z","updated_at":"2026-04-02T13:39:07.617Z","avatar_url":"https://github.com/louis77.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mulldb\n\nA lightweight SQL database written from scratch in Go that speaks the PostgreSQL wire protocol. 
Standard tools like `psql` and any PG-compatible driver work out of the box.\n\nmulldb is designed for correctness and clarity over raw performance — a usable tool for light workloads, not a toy, but not aiming for Postgres-level completeness.\n\n## Table of Contents\n\n- [Features](#features)\n- [Quick Start](#quick-start)\n- [Configuration](#configuration)\n- [SQL Reference](#sql-reference)\n  - [Supported Statements](#supported-statements)\n  - [Character Encoding](#character-encoding)\n  - [Data Types](#data-types)\n  - [Aggregate Functions](#aggregate-functions)\n  - [GROUP BY](#group-by)\n  - [Column Aliases (AS)](#column-aliases-as)\n  - [ORDER BY](#order-by)\n  - [INNER JOIN](#inner-join)\n  - [LIMIT and OFFSET](#limit-and-offset)\n  - [Type Casts](#type-casts)\n  - [Arithmetic Expressions](#arithmetic-expressions)\n  - [String Concatenation](#string-concatenation)\n  - [Scalar Functions](#scalar-functions)\n  - [NEST (Correlated Subquery)](#nest-correlated-subquery)\n  - [Catalog Tables](#catalog-tables)\n  - [Statement Tracing](#statement-tracing)\n  - [WHERE Expressions](#where-expressions)\n  - [Comments](#comments)\n- [Architecture](#architecture)\n  - [Design Principles](#design-principles)\n  - [Concurrency Model](#concurrency-model)\n  - [Persistence](#persistence)\n- [WAL Migration](#wal-migration)\n- [Project Structure](#project-structure)\n- [Testing](#testing)\n- [Error Handling](#error-handling)\n- [Compatibility No-Ops](#compatibility-no-ops)\n- [Limitations](#limitations)\n- [License](#license)\n\n## Features\n\n- **PostgreSQL wire protocol (v3)** — connect with `psql`, `pgx`, `node-postgres`, or any PG driver\n- **Persistent storage** — per-table write-ahead log (WAL) files with CRC32 checksums and fsync for crash recovery; DROP TABLE instantly reclaims disk space\n- **SQL support** — CREATE TABLE, DROP TABLE, ALTER TABLE (ADD/DROP COLUMN), INSERT, SELECT (with WHERE, GROUP BY, ORDER BY, LIMIT, OFFSET, column aliases via AS, and INNER JOIN), UPDATE, DELETE\n- **Transactions** — 
`BEGIN`, `COMMIT`, `ROLLBACK` with deferred-execution overlay; writes are buffered until COMMIT, providing READ COMMITTED isolation; crash-safe via WAL begin/commit markers; DDL rejected inside transactions\n- **PRIMARY KEY constraints** — single-column primary keys with uniqueness enforcement, backed by B-tree indexes for O(log n) lookups\n- **NOT NULL constraints** — standalone `NOT NULL` on any column; enforced on INSERT and UPDATE; PRIMARY KEY columns are implicitly NOT NULL\n- **Secondary indexes** — `CREATE [UNIQUE] INDEX [name] ON table(column)` and `DROP INDEX name ON table`; optional index names (auto-generated as `idx_{column}`); table-scoped names; explicit `INDEXED BY \u003cname\u003e` syntax for query acceleration (no automatic index selection); NULL values not indexed (multiple NULLs allowed in UNIQUE indexes per SQL standard)\n- **Aggregate functions** — `COUNT(*)`, `COUNT(col)`, `SUM(col)`, `AVG(col)`, `MIN(col)`, `MAX(col)`\n- **String concatenation** — `||` operator (SQL standard, NULL-propagating) and `CONCAT()` function (PostgreSQL extension, NULL-skipping); implicit type coercion for integers and booleans\n- **Scalar functions** — `LENGTH()` / `CHARACTER_LENGTH()` / `CHAR_LENGTH()`, `OCTET_LENGTH()`, `CONCAT()`, `NOW()`, `VERSION()`, math functions (`ABS`, `ROUND`, `CEIL`/`CEILING`, `FLOOR`, `POWER`/`POW`, `SQRT`, `MOD`), and a registration pattern for adding more\n- **NEST(SELECT ...)** — correlated subquery that collects inner rows into parenthesized text; avoids JOIN + GROUP BY for hierarchical data; supports ORDER BY, LIMIT, OFFSET inside the subquery; optional `FORMAT JSON` (array of objects) and `FORMAT JSONA` (array of arrays) for native JSON output\n- **Data types** — INTEGER (64-bit), FLOAT (64-bit IEEE 754), TEXT, BOOLEAN, TIMESTAMP (UTC), NULL\n- **Type casts** — PostgreSQL-style `expr::type` cast syntax; supports INTEGER, TEXT, BOOLEAN, FLOAT, TIMESTAMP targets; chainable (`expr::text::integer`)\n- **Arithmetic expressions** — `+`, 
`-`, `*`, `/`, `%` (modulo) and unary minus on integers and floats; implicit int→float promotion in mixed arithmetic; works in SELECT, WHERE, INSERT VALUES, and UPDATE SET; NULL propagation and division-by-zero errors follow PostgreSQL semantics\n- **Pattern matching** — `LIKE` / `NOT LIKE` (case-sensitive), `ILIKE` / `NOT ILIKE` (case-insensitive, PostgreSQL extension); `%` matches zero or more characters, `_` matches exactly one Unicode codepoint; `ESCAPE` clause for literal `%`/`_`; NULL propagation\n- **IN predicate** — `IN (v1, v2, ...)` and `NOT IN (v1, v2, ...)`; SQL-standard three-valued NULL logic (NULL LHS → NULL, NULL in list with no match → NULL)\n- **BETWEEN predicate** — `BETWEEN low AND high` and `NOT BETWEEN low AND high`; inclusive bounds; SQL-standard NULL propagation (any NULL operand → NULL); works in WHERE, JOIN ON, and correlated subqueries\n- **Implicit type coercion** — comparisons and IN predicates automatically coerce literals to match column types at compile time (e.g., `WHERE id = '123'` coerces the string to integer); invalid coercions return SQLSTATE `22P02`\n- **WHERE clauses** — comparisons (`=`, `!=`, `\u003c\u003e`, `\u003c`, `\u003e`, `\u003c=`, `\u003e=`), arithmetic (`+`, `-`, `*`, `/`, `%`), `LIKE` / `ILIKE`, `IN` / `NOT IN`, `BETWEEN` / `NOT BETWEEN`, `IS NULL` / `IS NOT NULL`, logical (`AND`, `OR`, `NOT`), parenthesized expressions; NULL comparisons follow SQL standard (any comparison with NULL yields NULL, not true/false)\n- **Full UTF-8 support** — identifiers, string literals, and all data are UTF-8 throughout; no other character encoding exists\n- **Double-quoted identifiers** — use reserved words as identifiers, preserve exact casing (`\"select\"`, `\"Order\"`), Unicode identifiers (`\"café\"`, `\"名前\"`)\n- **WAL migration** — versioned WAL format with opt-in `--migrate` flag and backup preservation\n- **Concurrent access** — per-table locking allows concurrent writes to independent tables; multiple readers can run in 
parallel on any table\n- **Cleartext password authentication** — simple username/password access control\n- **Graceful shutdown** — drains active connections on SIGINT/SIGTERM\n- **SQL comments** — single-line (`--`) and nested block (`/* ... */`) comments\n- **Proper error codes** — PostgreSQL SQLSTATE codes in ErrorResponse messages\n\n## Quick Start\n\n### Build\n\n```bash\ngo build -o mulldb .\n```\n\n### Run\n\n```bash\n./mulldb --port 5433 --datadir ./data --user admin --password secret\n```\n\n### Connect\n\n```bash\npsql -h 127.0.0.1 -p 5433 -U admin\n```\n\n### Try it out\n\n```sql\nCREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active BOOLEAN);\n\nINSERT INTO users (id, name, active) VALUES (1, 'alice', TRUE), (2, 'bob', FALSE);\n\nSELECT * FROM users;\n--  id | name  | active\n-- ----+-------+--------\n--   1 | alice | t\n--   2 | bob   | f\n\nSELECT name FROM users WHERE active = TRUE;\n--  name\n-- -------\n--  alice\n\nUPDATE users SET active = TRUE WHERE id = 2;\n\nDELETE FROM users WHERE id = 1;\n\nDROP TABLE users;\n```\n\n## Configuration\n\nAll options can be set via CLI flags or environment variables. 
Environment variables take precedence over defaults but flags take precedence over environment variables.\n\n| Flag | Env Var | Default | Description |\n|------|---------|---------|-------------|\n| `--port` | `MULLDB_PORT` | `5433` | TCP port to listen on |\n| `--datadir` | `MULLDB_DATADIR` | `./data` | Directory for WAL and data files |\n| `--user` | `MULLDB_USER` | `admin` | Username for authentication |\n| `--password` | `MULLDB_PASSWORD` | *(empty)* | Password for authentication |\n| `--log-level` | `MULLDB_LOG_LEVEL` | `0` | Log verbosity: `0` = off, `1` = log SQL statements with outcome (`OK`/`ERROR`) and row counts |\n| `--migrate` | — | `false` | Migrate WAL file format if needed (see [WAL Migration](#wal-migration)) |\n| `--fsync` | `MULLDB_FSYNC` | `true` | Enable fsync on WAL writes; disable for speed at the risk of data loss on crash |\n\nExample with environment variables:\n\n```bash\nexport MULLDB_PORT=5433\nexport MULLDB_DATADIR=/var/lib/mulldb\nexport MULLDB_USER=myuser\nexport MULLDB_PASSWORD=mypass\nexport MULLDB_LOG_LEVEL=1\n./mulldb\n```\n\n## SQL Reference\n\n### Supported Statements\n\n```sql\n-- Create a table\nCREATE TABLE \u003cname\u003e (\u003ccolumn\u003e \u003ctype\u003e, ...);\nCREATE TABLE \u003cname\u003e (\u003ccolumn\u003e \u003ctype\u003e PRIMARY KEY, ...);  -- with primary key\nCREATE TABLE \u003cname\u003e (\u003ccolumn\u003e \u003ctype\u003e NOT NULL, ...);     -- with not null constraint\n\n-- Drop a table\nDROP TABLE \u003cname\u003e;\n\n-- Alter a table\nALTER TABLE \u003cname\u003e ADD [COLUMN] \u003ccolumn\u003e \u003ctype\u003e;\nALTER TABLE \u003cname\u003e DROP [COLUMN] \u003ccolumn\u003e;\n\n-- Create / drop indexes\nCREATE INDEX [\u003cname\u003e] ON \u003ctable\u003e(\u003ccolumn\u003e);         -- non-unique index\nCREATE UNIQUE INDEX [\u003cname\u003e] ON \u003ctable\u003e(\u003ccolumn\u003e);   -- unique index\nDROP INDEX \u003cname\u003e ON \u003ctable\u003e;\n\n-- Insert one or more rows\nINSERT INTO 
\u003ctable\u003e (\u003ccolumns\u003e) VALUES (\u003cvalues\u003e), (\u003cvalues\u003e);\nINSERT INTO \u003ctable\u003e VALUES (\u003cvalues\u003e);  -- all columns, in order\n\n-- Query rows\nSELECT * FROM \u003ctable\u003e;\nSELECT \u003ccolumns\u003e FROM \u003ctable\u003e WHERE \u003ccondition\u003e;\nSELECT \u003cexpr\u003e AS \u003calias\u003e, ... FROM \u003ctable\u003e;  -- column aliases\nSELECT id, 'tag', 42 FROM \u003ctable\u003e;          -- literals in column list\nSELECT * FROM \u003ctable\u003e ORDER BY \u003ccol\u003e [ASC|DESC], ...;  -- sorted results\nSELECT * FROM \u003ctable\u003e ORDER BY \u003ccol\u003e LIMIT \u003cn\u003e;       -- sorted + limited\nSELECT \u003ccols\u003e FROM \u003ct1\u003e JOIN \u003ct2\u003e ON \u003ccondition\u003e;            -- inner join\nSELECT \u003ccols\u003e FROM \u003ct1\u003e a INNER JOIN \u003ct2\u003e b ON a.id = b.fk;  -- with aliases\nSELECT \u003ccols\u003e FROM \u003ct1\u003e a, \u003ct2\u003e b WHERE a.id = b.fk;         -- implicit cross-join\nSELECT * FROM \u003ctable\u003e INDEXED BY \u003cindex\u003e WHERE \u003ccol\u003e = \u003cval\u003e;  -- use named index\nSELECT * FROM \u003ctable\u003e LIMIT \u003cn\u003e;             -- return at most n rows\nSELECT * FROM \u003ctable\u003e OFFSET \u003cn\u003e;            -- skip first n rows\nSELECT * FROM \u003ctable\u003e LIMIT \u003cn\u003e OFFSET \u003cm\u003e;  -- pagination\n\n-- Type casts\nSELECT col::INTEGER FROM \u003ctable\u003e;\nSELECT col::TEXT FROM \u003ctable\u003e;\n\n-- Arithmetic expressions\nSELECT 1 + 2;\nSELECT col * 2 + 1 FROM \u003ctable\u003e;\nSELECT * FROM \u003ctable\u003e WHERE price * qty \u003e 100;\nINSERT INTO \u003ctable\u003e VALUES (1 + 2, -5);\n\n-- Static SELECT (no table required)\nSELECT 1;\nSELECT 1, 'hello', TRUE, NULL;\nSELECT VERSION();\n\n-- Aggregate queries (returns a single row)\nSELECT COUNT(*) FROM \u003ctable\u003e;\nSELECT COUNT(\u003ccolumn\u003e) FROM \u003ctable\u003e;\nSELECT SUM(\u003ccolumn\u003e) 
FROM \u003ctable\u003e;\nSELECT MIN(\u003ccolumn\u003e) FROM \u003ctable\u003e;\nSELECT MAX(\u003ccolumn\u003e) FROM \u003ctable\u003e;\nSELECT COUNT(*), SUM(\u003ccolumn\u003e), AVG(\u003ccolumn\u003e), MIN(\u003ccolumn\u003e), MAX(\u003ccolumn\u003e) FROM \u003ctable\u003e;\nSELECT COUNT(*) FROM \u003ctable\u003e WHERE \u003cpk_col\u003e = \u003cval\u003e;                        -- uses PK index\nSELECT COUNT(*) FROM \u003ctable\u003e INDEXED BY \u003cindex\u003e WHERE \u003ccol\u003e = \u003cval\u003e;        -- uses named index\n\n-- Update rows\nUPDATE \u003ctable\u003e SET \u003ccolumn\u003e = \u003cvalue\u003e, ... WHERE \u003ccondition\u003e;\nUPDATE \u003ctable\u003e INDEXED BY \u003cindex\u003e SET \u003ccolumn\u003e = \u003cvalue\u003e WHERE \u003ccol\u003e = \u003cval\u003e;  -- use named index\nUPDATE \u003ctable\u003e SET \u003ccolumn\u003e = \u003cvalue\u003e;  -- all rows\n\n-- Delete rows\nDELETE FROM \u003ctable\u003e WHERE \u003ccondition\u003e;\nDELETE FROM \u003ctable\u003e INDEXED BY \u003cindex\u003e WHERE \u003ccol\u003e = \u003cval\u003e;  -- use named index\nDELETE FROM \u003ctable\u003e;  -- all rows\n\n-- Transaction control\nBEGIN;                -- start a transaction (writes are buffered until COMMIT)\nCOMMIT;              -- apply all buffered changes atomically\nROLLBACK;            -- discard all buffered changes\n```\n\n### Character Encoding\n\nmulldb uses **UTF-8 exclusively** — there is no encoding configuration and no other character set. All layers handle UTF-8 natively:\n\n- **Identifiers** — table and column names can contain any Unicode letter (`café`, `名前`, `αβγ`), both unquoted and double-quoted\n- **String literals** — `'München'`, `'東京'`, `'hello 🌍'` all work as expected\n- **Storage and WAL** — strings are stored as raw UTF-8 bytes with byte-length prefixes\n- **Wire protocol** — UTF-8 bytes are sent as-is over the PostgreSQL wire protocol, which is encoding-aware\n\nString comparison is **binary** (byte-order). 
There is no locale-aware collation — `'a' \u003c 'b'` works, but locale-specific sort orders (e.g. German `ä` sorting with `a`) are not supported.\n\n### Data Types\n\n| Type | Go representation | Description |\n|------|------------------|-------------|\n| `INTEGER` | `int64` | 64-bit signed integer (aliases: `INT`, `INT2`, `INT4`, `INT8`, `SMALLINT`, `BIGINT`) |\n| `FLOAT` | `float64` | 64-bit IEEE 754 double-precision floating point (alias: `DOUBLE PRECISION`) |\n| `TEXT` | `string` | Variable-length UTF-8 string |\n| `BOOLEAN` | `bool` | `TRUE` or `FALSE` |\n| `TIMESTAMP` | `time.Time` | UTC timestamp with microsecond precision (aliases: `TIMESTAMPTZ`, `TIMESTAMP WITH TIME ZONE`) |\n| `NULL` | `nil` | Absence of a value (any column) |\n\n**TIMESTAMP details.** All timestamps are stored as UTC — there is no timezone configuration or session timezone. Input strings with timezone offsets are converted to UTC on insert. Accepted input formats:\n\n- `'2024-01-15 10:30:00'` — assumed UTC\n- `'2024-01-15T10:30:00Z'` — ISO 8601\n- `'2024-01-15T10:30:00+02:00'` — converted to UTC\n- `'2024-01-15'` — midnight UTC\n\nOutput format is always `2024-01-15 10:30:00+00`. The `NOW()` function returns the current UTC timestamp.\n\n### Aggregate Functions\n\nAggregate functions collapse all matching rows into a single result row. Multiple aggregates can appear in the same `SELECT`. Mixing aggregate and non-aggregate columns in the same `SELECT` is an error (SQLSTATE `42803`) — use `GROUP BY` to aggregate per group instead.\n\nAggregate queries support index acceleration: primary key lookups are automatic when the WHERE clause is a simple PK equality, and secondary indexes can be used via `INDEXED BY \u003cname\u003e`. 
Without an applicable index, aggregates fall back to a full table scan.\n\n| Function | Argument | Returns | Description |\n|----------|----------|---------|-------------|\n| `COUNT(*)` | — | `INTEGER` | Count of all rows |\n| `COUNT(col)` | any column | `INTEGER` | Count of non-NULL values in `col` |\n| `SUM(col)` | `INTEGER` or `FLOAT` column | same as `col` | Sum of all non-NULL values |\n| `AVG(col)` | `INTEGER` or `FLOAT` column | `FLOAT` | Average of all non-NULL values; NULL if no rows |\n| `MIN(col)` | `INTEGER`, `FLOAT`, `TEXT`, or `TIMESTAMP` column | same as `col` | Smallest non-NULL value |\n| `MAX(col)` | `INTEGER`, `FLOAT`, `TEXT`, or `TIMESTAMP` column | same as `col` | Largest non-NULL value |\n\nFunction names are case-insensitive (`sum`, `Sum`, `SUM` all work).\n\n**Examples:**\n\n```sql\nCREATE TABLE orders (amount INTEGER, status TEXT);\nINSERT INTO orders VALUES (10, 'paid'), (25, 'paid'), (5, 'pending'), (40, 'paid');\n\nSELECT COUNT(*) FROM orders;\n--  count\n-- -------\n--      4\n\nSELECT SUM(amount) FROM orders;\n--  sum\n-- -----\n--   80\n\nSELECT AVG(amount) FROM orders;\n--  avg\n-- -----\n--   20\n\nSELECT MIN(amount), MAX(amount) FROM orders;\n--  min | max\n-- -----+-----\n--    5 |  40\n\nSELECT COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount) FROM orders;\n--  count | sum | avg | min | max\n-- -------+-----+-----+-----+-----\n--      4 |  80 |  20 |   5 |  40\n```\n\n### GROUP BY\n\n`GROUP BY` partitions rows into groups based on one or more columns, then applies aggregate functions to each group independently. Non-aggregate columns in `SELECT` must appear in the `GROUP BY` clause (SQLSTATE `42803`).\n\nSupports `WHERE` (pre-grouping filter), `ORDER BY`, `LIMIT`, and `OFFSET`. NULLs are grouped together per the SQL standard. `HAVING` is not yet supported. 
GROUP BY with JOINs returns SQLSTATE `0A000`.\n\n**Examples:**\n\n```sql\nCREATE TABLE sales (category TEXT, region TEXT, amount INTEGER);\nINSERT INTO sales VALUES ('A', 'east', 10), ('A', 'west', 20), ('B', 'east', 30), ('A', 'east', 40);\n\nSELECT category, SUM(amount) FROM sales GROUP BY category ORDER BY category;\n--  category | sum\n-- ----------+-----\n--  A        |  70\n--  B        |  30\n\nSELECT category, region, COUNT(*) FROM sales GROUP BY category, region ORDER BY category, region;\n--  category | region | count\n-- ----------+--------+-------\n--  A        | east   |     2\n--  A        | west   |     1\n--  B        | east   |     1\n\n-- GROUP BY without aggregates returns distinct groups:\nSELECT category FROM sales GROUP BY category ORDER BY category;\n--  category\n-- ----------\n--  A\n--  B\n```\n\n### Column Aliases (AS)\n\nAny column expression in a `SELECT` can be renamed with `AS \u003calias\u003e`. This works with plain columns, aggregate functions, and static expressions.\n\n**Examples:**\n\n```sql\nSELECT name AS username, id AS user_id FROM users;\n--  username | user_id\n-- ----------+---------\n--  alice    |       1\n\nSELECT COUNT(*) AS total FROM orders;\n--  total\n-- -------\n--      4\n\nSELECT COUNT(*) AS n, SUM(amount) AS total FROM orders;\n--  n | total\n-- ---+-------\n--  4 |    80\n\nSELECT 1 AS num, 'hello' AS greeting;\n--  num | greeting\n-- -----+----------\n--    1 | hello\n```\n\n### ORDER BY\n\n`ORDER BY` sorts the result set by one or more columns. Each column can specify `ASC` (ascending, the default) or `DESC` (descending). Multi-column sorts compare left-to-right — the second column only matters when the first column has equal values.\n\nNULL values always sort last, regardless of sort direction.\n\nORDER BY is applied before LIMIT and OFFSET, making it possible to get deterministic paginated results. ORDER BY is not supported with aggregate queries without GROUP BY. 
With GROUP BY, ORDER BY works on the grouped result columns.\n\n**Examples:**\n\n```sql\nCREATE TABLE scores (id INTEGER PRIMARY KEY, name TEXT, score INTEGER);\nINSERT INTO scores VALUES (1, 'alice', 90), (2, 'bob', 70), (3, 'charlie', 90), (4, 'dave', NULL);\n\nSELECT * FROM scores ORDER BY score;\n--  id |  name   | score\n-- ----+---------+-------\n--   2 | bob     |    70\n--   1 | alice   |    90\n--   3 | charlie |    90\n--   4 | dave    |\n\nSELECT * FROM scores ORDER BY score DESC, name;\n--  id |  name   | score\n-- ----+---------+-------\n--   1 | alice   |    90\n--   3 | charlie |    90\n--   2 | bob     |    70\n--   4 | dave    |\n\nSELECT * FROM scores ORDER BY score LIMIT 2;\n--  id | name  | score\n-- ----+-------+-------\n--   2 | bob   |    70\n--   1 | alice |    90\n\nSELECT * FROM scores ORDER BY score LIMIT 2 OFFSET 1;\n--  id |  name   | score\n-- ----+---------+-------\n--   1 | alice   |    90\n--   3 | charlie |    90\n```\n\n### INNER JOIN\n\n`JOIN` (or `INNER JOIN`) combines rows from two or more tables based on a related column. Only rows that satisfy the `ON` condition are included in the result. Tables can be aliased for shorter qualified column references (`table.column`).\n\nUnqualified column names work if the column name is unique across all joined tables. If it appears in multiple tables, qualify it with the table name or alias.\n\nMultiple joins can be chained: `FROM t1 JOIN t2 ON ... JOIN t3 ON ...`\n\nImplicit cross-joins are also supported via comma-separated tables in the `FROM` clause: `FROM t1 a, t2 b WHERE a.id = b.id`. 
This is equivalent to a cross-join filtered by the `WHERE` clause.\n\n**Examples:**\n\n```sql\nCREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);\nINSERT INTO orders VALUES (1, 'alice'), (2, 'bob');\n\nCREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, product TEXT, qty INTEGER);\nINSERT INTO items VALUES (10, 1, 'widget', 5), (11, 1, 'gadget', 3), (12, 2, 'widget', 1);\n\nSELECT o.id, o.customer, i.product, i.qty\nFROM orders o\nJOIN items i ON o.id = i.order_id;\n--  id | customer | product | qty\n-- ----+----------+---------+-----\n--   1 | alice    | widget  |   5\n--   1 | alice    | gadget  |   3\n--   2 | bob      | widget  |   1\n\nSELECT o.id, i.product\nFROM orders o\nINNER JOIN items i ON o.id = i.order_id\nWHERE i.qty \u003e 1\nORDER BY i.product;\n--  id | product\n-- ----+---------\n--   1 | gadget\n--   1 | widget\n```\n\n### LIMIT and OFFSET\n\n`LIMIT` restricts the number of rows returned; `OFFSET` skips rows before returning. Both are optional and can appear in either order. Without `ORDER BY`, the order of rows is undefined.\n\n**Examples:**\n\n```sql\nCREATE TABLE items (id INTEGER, name TEXT);\nINSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');\n\nSELECT * FROM items LIMIT 3;\n-- Returns 3 rows\n\nSELECT * FROM items OFFSET 2;\n-- Skips 2 rows, returns the remaining 3\n\nSELECT * FROM items LIMIT 2 OFFSET 1;\n-- Skips 1 row, then returns the next 2\n\nSELECT * FROM items LIMIT 0;\n-- Returns 0 rows (valid)\n\nSELECT * FROM items OFFSET 100;\n-- Returns 0 rows (offset beyond row count)\n\nSELECT * FROM items WHERE id \u003e 1 LIMIT 2;\n-- LIMIT applies after WHERE filtering\n```\n\n### Type Casts\n\nThe PostgreSQL-style `::` cast operator converts a value to a target type. 
It binds tighter than any other operator and can be chained.\n\n```sql\nSELECT 42::TEXT;           -- '42'\nSELECT '123'::INTEGER;     -- 123\nSELECT 1::BOOLEAN;         -- true\nSELECT 3.14::INTEGER;      -- 3\n\n-- Works in SELECT, WHERE, and with column references:\nSELECT reltuples::int8 AS count FROM pg_class WHERE relname = 'users';\n```\n\nSupported target types: `INTEGER` (and aliases `INT`, `INT8`, `BIGINT`, etc.), `TEXT`, `BOOLEAN`, `FLOAT`, `TIMESTAMP`.\n\n### Arithmetic Expressions\n\nArithmetic operators `+`, `-`, `*`, `/`, `%` (modulo) and unary minus are supported in SELECT columns, WHERE conditions, INSERT VALUES, and UPDATE SET clauses. Arithmetic works on both integers (64-bit signed) and floats (64-bit IEEE 754). When one operand is integer and the other is float, the integer is implicitly promoted to float. Division and modulo by zero return SQLSTATE `22012`.\n\nOperator precedence follows standard math rules: unary minus binds tightest, then `*` / `/` / `%`, then `+` / `-`, then comparisons, then logical operators.\n\nNULL propagation: any arithmetic with a NULL operand yields NULL.\n\n**Examples:**\n\n```sql\nSELECT 1 + 2;\n--  ?column?\n-- ----------\n--         3\n\nSELECT 2 + 3 * 4;\n--  ?column?\n-- ----------\n--        14\n\nSELECT -42;\n--  ?column?\n-- ----------\n--       -42\n\nCREATE TABLE items (price INTEGER, qty INTEGER);\nINSERT INTO items VALUES (10, 5), (20, 3);\n\nSELECT price * qty AS total FROM items;\n--  total\n-- -------\n--     50\n--     60\n\nSELECT * FROM items WHERE price * qty \u003e 50;\n--  price | qty\n-- -------+-----\n--     20 |   3\n\nINSERT INTO items VALUES (1 + 2, 10);\n-- Inserts (3, 10)\n\nSELECT 10 / 3;   -- integer division → 3\nSELECT 10 % 3;   -- modulo → 1\nSELECT NULL + 1;  -- NULL (null propagation)\nSELECT 1 / 0;     -- ERROR: division by zero (SQLSTATE 22012)\n```\n\n### String Concatenation\n\nThe `||` operator concatenates two values into a text string. 
At least one operand must be TEXT; the other is implicitly coerced (integers become their decimal representation, booleans become `\"true\"` or `\"false\"`). Two non-text operands produce an error (SQLSTATE `42883`). If either operand is NULL, the result is NULL (SQL standard behavior).\n\nThe `CONCAT()` function is an alternative that treats NULL as empty string — see [Scalar Functions](#scalar-functions).\n\n**Examples:**\n\n```sql\nSELECT 'hello' || ' ' || 'world';\n--  ?column?\n-- -------------\n--  hello world\n\nSELECT 'count: ' || 42;\n--  ?column?\n-- -----------\n--  count: 42\n\nSELECT 'active: ' || TRUE;\n--  ?column?\n-- ---------------\n--  active: true\n\nSELECT 'hello' || NULL;\n--  ?column?\n-- ----------\n--  (NULL)\n\nSELECT 1 || 2;  -- ERROR: operator || is not defined (42883)\n```\n\n### Scalar Functions\n\nScalar functions return a single value per row. They can be used in `SELECT` columns (with or without `FROM`) and in `WHERE` clauses.\n\n| Function | Arguments | Returns | Description |\n|----------|-----------|---------|-------------|\n| `LENGTH(text)` | 1 TEXT | `INTEGER` | Number of characters (Unicode code points, not bytes) |\n| `CHARACTER_LENGTH(text)` | 1 TEXT | `INTEGER` | SQL-standard alias for `LENGTH()` |\n| `CHAR_LENGTH(text)` | 1 TEXT | `INTEGER` | SQL-standard alias for `LENGTH()` |\n| `OCTET_LENGTH(text)` | 1 TEXT | `INTEGER` | Number of bytes (UTF-8 encoded length) |\n| `CONCAT(arg, ...)` | 1+ any | `TEXT` | Concatenates all arguments as text; NULLs are skipped (treated as empty string); never returns NULL |\n| `ABS(x)` | 1 numeric | same as input | Absolute value (preserves int/float type) |\n| `ROUND(x)` | 1 numeric | `FLOAT` | Round to nearest integer |\n| `ROUND(x, n)` | 2 numeric | `FLOAT` | Round to `n` decimal places |\n| `CEIL(x)` / `CEILING(x)` | 1 numeric | `FLOAT` | Smallest integer not less than `x` |\n| `FLOOR(x)` | 1 numeric | `FLOAT` | Largest integer not greater than `x` |\n| `POWER(x, y)` / `POW(x, y)` | 2 
numeric | `FLOAT` | `x` raised to the power `y` |\n| `SQRT(x)` | 1 numeric | `FLOAT` | Square root (error on negative input, SQLSTATE `2201F`) |\n| `MOD(x, y)` | 2 numeric | same as input | Modulo (error on `y=0`, SQLSTATE `22012`) |\n| `COALESCE(val, ...)` | 1+ any | same as first non-NULL | Returns the first non-NULL value from its arguments; returns NULL if all arguments are NULL |\n| `NOW()` | 0 | `TIMESTAMP` | Current UTC timestamp |\n| `VERSION()` | 0 | `TEXT` | PostgreSQL-compatible version string identifying the mulldb build |\n\nFunction names are case-insensitive. A NULL argument yields NULL, except for `CONCAT()` and `COALESCE()`, which handle NULLs as described in the table.\n\n**Examples:**\n\n```sql\nSELECT LENGTH('hello');\n--  length\n-- --------\n--       5\n\nSELECT LENGTH('héllo');  -- counts characters, not bytes\n--  length\n-- --------\n--       5\n\nSELECT CHARACTER_LENGTH('hello');  -- SQL-standard name\n--  length\n-- --------\n--       5\n\nCREATE TABLE t (name TEXT);\nINSERT INTO t VALUES ('hi'), ('hello'), ('hey');\n\nSELECT name, LENGTH(name) FROM t;\n--  name  | length\n-- -------+--------\n--  hi    |      2\n--  hello |      5\n--  hey   |      3\n\nSELECT * FROM t WHERE LENGTH(name) \u003e 3;\n--  name\n-- -------\n--  hello\n\nSELECT VERSION();\n--                           version\n-- ----------------------------------------------------------\n--  PostgreSQL 15.0 (mulldb dev, commit abc1234, built ...)\n```\n\nCalling an unknown function returns SQLSTATE `42883`. 
Calling a function with the wrong number of arguments or wrong type also returns `42883`.\n\n**COALESCE examples:**\n\n```sql\nSELECT COALESCE(NULL, 'a', 'b');\n--  coalesce\n-- ----------\n--  a\n\nSELECT COALESCE(1, 2, 3);\n--  coalesce\n-- ----------\n--         1\n\nSELECT COALESCE(NULL, NULL);\n--  coalesce\n-- ----------\n--  (NULL)\n\nCREATE TABLE t (a TEXT, b TEXT);\nINSERT INTO t VALUES ('first', 'second'), (NULL, 'fallback');\n\nSELECT COALESCE(a, b) FROM t;\n--  coalesce\n-- ----------\n--  first\n--  fallback\n```\n\n### NEST (Correlated Subquery)\n\n`NEST(SELECT ...)` wraps a correlated subquery that collects inner rows into a parenthesized text format, embedded directly in each outer row. This avoids the flatten-then-reaggregate pattern of JOIN + GROUP BY.\n\n```sql\nCREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT);\nCREATE TABLE addresses (id INTEGER PRIMARY KEY, name_id INTEGER, address TEXT);\nINSERT INTO names VALUES (1, 'Louis'), (2, 'Alice');\nINSERT INTO addresses VALUES (1, 1, '123 Main St'), (2, 1, '456 Oak Ave'), (3, 2, '789 Elm St');\n\nSELECT n.id, n.name, NEST(SELECT a.address FROM addresses a WHERE a.name_id = n.id) AS addrs\nFROM names n;\n--  id | name  | addrs\n-- ----+-------+------------------------------\n--   1 | Louis | (123 Main St, 456 Oak Ave)\n--   2 | Alice | (789 Elm St)\n```\n\nMulti-column inner SELECT produces nested tuples:\n\n```sql\nNEST(SELECT street, city FROM addresses a WHERE a.name_id = n.id)\n-- ((123 Main St, Springfield), (456 Oak Ave, Shelbyville))\n```\n\n#### FORMAT JSON / FORMAT JSONA\n\nAn optional `FORMAT` clause before the closing parenthesis controls the output format. 
Without `FORMAT`, the default parenthesized text format is used.\n\n`FORMAT JSON` returns a JSON array of objects, with column names as keys:\n\n```sql\nSELECT n.name, NEST(SELECT a.address FROM addresses a WHERE a.name_id = n.id FORMAT JSON) AS addrs\nFROM names n WHERE n.id = 1;\n-- Louis | [{\"address\":\"123 Main St\"},{\"address\":\"456 Oak Ave\"}]\n\nNEST(SELECT street, city FROM addresses a WHERE a.name_id = n.id FORMAT JSON)\n-- [{\"street\":\"123 Main St\",\"city\":\"Springfield\"},{\"street\":\"456 Oak Ave\",\"city\":\"Shelbyville\"}]\n```\n\n`FORMAT JSONA` returns a JSON array of arrays (positional, no column names):\n\n```sql\nNEST(SELECT a.address FROM addresses a WHERE a.name_id = n.id FORMAT JSONA)\n-- [[\"123 Main St\"],[\"456 Oak Ave\"]]\n\nNEST(SELECT street, city FROM addresses a WHERE a.name_id = n.id FORMAT JSONA)\n-- [[\"123 Main St\",\"Springfield\"],[\"456 Oak Ave\",\"Shelbyville\"]]\n```\n\nJSON type mapping: integers and floats become JSON numbers, booleans become JSON booleans, strings become JSON strings, timestamps become ISO 8601 strings, and NULL becomes JSON `null`. When no inner rows match, the result is SQL NULL in every format, including the default text format.\n\nThe inner SELECT supports `WHERE` (correlated or uncorrelated), `ORDER BY`, `LIMIT`, and `OFFSET`. Column references in the inner WHERE can be qualified with outer table aliases (e.g. `n.id`) to correlate with the outer row.\n\n**Restrictions:** The inner SELECT must have a `FROM` clause and cannot use JOINs, GROUP BY, or nested NEST. NEST is not supported in WHERE clauses. The result is sent as TEXT over the wire.\n\n### Catalog Tables\n\nmulldb exposes virtual catalog tables that mimic PostgreSQL system catalogs. These are read-only; `INSERT`, `UPDATE`, and `DELETE` return an error (SQLSTATE `42809`).\n\nTables can be accessed with or without schema qualification. Unqualified names check `pg_catalog` first (matching PostgreSQL behavior). 
`information_schema` tables require explicit schema qualification.\n\n| Table | Columns | Description |\n|-------|---------|-------------|\n| `pg_type` / `pg_catalog.pg_type` | `oid` (INTEGER), `typname` (TEXT) | Type information for supported data types |\n| `pg_database` / `pg_catalog.pg_database` | `datname` (TEXT) | Database names (always returns `mulldb`) |\n| `pg_namespace` / `pg_catalog.pg_namespace` | `oid` (INTEGER), `nspname` (TEXT) | Schema/namespace information (`pg_catalog`, `public`, `information_schema`) |\n| `pg_class` / `pg_catalog.pg_class` | `oid` (INTEGER), `relname` (TEXT), `relnamespace` (INTEGER), `relkind` (TEXT), `reltuples` (INTEGER) | Table/view metadata with row counts; joinable with `pg_namespace` on `oid = relnamespace` |\n| `information_schema.tables` | `table_schema` (TEXT), `table_name` (TEXT), `table_type` (TEXT) | Lists all user tables and system catalog tables |\n| `information_schema.columns` | `table_schema` (TEXT), `table_name` (TEXT), `column_name` (TEXT), `ordinal_position` (INTEGER), `data_type` (TEXT), `is_nullable` (TEXT) | Column metadata for all tables |\n| `information_schema.table_constraints` | `constraint_catalog` (TEXT), `constraint_schema` (TEXT), `constraint_name` (TEXT), `table_catalog` (TEXT), `table_schema` (TEXT), `table_name` (TEXT), `constraint_type` (TEXT), `is_deferrable` (TEXT), `initially_deferred` (TEXT) | PRIMARY KEY and UNIQUE constraints |\n| `information_schema.key_column_usage` | `constraint_catalog` (TEXT), `constraint_schema` (TEXT), `constraint_name` (TEXT), `table_catalog` (TEXT), `table_schema` (TEXT), `table_name` (TEXT), `column_name` (TEXT), `ordinal_position` (INTEGER) | Columns participating in constraints |\n\n**Examples:**\n\n```sql\nSELECT * FROM pg_type;\nSELECT * FROM pg_catalog.pg_type;  -- same result\n\nSELECT table_name, table_type FROM information_schema.tables WHERE table_schema = 'public';\n--  table_name | table_type\n-- ------------+------------\n--  users      | BASE 
TABLE\n--  orders     | BASE TABLE\n\nSELECT column_name, data_type, is_nullable FROM information_schema.columns WHERE table_name = 'users';\n--  column_name | data_type | is_nullable\n-- -------------+-----------+-------------\n--  id          | integer   | NO\n--  name        | text      | YES\n--  active      | boolean   | YES\n```\n\n### Statement Tracing\n\nmulldb has built-in statement tracing for diagnosing query performance. Tracing is per-connection and off by default.\n\n```sql\nSET trace = on;   -- enable tracing\nSET trace = off;  -- disable tracing\n```\n\nWhen tracing is enabled, every statement records timing and metadata. Use `SHOW TRACE` to inspect the last statement's trace:\n\n```sql\nSET trace = on;\nSELECT * FROM users WHERE id = 1;\nSHOW TRACE;\n--  step          | duration\n-- ---------------+----------\n--  Parse         | 12.5µs\n--  Plan          | 3.2µs\n--  Execute       | 1.1µs\n--  Total         | 16.8µs\n--  Statement     | SELECT\n--  Table         | users\n--  Rows Scanned  | 1\n--  Rows Returned | 1\n--  Used Index    | PRIMARY\n```\n\nFor JOIN queries, the trace includes additional timing:\n\n```sql\nSET trace = on;\nSELECT o.id, i.product FROM orders o JOIN items i ON o.id = i.order_id ORDER BY o.id;\nSHOW TRACE;\n--  step          | duration\n-- ---------------+----------\n--  Parse         | 18.3µs\n--  Plan          | 5.1µs\n--  Execute       | 42.7µs\n--  Sort          | 2.4µs\n--  Join Loop     | 31.5µs\n--  Total         | 66.1µs\n--  Statement     | SELECT\n--  Table         | orders\n--  Rows Scanned  | 6\n--  Rows Returned | 3\n```\n\n### Fsync Control\n\nBy default, every WAL write is followed by `fsync(2)` to guarantee crash durability. 
For bulk loading or development, you can disable fsync at runtime for significantly faster writes — at the risk of data loss if the process crashes.\n\n```sql\nSET fsync = off;   -- disable fsync (faster writes, less durable)\nSET fsync = on;    -- re-enable fsync (default)\nSHOW FSYNC;        -- show current setting\n--  fsync\n-- -------\n--  on\n```\n\nThe initial default can also be set via the `--fsync` CLI flag or `MULLDB_FSYNC` environment variable.\n\n### Memory Introspection\n\n`SHOW MEMORY` reports per-table and per-index memory usage:\n\n```sql\nSHOW MEMORY;\n--  table  |    type      |   name   | size_bytes | size_human\n-- --------+--------------+----------+------------+------------\n--  users  | table        | users    |     102400 | 100.0 KB\n--  users  | pk_index     | id       |       8192 | 8.0 KB\n--  users  | index        | idx_name |       4096 | 4.0 KB\n--         | total        |          |     114688 | 112.0 KB\n```\n\n### WHERE Expressions\n\n- **Comparisons**: `=`, `!=`, `\u003c\u003e`, `\u003c`, `\u003e`, `\u003c=`, `\u003e=`\n- **Pattern matching**: `LIKE`, `NOT LIKE`, `ILIKE`, `NOT ILIKE`, `ESCAPE`\n- **IN predicate**: `IN (v1, v2, ...)`, `NOT IN (v1, v2, ...)`\n- **BETWEEN predicate**: `BETWEEN low AND high`, `NOT BETWEEN low AND high`\n- **Arithmetic**: `+`, `-`, `*`, `/`, `%` (integer and float, with implicit int→float promotion)\n- **Concatenation**: `||` (text, with implicit coercion)\n- **Unary minus**: `-expr`\n- **NULL predicates**: `IS NULL`, `IS NOT NULL`\n- **Logical operators**: `AND`, `OR`, `NOT`\n- **Parentheses**: `(expr)` for grouping\n- **Literals**: integers, floats (`3.14`, `.5`, `1e10`), `'single-quoted strings'`, `TRUE`, `FALSE`, `NULL`\n\n**NULL semantics.** Comparing any value to NULL with `=`, `!=`, `\u003c`, etc. yields NULL (unknown), never true or false — matching the SQL standard. 
Use `IS NULL` and `IS NOT NULL` to test for NULL values.\n\n```sql\nSELECT * FROM t WHERE name IS NULL;       -- rows where name is NULL\nSELECT * FROM t WHERE name IS NOT NULL;   -- rows where name is not NULL\nSELECT * FROM t WHERE name = NULL;        -- always returns 0 rows (standard behavior)\nSELECT * FROM t WHERE NOT active;         -- negate a boolean column\nSELECT * FROM t WHERE NOT (x \u003e 5);        -- negate a comparison\n```\n\n`NOT` on a NULL value yields NULL (the row is excluded). `NOT` can be chained: `NOT NOT active`.\n\n**Pattern matching.** `LIKE` performs case-sensitive pattern matching; `ILIKE` (PostgreSQL extension) is case-insensitive. `%` matches zero or more characters, `_` matches exactly one Unicode codepoint. Use `ESCAPE` to match literal `%` or `_`.\n\n```sql\nSELECT * FROM t WHERE name LIKE 'A%';           -- starts with A\nSELECT * FROM t WHERE name LIKE '_ob';           -- 3 chars ending in ob\nSELECT * FROM t WHERE name NOT LIKE '%test%';    -- does not contain test\nSELECT * FROM t WHERE name ILIKE 'alice%';       -- case-insensitive\nSELECT * FROM t WHERE val LIKE '100\\%' ESCAPE '\\';  -- literal % match\n```\n\nIf either operand is NULL, the result is NULL (the row is excluded).\n\n**IN predicate.** `IN` tests whether a value matches any element in a list. `NOT IN` negates the test. NULL semantics follow SQL standard three-valued logic.\n\n```sql\nSELECT * FROM t WHERE id IN (1, 2, 3);\nSELECT * FROM t WHERE name NOT IN ('Alice', 'Bob');\nSELECT * FROM t WHERE id IN (1 + 1, 4);            -- expressions in list\n```\n\nNULL behavior: if the LHS is NULL, the result is always NULL. If no match is found and the list contains NULL, the result is NULL (not false). This means `NOT IN` with a NULL in the list never returns true for non-matching values — a common SQL gotcha.\n\n**BETWEEN predicate.** `BETWEEN` tests whether a value falls within an inclusive range. `NOT BETWEEN` negates the test. 
If any of the three operands is NULL, the result is NULL.\n\n```sql\nSELECT * FROM t WHERE id BETWEEN 1 AND 10;\nSELECT * FROM t WHERE id NOT BETWEEN 5 AND 15;\nSELECT * FROM t WHERE ts BETWEEN '2024-01-01' AND '2024-12-31';\n```\n\n**Implicit type coercion.** When comparing a column to a literal of a different type, the literal is automatically coerced to the column's type at compile time. This applies to all comparison operators (`=`, `!=`, `\u003c`, `\u003e`, `\u003c=`, `\u003e=`) and `IN` lists. Invalid coercions produce an error with SQLSTATE `22P02`.\n\n```sql\n-- String literal coerced to integer for comparison:\nSELECT * FROM t WHERE id = '123';\n\n-- Works with IN lists too:\nSELECT * FROM t WHERE id IN ('1', '2', '3');\n\n-- Integer literal coerced to text:\nSELECT * FROM t WHERE name = 42;\n\n-- Invalid coercion produces an error:\nSELECT * FROM t WHERE id = 'hello';  -- ERROR: invalid input syntax for type integer: \"hello\"\n```\n\nSupported coercion paths: string→integer, string→float, string→boolean (`true/false/t/f/1/0`), string→timestamp, int→float, float→int (whole numbers only), int→text, float→text, bool→text.\n\nOperator precedence (lowest to highest): `OR` → `AND` → `NOT` → comparisons / `[NOT] LIKE` / `[NOT] ILIKE` / `[NOT] IN` / `[NOT] BETWEEN` / `IS [NOT] NULL` → `+` `-` `||` → `*` `/` `%` → unary `-` → primary.\n\n### Comments\n\nmulldb supports two SQL comment styles:\n\n- **Single-line comments** (`--`): everything from `--` to end of line is ignored\n- **Block comments** (`/* ... */`): delimited blocks are ignored, with nesting support (`/* outer /* inner */ outer */` is valid)\n\nComments are treated as whitespace and can appear anywhere whitespace is allowed. 
Comments inside string literals or quoted identifiers are preserved as literal content.\n\n```sql\nSELECT id -- this is ignored\nFROM users;\n\nSELECT /* inline comment */ name FROM users;\n\n/* This is a\n   multi-line comment */\nSELECT 1;\n\n/* Nested /* comments */ are supported */\nSELECT 1;\n```\n\n## Architecture\n\n```\npsql / PG drivers\n       │ TCP\n       ▼\n┌─────────────────────┐\n│   Network Layer      │  Accept connections, goroutine per connection\n│   (server/)          │\n├─────────────────────┤\n│   PG Wire Protocol   │  Startup handshake, auth, SimpleQuery,\n│   (pgwire/)          │  RowDescription, DataRow, CommandComplete\n├─────────────────────┤\n│   SQL Parser         │  Lexer → tokens → recursive descent → AST\n│   (parser/)          │\n├─────────────────────┤\n│   Query Executor     │  Walk AST, evaluate WHERE, call storage\n│   (executor/)        │\n├─────────────────────┤\n│   Storage Engine     │\n│   (storage/)         │\n│   ├─ Catalog         │  In-memory table schemas (rebuilt from WAL)\n│   ├─ Heap            │  In-memory row data per table\n│   ├─ Index           │  B-tree indexes for primary key columns\n│   └─ WAL             │  Per-table append-only logs for crash recovery\n└─────────────────────┘\n       │\n    Data dir\n    ├── catalog.wal      DDL log (CREATE/DROP TABLE)\n    └── tables/\n        └── \u003cname\u003e.wal   Per-table DML log\n```\n\n### Design Principles\n\n- **Modular via interfaces** — every layer boundary is a Go interface. Packages depend on contracts, never on concrete types from other layers.\n- **No circular dependencies** — dependency flows downward: `server` → `executor` → `parser` + `storage`. `main.go` is the composition root.\n- **Testable in isolation** — each package has unit tests that don't require a running server or real disk.\n- **WAL-first writes** — every mutation is logged to the WAL before being applied to in-memory state. 
On startup, the WAL is replayed to reconstruct the full database.\n\n### Concurrency Model\n\nMultiple clients can connect simultaneously. The server spawns a goroutine per connection (`server/server.go`), and all goroutines share a single stateless executor that forwards calls to the storage engine.\n\n**Per-table locking.** The storage engine (`storage/engine.go`) uses a two-level locking scheme:\n\n- A **catalog lock** (`catalogMu`) protects the table registry. DDL operations (`CreateTable`, `DropTable`) take a write lock; DML operations take a brief read lock to look up the target table, then release it.\n- Each table has its own **table lock** (`tableState.mu`). DML operations (`Insert`, `Update`, `Delete`) take the table's write lock; read operations (`Scan`, `LookupByPK`) take the table's read lock.\n\nThis means writes to different tables can proceed concurrently — inserting into table A does not block inserts into table B.\n\n| Operation | Catalog lock | Table lock |\n|-----------|-------------|------------|\n| `CreateTable` | Write (held throughout) | — |\n| `DropTable` | Write | Write |\n| `Insert`, `Update`, `Delete` | Read (brief) | Write |\n| `Scan`, `LookupByPK` | Read (brief) | Read |\n| `GetTable`, `ListTables` | Read | — |\n\nLock ordering is always catalog before table (never reversed), which prevents deadlocks.\n\n**Snapshot iterators.** `Scan` copies all matching rows into a new slice while the table's read lock is held, then returns an iterator over that private snapshot. The iterator is safe to consume after the lock is released. `LookupByPK` similarly returns a copied row.\n\n**DROP TABLE race guard.** A DML goroutine could grab a `tableState` pointer, release the catalog lock, then find the table was dropped before it acquires the table lock. 
Each `tableState` has a `dropped` flag that DML checks after acquiring the table lock, returning `TableNotFoundError` if set.\n\n**Atomic batch writes.** Multi-row `INSERT`, `UPDATE`, and `DELETE` validate all constraints (PK uniqueness, column count) before writing anything. If validation passes, all affected rows are written as a single WAL entry with one fsync, then applied to the in-memory heap — no partial writes on constraint violation or WAL failure.\n\n### Persistence\n\nEvery write goes through the WAL before being applied in memory:\n\n1. Caller invokes `engine.Insert(...)` (or Update, Delete, etc.)\n2. Engine acquires the table's write lock\n3. WAL entry is written to the table's WAL file and fsynced: `[4-byte length][1-byte op][payload][4-byte CRC32]`\n4. In-memory heap is updated\n5. Lock is released\n\n**Split WAL layout.** The WAL is split into per-table files:\n\n```\n\u003cdataDir\u003e/\n├── catalog.wal          # DDL only: CreateTable / DropTable entries\n└── tables/\n    ├── users.wal        # DML for \"users\" table\n    └── orders.wal       # DML for \"orders\" table\n```\n\nDDL operations (CREATE TABLE, DROP TABLE) are logged to `catalog.wal`. DML operations (INSERT, UPDATE, DELETE) are logged to the individual table's WAL file. This means DROP TABLE can instantly reclaim disk space by deleting the table's WAL file, and concurrent writes to different tables hit different files.\n\nOn startup, `Open()` performs a two-phase replay: first the catalog WAL (to learn table schemas), then each surviving table's WAL (to populate heaps). Orphan WAL files (from a crash during DROP TABLE) are cleaned up automatically.\n\nEach WAL file uses a versioned binary format (`[4-byte magic \"MWAL\"][uint16 version][entries...]`). When the format changes between releases, the `--migrate` flag must be used to upgrade. See [WAL Migration](#wal-migration).\n\n## WAL Migration\n\nThe WAL uses a versioned binary format and a per-table file layout. 
When a new release changes the format or layout, the engine will refuse to start:\n\n```\ndata directory uses legacy single-WAL format; restart with --migrate flag to convert to per-table WAL files\n```\n\nTo migrate, restart with `--migrate`:\n\n```bash\n./mulldb --datadir ./data --migrate\n```\n\nThe `--migrate` flag handles two kinds of migration:\n\n1. **Format version migration** — upgrades the binary entry format (e.g. v1→v2 added primary key flags). The original `wal.dat` is preserved as `wal.dat.bak`.\n2. **Split WAL migration** — converts a legacy single `wal.dat` into the per-table layout (`catalog.wal` + `tables/\u003cname\u003e.wal`). DML entries for dropped tables are discarded, immediately reclaiming space. The original `wal.dat` is preserved as `wal.dat.bak`.\n\nBoth migrations are chained automatically when needed (e.g. a v1 single-WAL file gets format-upgraded first, then split).\n\nAfter verifying the database works correctly, you can manually delete the backup file. The engine will never delete it for you.\n\nIf `--migrate` is passed but no migration is needed, the engine logs an info message and starts normally.\n\n## Project Structure\n\n```\nmulldb/\n├── main.go                 Entry point, signal handling, wiring\n├── go.mod\n├── PLAN.md                 Design document\n├── DESIGN.md               Architecture details and WAL format\n├── STANDARD.md             SQL standard (Core SQL) conformance checklist\n├── CLAUDE.md               Project conventions (AI-assistant facing)\n│\n├── config/\n│   └── config.go           CLI flags + env var parsing\n│\n├── server/\n│   ├── server.go           TCP listener, accept loop, graceful shutdown\n│   └── connection.go       Per-connection lifecycle, query dispatch\n│\n├── pgwire/\n│   ├── protocol.go         PG v3 message types and constants\n│   ├── reader.go           Read PG messages from net.Conn\n│   └── writer.go           Write PG messages to net.Conn\n│\n├── parser/\n│   ├── token.go            
Token types and keywords\n│   ├── lexer.go            Tokenizer (SQL → tokens)\n│   ├── ast.go              AST node types\n│   ├── parser.go           Recursive descent parser (tokens → AST)\n│   └── parser_test.go\n│\n├── executor/\n│   ├── executor.go         Query execution (AST → storage → results)\n│   ├── scalar.go           Scalar function registry and static SELECT evaluation\n│   ├── fn_concat.go        CONCAT() implementation (registers via init())\n│   ├── fn_length.go        LENGTH() / CHARACTER_LENGTH() / CHAR_LENGTH() (registers via init())\n│   ├── fn_math.go          Math functions: ABS, ROUND, CEIL, FLOOR, POWER, SQRT, MOD (registers via init())\n│   ├── fn_now.go           NOW() implementation (registers via init())\n│   ├── fn_version.go       VERSION() implementation (registers via init())\n│   ├── result.go           Result types, QueryError, SQLSTATE mapping\n│   └── executor_test.go\n│\n├── version/\n│   └── version.go          Build-info package; Tag/GitCommit/BuildTime set via -ldflags\n│\n└── storage/\n    ├── types.go            Data types, typed errors, Engine interface\n    ├── catalog.go          In-memory table schema management\n    ├── heap.go             In-memory row storage per table\n    ├── compare.go          Type-aware value comparison\n    ├── timestamp.go        Timestamp parsing and type coercion\n    ├── wal.go              Write-ahead log (write, replay, checksums)\n    ├── wal_migrate.go      WAL format + split-WAL migration framework\n    ├── wal_test.go         WAL migration tests\n    ├── row.go              Binary row encoding/decoding\n    ├── tablefile.go        Table name ↔ filename encoding (percent-encoding)\n    ├── tablefile_test.go\n    ├── engine.go           Per-table WAL engine with per-table locking\n    ├── engine_test.go\n    │\n    └── index/\n        ├── index.go        Index interface\n        └── btree.go        In-memory B-tree index implementation\n```\n\n## Testing\n\nRun the full test 
suite:\n\n```bash\ngo test ./...\n```\n\nRun with the race detector:\n\n```bash\ngo test -race ./...\n```\n\nThe test suite covers:\n- **Parser**: all 9 statement types, WHERE with AND/OR/NOT/precedence, operators, IS NULL / IS NOT NULL, LIKE / NOT LIKE / ILIKE / NOT ILIKE with ESCAPE, IN / NOT IN, arithmetic expressions (+, -, *, /, %, unary minus) with precedence, aggregate and scalar function syntax, column aliases (AS), ORDER BY, INNER JOIN (with aliases, qualified columns, multi-join), implicit cross-join (comma-separated FROM), optional FROM clause, UTF-8 identifiers and string literals, SQL comments (`--` and `/* */` with nesting), error cases\n- **Storage**: CRUD operations, WAL replay across restart, typed errors, concurrent reads and writes, per-table WAL file layout, split WAL migration, orphan cleanup, concurrent writes to independent tables, transaction overlay (insert/update/delete commit and rollback, read-your-own-writes, multi-table commit, PK conflict on commit, isolation between transactions, WAL crash recovery for incomplete transactions)\n- **Executor**: full round-trip (CREATE → INSERT → SELECT → UPDATE → DELETE), arithmetic expressions (static and with FROM, in WHERE, in INSERT VALUES), division/modulo by zero, NULL propagation, aggregate functions (COUNT/SUM/AVG/MIN/MAX), ORDER BY (ASC/DESC, multi-column, NULLs last), LIMIT/OFFSET, column aliases, static SELECT (literals and scalar functions), IS NULL / IS NOT NULL, NOT operator, NULL comparison semantics, IN / NOT IN (integers, text, booleans, timestamps, NULL semantics, UPDATE/DELETE, JOIN), INNER JOIN (basic, aliases, WHERE filter, empty result, SELECT *, ambiguous column errors, ORDER BY, LIMIT/OFFSET), BEGIN/COMMIT/ROLLBACK no-ops, SQLSTATE codes, column resolution, NULL handling\n\n## Error Handling\n\nmulldb returns proper PostgreSQL SQLSTATE codes in ErrorResponse messages:\n\n| SQLSTATE | Condition | Example |\n|----------|-----------|---------|\n| `42601` | Syntax error | 
`FROBNICATE` |\n| `42P01` | Undefined table | `SELECT * FROM nonexistent` |\n| `42P07` | Duplicate table | `CREATE TABLE t (...)` when `t` exists |\n| `42703` | Undefined column | `SELECT bad_col FROM t` |\n| `22023` | Invalid parameter value | Wrong number of INSERT values |\n| `23505` | Unique violation | Inserting a duplicate primary key or unique index value |\n| `42803` | Grouping error | Mixing aggregate and non-aggregate columns |\n| `42809` | Wrong object type | `INSERT INTO pg_type ...` (catalog is read-only) |\n| `42883` | Undefined function | Unknown aggregate function or type mismatch |\n| `22012` | Division by zero | `SELECT 1 / 0` |\n| `42704` | Undefined object | `DROP INDEX nonexistent ON t` |\n| `0A000` | Feature not supported | ORDER BY with aggregates (no GROUP BY) |\n\n## Compatibility No-Ops\n\nSome SQL commands are accepted and silently acknowledged without performing any action. This ensures compatibility with clients like `psql` and PostgreSQL drivers that send these commands automatically.\n\n| Command | Reason |\n|---------|--------|\n| `SET \u003cparam\u003e = \u003cvalue\u003e` | `psql` sends `SET client_encoding`, `SET standard_conforming_strings`, etc. during startup. Only `SET TRACE` and `SET FSYNC` have real effects; all others are acknowledged as no-ops. |\n| `SAVEPOINT \u003cname\u003e` | `psql` sends implicit savepoints when `ON_ERROR_ROLLBACK` is enabled. Accepted but no savepoint is actually created. |\n| `RELEASE SAVEPOINT \u003cname\u003e` | Companion to `SAVEPOINT`. Accepted but no savepoint is released. |\n| `ROLLBACK TO SAVEPOINT \u003cname\u003e` | Companion to `SAVEPOINT`. Accepted but does not roll back to any savepoint — the full transaction state is preserved as-is. |\n\n## Limitations\n\nmulldb is intentionally minimal. 
Things it does **not** support:\n- **Multi-column primary keys** — only single-column PRIMARY KEY is supported\n- **SAVEPOINT** — no savepoints within transactions\n- **SET TRANSACTION** — isolation level is always READ COMMITTED; not configurable\n- **LEFT/RIGHT/FULL OUTER JOINs** — only INNER JOIN is supported\n- **GROUP BY / HAVING**\n- **Decimal arithmetic** — no exact-precision DECIMAL/NUMERIC types; use FLOAT for approximate numeric values\n- **Subqueries**\n- **Extended query protocol** — only SimpleQuery flow\n- **TLS/SSL** — connections are unencrypted (SSL negotiation is refused)\n- **Multiple databases** — single database per instance\n\n## License\n\nMIT License. See [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flouis77%2Fmulldb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flouis77%2Fmulldb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flouis77%2Fmulldb/lists"}