{"id":30808569,"url":"https://github.com/fajarnugraha37/kvstore-ts","last_synced_at":"2026-05-04T00:30:58.505Z","repository":{"id":311362658,"uuid":"1043393780","full_name":"fajarnugraha37/kvstore-ts","owner":"fajarnugraha37","description":"A compact, embeddable key/value store written in TypeScript","archived":false,"fork":false,"pushed_at":"2025-08-24T00:29:35.000Z","size":239,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-16T17:43:53.856Z","etag":null,"topics":["bun","deno","diy","key-value-store","kv-store","lsm-tree","nodejs","persistence","scratch","sstable","typescript","write-ahead-log"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fajarnugraha37.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-23T18:56:59.000Z","updated_at":"2025-08-24T00:29:41.000Z","dependencies_parsed_at":"2025-08-25T22:18:03.062Z","dependency_job_id":null,"html_url":"https://github.com/fajarnugraha37/kvstore-ts","commit_stats":null,"previous_names":["fajarnugraha37/kvstore-ts"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fajarnugraha37/kvstore-ts","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fajarnugraha37%2Fkvstore-ts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fajarnugraha37%2Fkvstore-ts/tags","releases_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fajarnugraha37%2Fkvstore-ts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fajarnugraha37%2Fkvstore-ts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fajarnugraha37","download_url":"https://codeload.github.com/fajarnugraha37/kvstore-ts/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fajarnugraha37%2Fkvstore-ts/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32590065,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T22:12:39.696Z","status":"ssl_error","status_checked_at":"2026-05-03T22:09:10.534Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bun","deno","diy","key-value-store","kv-store","lsm-tree","nodejs","persistence","scratch","sstable","typescript","write-ahead-log"],"created_at":"2025-09-06T03:47:54.027Z","updated_at":"2026-05-04T00:30:58.488Z","avatar_url":"https://github.com/fajarnugraha37.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# kvstore-ts\n\nA compact, embeddable key/value store implemented in TypeScript. 
The repo provides a storage Engine, an\nembeddable convenience wrapper, a CLI for exploration and tooling, and reference implementations of WAL, manifest,\ncompaction, transactions, watches, and a small inverted index used for demo fuzzy/contains searches.\n\nThis README now contains deeper technical details: repository layout, architectural diagrams described in\ntext, technical decisions, detailed dataflows (write/read/tx/compaction/recovery/watch), and a project milestone\nroadmap.\n\n---\n\n## Repository layout\n\nThis project focuses on a compact, single-node durable KV storage engine implemented in TypeScript with a small set of supporting tools. Current top-level folders are the canonical surface for development and tests:\n\n- `src/` — main library entry and core TypeScript modules (engine, storage wiring, CLI adapters).\n- `wal/` — write-ahead log implementation and helpers.\n- `storage/` \n    - SSTable (datafile) writer/reader, manifest, compaction, and storage facade\n    - small manifest utilities and metadata helpers (file metadata tracking).\n    - compactor and policies (background GC / compaction worker).\n    - transaction manager and related helpers.\n- `watch/` — watch/subscription manager and matching utilities.\n- `apps/cli/` — command-line tooling and interactive shell.\n- `apps/embedded/` — embeddable wrapper and examples for embedding `KVStore` in apps.\n- `utils/` — low-level utilities (bytes, checksum helpers, fs wrappers).\n- `data/` — example DB directory used by tests and local runs (not committed in production).\n\nOther top-level files: `package.json`, `tsconfig.json`, `README.md`, `bun.lock` and test fixtures.\n\n---\n\n## Project plan\n\nBelow is a concise, prioritized plan and a tiny contract for the project. 
This replaces the previous long-form\narchitecture notes with an actionable roadmap and component-level responsibilities so the README matches the\nimplementation and the project's current scope.\n\nShort plan\n- Present a prioritized checklist of components (MVP first).\n- For each component: purpose, responsibilities, implementation notes, tests/edge-cases, suggested tech choices.\n- End with recommended next steps and minimal milestone plan.\n\nContract (tiny)\n- Inputs: client requests (put/get/txn/watch/lease/join), cluster messages (raft log entries / RPCs), disk I/O.\n- Outputs: persisted state, linearizable responses, cluster state changes, metrics \u0026 logs.\n- Success criteria: durable single-node correctness, then linearizability across cluster with leader election and recovery.\n- Error modes: disk failure, node crash, network partition, slow IO, inconsistent state after crash (must recover), split-brain.\n\nPrioritized checklist (MVP → extras)\n1. Core storage (single-node durable)\n2. WAL + append-only log + compaction\n3. Simple API server (gRPC/HTTP)\n4. Simple client operations (get/put/delete)\n5. Watch/subscription mechanism\n6. Snapshots \u0026 compaction\n7. Consensus (Raft) and leader state machine\n8. Cluster membership + bootstrapping\n9. Transactions / Compare-And-Swap (MVCC)\n10. Leases / TTL and lease eviction\n11. Security: TLS + auth + RBAC\n12. Metrics, health, and diagnostics\n13. Backup/restore, defragmentation\n14. Stress tests, chaos/fault-injection tests\n15. CLI tooling and admin APIs\n16. 
Production hardening (observability, tuning)\n\nComponent notes (condensed)\n\n1) Core storage (single-node)\n- Purpose: persist key→value and metadata (versions).\n- Responsibilities: store key/value byte arrays; support Get/Put/Delete and range scans; maintain per-key revision/version (MVCC) for transactions and linearizability.\n- Implementation notes: choose an engine (embedded DB like Level or custom WAL+SST); use lexicographic key layout for range queries.\n- Tests \u0026 edge-cases: crash recovery, large key/value sizes, binary keys, concurrency correctness.\n\n2) WAL + append-only log + compaction\n- Purpose: durability and recovery; source-of-truth for replication.\n- Responsibilities: durable append of commands, sync to disk, efficient compaction.\n- Implementation notes: compact binary header (version/type/len/checksum), batching, truncate-on-recover behavior (already implemented in WAL code).\n\n3) API server (gRPC/HTTP)\n- Purpose: expose client-facing APIs (put/get/delete/txn/watch/lease) and health/metrics.\n- Implementation notes: use gRPC for production clients, HTTP/JSON for tooling.\n\n4) State machine \u0026 client operations\n- Purpose: apply log entries to storage deterministically and idempotently.\n\n5) Watch / subscription manager\n- Purpose: allow clients to subscribe to key events; deliver events in order after entries are applied.\n\n6) Snapshots \u0026 compaction\n- Purpose: truncate WAL/log and reduce storage by creating snapshots of applied state; atomic replace of snapshot files.\n\n7) Consensus (Raft)\n- Purpose: replicated, fault-tolerant state machine; prefer reuse of mature libraries for correctness.\n\n... (other components summarized above; keep details in design docs or separate RFCs)\n\nMinimal MVP roadmap (concrete)\n1. Single-node durable KV: Storage + WAL + simple HTTP/gRPC API for put/get/delete + basic tests.\n2. Watches + simple snapshots.\n3. Add Raft (3-node cluster) using WAL as raft log backend.\n4. 
Transactions, leases, auth, metrics.\n5. Operational tooling: CLI, backup/restore, stress harness, monitoring.\n\nRecommended immediate next steps\n- Choose runtime and raft/storage libraries (Go + etcd/raft recommended for production; Node ok for prototyping).\n- Implement single-node storage + WAL (reuse existing WAL code) and a minimal API server.\n- Add crash-recovery tests and a stress harness.\n\n---\n\n## Detailed technical flows\n\nBelow are explicit step-by-step flows you can use as a reference when reasoning about code or implementing features.\n\n### Write (single-op) flow\n\n1. API entry (client/library) calls `engine.put(key, value)`.\n2. Validation \u0026 encoding: key/value validated and possibly encoded (e.g., MessagePack) and a checksum computed.\n3. WAL append: an operation record (op type, key, metadata, location hint) is appended to the WAL and fsync'ed\n\t according to durability settings.\n4. Datafile append: the value is appended to the active datafile; the manifest is updated atomically (or deferred to\n\t a checkpoint) with the new segment references.\n5. Index update: in-memory index and small inverted index are updated to reflect the new key location.\n6. Notify subscribers: the watch manager emits the event with old/new metadata.\n7. Return: operation returns success along with a revision or version id.\n\nNotes: If `put` is part of a transaction, steps 4-6 are deferred until `tx.commit()`; WAL append still records the\ntransactional intent so recovery can reconstitute pending transactions.\n\n### Transaction commit flow\n\n1. Begin transaction: `tx = engine.beginTransaction()` allocates ephemeral buffers and records start state.\n2. 
Transactional operations (tx.put/del): buffered in-memory; WAL entries with transaction id may be appended to\n\t record intent (depending on implementation choices).\n3. Commit: on `tx.commit()` the engine validates conflicts (if any), appends a transactional commit record to WAL,\n\t writes buffered entries to datafiles, updates indexes, notifies watchers, and releases transaction resources.\n4. Abort: `tx.abort()` discards buffered changes and may append an abort record.\n\n### Read flow (get/scan)\n\n1. `get(key)` consults the in-memory index for a location. If not found, look up a small per-file index or fallback to\n \t a sequential scan of index metadata.\n2. Read blob via IO helper, verify checksum, decode, and return value and revision metadata.\n3. `scan()` returns an AsyncIterable that yields rows as they are resolved; `scanStream()` writes rows directly to a\n\t provided output (stdout, file, or stream) for very large datasets.\n\n### Compaction/GC flow\n\n1. Scheduler picks a compaction target (candidate segments with ratio of dead/live data above threshold).\n2. Compactor reads live entries, writes them to a new compact segment, updates the manifest and frees old segments via\n\t the freelist.\n3. If a crash occurs during compaction, manifest WAL entries and atomic renames ensure partial compactions are rolled\n\t forward or ignored safely during recovery.\n\n### Recovery flow (startup)\n\n1. On startup, the Engine reads the last manifest/superblock to determine active segments.\n2. Replay WAL entries newer than the last checkpoint to apply missed operations or complete partial commits.\n3. 
Rebuild in-memory index and optional indexes from manifests and datafiles (or use persisted index files if present\n\t and validated).\n\n---\n\n## How the database stores and retrieves data (detailed)\n\nThis section summarizes the concrete storage model and the exact step-by-step paths for writes and reads so you\nhave a single reference in the README for how data moves through the system.\n\nStorage model summary\n- Append-only segmented datafiles: values are appended to immutable segments (datafiles). The manifest/superblock\n\trecords which segments are active.\n- WAL-first durability: mutating operations are first appended to the WAL so they can be recovered after a crash.\n- Hot index: an in-memory index maps keys to the latest location (segment id + offset + length).\n- Optional persisted inverted 3-gram index used by the CLI for contains/fuzzy searches.\n\nWrite path (single operation: put / del / cas)\n1. API entry: caller invokes `engine.put(key, value)`.\n2. Encode \u0026 metadata: the engine encodes the value (MessagePack or raw bytes), computes checksum, and assigns a\n\t revision/version id.\n3. WAL append: create a WAL record describing the operation (op type, key, metadata) and append it to WAL; flush\n\t depending on durability configuration. This guarantees recoverability.\n4. Datafile append: append blob (with a small header: length, checksum, rev) to the active datafile segment.\n5. In-memory index update: update the in-memory index to point the key to the new segment + offset + rev.\n6. Persisted index update: if a persisted inverted index is configured, update it incrementally or mark it for\n\t asynchronous rebuild.\n7. Notify watchers: the watch manager emits a durable change event (old/new metadata) after the WAL/datafile writes\n\t as required by the durability model.\n8. 
Return: client receives success and revision metadata.\n\nNotes: if `put` is done inside a transaction, datafile/index updates and watcher notifications are deferred until\ncommit; WAL may still record transactional intent.\n\nTransactional write path (commit)\n1. `beginTransaction()` allocates an in-memory transaction context and records start metadata.\n2. `tx.put()` / `tx.del()` buffer writes in the transaction context (not visible to other readers).\n3. On `tx.commit()` the engine performs conflict checks (optimistic or lock-based depending on implementation), then\n\t appends a transaction-commit record to WAL and writes buffered blobs to datafiles.\n4. Indexes are updated atomically with the commit step; watchers are notified after commit is durable.\n5. On `tx.abort()` the buffers are dropped; an abort WAL record may be written depending on the durability model.\n\nRead path (get / scan / contains)\n- Single-key lookup (`get`): consult in-memory index -\u003e fetch blob from datafile via IO helper -\u003e verify checksum -\u003e\n\tdecode and return value + rev.\n- Range/prefix scan: walk the in-memory index in key order and yield rows lazily (AsyncIterable) or stream them\n\tdirectly with `scanStream()` to avoid buffering large results.\n- Contains/fuzzy: use persisted inverted 3-gram index to generate candidate keys, rank by Levenshtein or heuristics,\n\tthen fetch candidates using the normal lookup path.\n\nCompaction \u0026 garbage collection\n- Scheduler selects candidate segments (high dead/live ratio).\n- Compactor reads live entries and writes them into new segment(s), updating the manifest atomically and freeing\n\told segments via the freelist.\n- Crash during compaction is handled by manifest/WAL coordination: partial results are either completed or ignored\n\tsafely during recovery.\n\nRecovery (startup) summary\n1. Read last manifest/superblock to determine active segments.\n2. 
Replay WAL entries newer than last checkpoint to reapply missed operations or finish partial commits.\n3. Rebuild or validate in-memory indexes from manifest/datafiles (or load persisted index files if validated).\n\nConsistency, durability and trade-offs\n- Durability: an operation is durable once its WAL record is flushed. Configurable batching affects latency vs\n\tthroughput trade-offs.\n- Atomicity: single-op atomic; multi-op transactions are atomic at the commit boundary.\n- Isolation: single-node transactional isolation depends on the engine's concurrency model (optimistic/locks).\n- Performance trade-offs: WAL-first gives a clear recovery story at the cost of extra IO; background compaction trades\n\tCPU/IO for reclaimed disk space.\n\nData integrity and validation\n- Checksums are stored with data blobs and verified on reads.\n- Manifests and WAL are validated at startup; corrupted WAL typically requires truncation or restore from snapshot.\n\nCommon failure scenarios and mitigations\n- Lost power after WAL append: replay WAL on recovery to reapply the op.\n- WAL corruption: truncate/repair or restore from snapshot.\n- Compaction crash: manifest/WAL ensures partial compactions don't corrupt the visible state.\n\nPerformance knobs and recommendations\n- Batch writes when possible to reduce WAL fsync frequency.\n- Use `scanStream()` for large exports to avoid memory pressure.\n- Tune compaction thresholds and segment sizes according to workload (many small writes vs bulk loads).\n\n## On-disk layouts\n\nThis section records the exact, authoritative byte-level formats implemented in the codebase (WAL, SST, Manifest).\nUse these when building tooling, format validators, or diagrams.\n\n### Requirements checklist\n- WAL entry headers/trailers (exact offsets, types, endianness, checksum) — implemented below.\n- SST (datafile) file/header/entry/index/footer layouts (per-entry fields, varint tails, CRCs, compression flag) — implemented below.\n- Manifest on-disk 
representation and atomic persist semantics — implemented below.\n\n### WAL (write-ahead log)\n\n- Entry layout on disk: header (12 bytes) | payload (N bytes) | trailer (12 bytes, identical format as header)\n- Header / trailer (12 bytes) byte offsets and meaning (all numeric fields are big-endian unless noted):\n\t- offset 0 (1 byte): version (uint8)\n\t- offset 1 (1 byte): type (uint8)\n\t- offset 2-3 (2 bytes): reserved (uint16)\n\t- offset 4-7 (4 bytes): payload length (uint32 BE)\n\t- offset 8-11 (4 bytes): checksum (uint32 BE) — Adler-32 of the payload bytes\n\nNotes:\n- The WAL writer writes header + payload + trailer. On recovery the reader validates header/trailer pair and checksum and\n\twill truncate any trailing partial/corrupt bytes. Payloads are encoded with MessagePack (fall back to JSON where required).\n- Metadata files: the WAL subsystem maintains small companion files (.meta.json and .segments.json) that record global\n\tbase/next offsets and rotated segment boundaries.\n\nASCII layout (example):\n\n\t[hdr:12] [payload:N] [trailer:12]\n\t00..00  12..(12+N-1)   (12+N)..(23+N)\n\nConcrete example (JSON payload used for illustration). 
Payload bytes and Adler-32 were computed from the example payload:\n\n\tpayload (UTF-8): {\"op\":\"put\",\"key\":\"foo\"}\n\tpayload length: 24 bytes\n\n\tfull bytes (hex):\n\t01 01 00 00 00 00 00 18 5c 04 07 6e  7b 22 6f 70 22 3a 22 70 75 74 22 2c 22 6b 65 79 22 3a 22 66 6f 6f 22 7d  01 01 00 00 00 00 00 18 5c 04 07 6e\n\n\tInterpretation:\n\t- first 12 bytes = header: version=0x01, type=0x01, reserved=0x0000, length=0x00000018 (24), checksum=0x5c04076e\n\t- next 24 bytes = payload (UTF-8 JSON)\n\t- final 12 bytes = trailer (identical to header)\n\n### SST / datafile (SSTable) — format v3 (current)\n\n- File header: 4-byte magic \"SST1\" followed by 1-byte version (total 5 bytes).\n- Per-entry layout (v3) in sequence:\n\t- key length: 4 bytes (uint32 BE)\n\t- key: keyLen bytes\n\t- value length: 4 bytes (uint32 BE) — 0xFFFFFFFF indicates a tombstone (deleted)\n\t- value: valueLen bytes (omitted for tombstone)\n\t- revision: varint (7-bit groups, LSB-first continuation semantics)\n\t- walOffset: varint\n\t- createdAt: varint\n\n- Index layout (written after blocks):\n\t- 4 bytes: block count (uint32 BE)\n\t- for each block:\n\t\t- fklen (4 bytes uint32 BE)\n\t\t- fk (fklen bytes) — first key in block\n\t\t- offset (8 bytes uint64 BE) — file offset where block begins\n\t\t- storedLen (4 bytes uint32 BE) — length of stored bytes; high bit (0x80000000) indicates compression\n\t\t- crc32c (4 bytes uint32 BE) — CRC32C of the stored bytes\n\n- Footer (SST_FOOTER_LEN = 28 bytes):\n\t- indexOffset: 8 bytes (uint64 BE)\n\t- bloomOffset: 8 bytes (uint64 BE) — zero if no bloom present\n\t- indexCks: 4 bytes (uint32 BE) — CRC32C of the index region\n\t- footerCks: 4 bytes (uint32 BE) — CRC32C of the footer region (excludes magic)\n\t- magic: 4 bytes (uint32 BE) — 0x53535446\n\nChecksums and compression:\n- SST blocks/index/footer use CRC32C (Castagnoli) as implemented in the repository.\n- Blocks may be compressed; storedLen will have BLOCK_COMPRESSED_FLAG (0x80000000) set 
when compressed.\n\nConcrete SST entry example (synthetic):\n\n- Consider putting key=\"abc\" value=\"xyz\" with revision=1, walOffset=123, createdAt=1690000000.\n- Encoding (per-entry sequence):\n\t- keyLen (u32 BE): 00 00 00 03\n\t- key bytes: 61 62 63\n\t- valueLen (u32 BE): 00 00 00 03\n\t- value bytes: 78 79 7a\n\t- revision (varint for 1): 01\n\t- walOffset (varint for 123): 7b\n\t- createdAt (varint for 1690000000): 80 b5 ed a5 06\n\nHex stream (concatenated):\n\t00 00 00 03 61 62 63 00 00 00 03 78 79 7a 01 7b 80 b5 ed a5 06\n\nNotes: the createdAt bytes follow the varint convention stated above (7-bit groups, least-significant group first, continuation bit 0x80 set on every byte except the last).\n\n### Manifest (manifest.json)\n\n- The manifest is persisted as JSON with this top-level shape: { files: PersistedFileMeta[], walOffset: number }\n- PersistedFileMeta contains: { file: string, minKeyHex: string, maxKeyHex: string, size: number, level?: number, walOffset?: number, createdAt?: number }\n- Persist semantics: write to a temporary file (manifest.json.tmp) -\u003e optional fsync on tmp -\u003e atomic rename to manifest.json -\u003e optional fsync on directory when strictAtomicity is enabled. 
This ensures an atomic manifest replace on platforms that support rename semantics.\n\n### Helpers \u0026 encodings\n\n- Integer endianness: all fixed-width integers (u32, u64) are big-endian on disk.\n- Varints: 7-bit groups with continuation bit; LSB-first group order (standard little-endian varint-style encoding used in the helper utilities).\n- Checksums: WAL uses Adler-32 over payload bytes; SST uses CRC32C for blocks, index and footer integrity.\n\n### Quick write/recovery sequence (ASCII)\n\n\tClient -\u003e Engine.put()\n\t\t\t -\u003e WAL.append(header,payload,trailer) + fdatasync\n\t\t\t -\u003e (optionally) Datafile/SST append or batch into SSTWriter\n\t\t\t -\u003e (optionally) manifest persist (tmp -\u003e rename -\u003e dir fsync)\n\t\t\t -\u003e in-memory index update -\u003e notify watchers\n\nRecovery (startup): read manifest -\u003e open active SST files -\u003e scan WAL entries newer than manifest.walOffset -\u003e apply operations -\u003e rebuild in-memory index\n\n## Observability and debugging\n\n- Metrics: small metrics module exports counters/histograms for ops/sec, latency, compaction stats, and WAL throughput.\n- Logging: each major subsystem (WAL, compactor, txn manager, watcher) emits structured logs to help triage issues.\n- Snapshots: use the snapshot/export facility to produce consistent dumps for debugging or offline analysis.\n\n---\n\n## Project milestones \u0026 roadmap\n\nPlanned milestones are grouped into short, medium, and long-term goals.\n\nShort-term (next 1-4 weeks)\n- Finish README and documentation (this change).\n- Add `examples/` with runnable scripts for: put/get, tx commit/abort, streaming scan, snapshot export/import.\n- Add smoke tests and a GitHub Actions workflow to run core smoke tests on push.\n\nMedium-term (1-3 months)\n- Improve the inverted index: make it incremental and persistent with configurable shard sizes.\n- Add a small HTTP admin API for metrics, compaction controls, and snapshot exports.\n- Write 
unit + integration tests for transactional conflict resolution and compaction edge cases.\n- Add benchmark harnesses and CI perf regression checks.\n\n---\n\n## How you can help / contribute\n\n- Open issues for bugs or feature requests and assign a small scope to each change.\n- Send PRs with tests for bug fixes and new features; run `npm test` and `npm run typecheck` locally.\n- Help build the examples and CI workflow.\n\n---\n\n## Benchmark: 1M write / 1M read\n\nThis repository includes a heavy benchmark that writes one million (1,000,000) entries and then performs two read modes over the same dataset: point-reads (one-by-one via `get`) and range scan (prefix scan). The bench is intentionally heavy to exercise WAL, memtable, SST flushing and compaction code paths and to produce realistic IO/CPU/memory characteristics.\n\n### Files and scripts\n\n- `bench/kv_bench.ts` — the heavy benchmark script. It performs:\n\t- write 1,000,000 entries with a 256B payload (keys are `k_0000000`..`k_0999999`), batched in groups of 1,000 writes.\n\t- read 1,000,000 entries by issuing batched `get()` calls.\n\t- read 1,000,000 entries by streaming a prefix scan (`scanStream({ startWith: 'k_' })`).\n\t- prints memory stats between phases.\n- `scripts/run_and_record_bench.cjs` — cross-platform runner that executes the bench and saves the raw mitata/bench output to `tmp/mitata_last.txt` (and appends run-level CPU/duration metadata).\n- `scripts/append_bench_readme.ts` — parses the captured output and appends a Markdown table under the `## Benchmarks` section in `README.md` containing per-bench avg latency, memory and captured CPU percent.\n\n### How to run\n\n- Run the cross-platform recorder (recommended on Windows/macOS/Linux):\n\n\t```powershell\n\tbun run bench:mitata:record\n\t```\n\n\tThis will:\n\t- execute `bench/kv_bench.ts`, writing and reading 1M entries (may take several minutes depending on hardware),\n\t- produce `tmp/mitata_last.txt` containing raw bench output,\n\t- 
append a Markdown table summary to the `## Benchmarks` section in this `README.md`.\n\nNotes and resource expectations\n- Disk usage: the bench writes ~256 bytes per value × 1,000,000 ≈ 256MB for raw values, plus WAL/metadata/overhead and SST files; plan for ~1GB peak depending on compression and compaction behavior.\n- Memory: depends on memtable and SSTable readers; monitor RSS while running. The runner records a simple memory snapshot (RSS / heapUsed) printed between phases and the append script will parse average memory where available.\n- Duration: on modern laptops this may take minutes; on slower machines it may take longer. The runner records run duration and an approximate CPU busy percentage.\n\n### Interpreting README entries\n\n- After a run, the `## Benchmarks` section will contain a Markdown table with columns: `name | avg_ms | avg_mem_mb | cpu_percent | ts`.\n\t- `name`: bench name (e.g. `put-small`, `get-small`, `scan-prefix-stream`).\n\t- `avg_ms`: average milliseconds per iteration reported by mitata.\n\t- `avg_mem_mb`: parsed average memory (where available) reported in mitata output.\n\t- `cpu_percent`: approximate CPU busy percentage measured during the run.\n\t- `ts`: ISO timestamp when the bench was recorded.\n\n### Safety and cleanup\n- The benchmark will create and (by default) overwrite `./data/bench_kv`. If you have important data in that directory, move it before running the benchmark.\n- To clean generated data after the run:\n\n\t```powershell\n\trm -r ./data/bench_kv tmp\n\t```\n\n### Reports\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffajarnugraha37%2Fkvstore-ts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffajarnugraha37%2Fkvstore-ts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffajarnugraha37%2Fkvstore-ts/lists"}