https://github.com/docmd-io/docmd-search
Offline semantic search engine that runs anywhere.
https://github.com/docmd-io/docmd-search
api docmd multi-language-support offline-first rust search-engine semantic-search typescript
Last synced: 8 days ago
JSON representation
Offline semantic search engine that runs anywhere.
- Host: GitHub
- URL: https://github.com/docmd-io/docmd-search
- Owner: docmd-io
- License: mit
- Created: 2026-04-24T07:25:09.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-31T10:26:42.000Z (about 1 month ago)
- Last Synced: 2026-05-31T12:13:32.589Z (about 1 month ago)
- Topics: api, docmd, multi-language-support, offline-first, rust, search-engine, semantic-search, typescript
- Language: TypeScript
- Homepage: https://docmd.io/search
- Size: 175 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Security: .github/SECURITY.md
Awesome Lists containing this project
README
docmd-search
Offline semantic search engine for documentation.
Local embeddings, browser-ready indexes.
Website •
Documentation •
Report Bug
## Quick Start
**Run docmd-search instantly on any folder:**
```bash
npx docmd-search ./docs
```
**That's it.**
- Files are discovered and chunked automatically
- Embeddings are generated locally (no cloud API)
- Search is available in the terminal immediately
### Install for regular usage
```bash
npm install -g docmd-search
```
```bash
# Install ML dependencies (one-time)
npm install -g @huggingface/transformers onnxruntime-node
```
```bash
docmd-search ./docs # index + interactive search
docmd-search ./docs --ui # index + web UI
docmd-search --settings # configure model
```
## Features
Designed to work offline, ship nothing to the browser, and stay out of your way.
### Offline by default
* All embeddings generated locally with ONNX Runtime
* No data leaves your machine
* No cloud API keys needed
### Instant search
* Progressive indexing: search available from the first batch
* Incremental: only re-indexes changed files
* Resumable: interrupted indexing resumes from last checkpoint
### Tiny client
* Browser runtime is **<3KB gzipped**
* No model weights in the browser
* Hybrid scoring: keyword matching + vector similarity
### Built-in capabilities
* Multi-batch index format with automatic compression
* Navigation tree generation for web UIs
* First-run setup wizard with model selection
* Interactive terminal search with live results
## How It Works
```
Build time (Node.js) Search time (Browser, <3KB)
─────────────────── ──────────────────────────
Crawl files Load manifest.json
→ Chunk by heading → Load batch 000 (instant)
→ Embed via ONNX → Background-load rest
→ Quantize Float32 → Int8 → Keyword + cosine
→ Compress (ternary/PQ) → Ranked results
→ Save multi-batch index
```
## Models
First run prompts you to select an embedding model:
| Model | Dimensions | Size | Best for |
| :---- | :--------- | :--- | :------- |
| **MiniLM L6 v2** ★ | 384 | ~30 MB | Fast, general purpose |
| BGE Small (English) | 384 | ~45 MB | English-optimised |
| BGE Base (English) | 768 | ~110 MB | Higher quality |
| MPNet Base v2 | 768 | ~110 MB | Multilingual |
Change model later: `docmd-search --settings`
## Configuration (optional)
No configuration is required to get started.
**Global** (`~/.docmd-search/config.json`):
```json
{
"model": "Xenova/all-MiniLM-L6-v2",
"wizardCompleted": true
}
```
**Per-project** (`.docmd-search/config.json`):
```json
{
"model": "Xenova/bge-small-en-v1.5",
"chunkSize": 512,
"include": ["**/*.md"],
"exclude": ["**/drafts/**"]
}
```
Config resolution: defaults → global → project → CLI flags.
## Programmatic Usage
Use in scripts or CI pipelines:
```js
import { indexDirectory, loadAllBatches } from 'docmd-search';
const index = await indexDirectory({
rootDir: './docs',
outDir: '.docmd-search',
});
```
Browser client:
```js
import { load, search } from 'docmd-search/client';
await load('/path/to/.docmd-search');
const results = search('deploy kubernetes', 10);
```
## Project Structure
Keeps the codebase flat and modular.
```
src/
├── bin/docmd-search.ts # CLI entry point
├── client/index.ts # Browser runtime (<3KB)
├── config.ts # Config + model profiles
├── index-io.ts # Multi-batch format + compression
├── index.ts # Barrel exports
├── indexer/
│ ├── chunk.ts # Heading-aware chunking
│ ├── crawl.ts # File discovery
│ └── index.ts # Progressive pipeline
├── model.ts # ONNX embedding manager
├── tui.ts # Terminal UI
├── types.ts # Core types
└── ui/
└── launcher.ts # Web UI via docmd
```
## Part of the docmd ecosystem
docmd-search works standalone with any documentation project. It also integrates with [docmd](https://docmd.io) as a semantic search plugin.
| Tool | What it does |
| :--- | :----------- |
| [docmd](https://github.com/docmd-io/docmd) | Zero-config documentation generator |
| **docmd-search** | Offline semantic search engine |
## Community & Support
* Contributions are welcome
* If you find it useful, consider [sponsoring](https://github.com/sponsors/mgks) or starring the repo ⭐
## License
MIT License. See `LICENSE` for details.