https://github.com/orangecoding/nanosearch

A web-based full-text search tool for local directories. Easy indexing of your files, then search them instantly via a browser. Without bullshit, extremely lightweight.

Host: GitHub
URL: https://github.com/orangecoding/nanosearch
Owner: orangecoding
License: apache-2.0
Created: 2026-04-15T10:06:51.000Z (4 months ago)
Default Branch: master
Last Pushed: 2026-04-17T10:52:34.000Z (3 months ago)
Last Synced: 2026-05-18T11:08:44.652Z (2 months ago)
Language: JavaScript
Homepage: https://orange-coding.net
Size: 1.11 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# nanosearch

A web-based full-text search tool for local directories. Easy indexing of your files, then search them instantly via a browser. Without bullshit, extremely lightweight.

---

## Features

- Full-text search with SQLite FTS5 (BM25 ranking, highlighted snippets)
- Incremental re-indexing: only new or changed files are processed
- OCR for scanned PDFs and images (Tesseract.js locally, Tesseract CLI in Docker)
- Supports many file types, e.g. PDF, DOCX, TXT, Markdown, JPG, PNG, TIFF
- Real-time indexing progress
- Renders files directly in the browser where possible
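
To give a feel for what FTS5 with BM25 ranking and highlighted snippets can look like, here is a minimal sketch. The table name `docs`, its columns, and the helper `toMatchQuery` are assumptions made for illustration and are not taken from the nanosearch source; only `bm25()`, `snippet()` and `MATCH` are standard FTS5 features.

```javascript
// Hypothetical schema: nanosearch's real table layout may differ.
const CREATE_INDEX_SQL = `
  CREATE VIRTUAL TABLE IF NOT EXISTS docs
  USING fts5(path UNINDEXED, content)`;

// snippet(): column 1 is `content`; matches are wrapped in <mark> tags and the
// excerpt is limited to 12 tokens. bm25() returns lower values for better
// matches, so ascending order puts the best hits first.
const SEARCH_SQL = `
  SELECT path, snippet(docs, 1, '<mark>', '</mark>', '...', 12) AS excerpt
  FROM docs
  WHERE docs MATCH ?
  ORDER BY bm25(docs)
  LIMIT ?`;

// Quote every whitespace-separated term so user input is treated as plain
// words and not as FTS5 query syntax (AND, OR, NEAR, column filters).
function toMatchQuery(input) {
  return input
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(' ');
}
```

The two SQL strings would be passed to whichever SQLite driver is in use, with `toMatchQuery(userInput)` bound to the first placeholder. An empty result from `toMatchQuery` should be handled by the caller before running the query.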

---

## Screenshots

| Search                       | Rendering View                       |
| ---------------------------- | ------------------------------------ |
| ![Search](media/screen1.png) | ![Rendering View](media/screen2.png) |

---

# Quick Start

## "Bare Metal"

**Prerequisites:** Node 22

```bash
# 1. Install dependencies + download Tesseract language data (~25 MB, once)
yarn inst

# 2. Configure (set at least SEARCH_DIRS in .env)
cp .env.example .env

# 3. Start the app
yarn start
```

Open `http://localhost:3000`, click **Create Index**, then search.

## Development

**Prerequisites:** Node 22

```bash
# 1. Install dependencies + download Tesseract language data (~25 MB, once)
yarn inst

# 2. Configure
cp .env.example .env

# 3. Start the backend (Terminal 1)
cd lib/backend
node --no-deprecation --watch src/server.js

# 4. Start the frontend (Terminal 2)
yarn dev
```

Open `http://localhost:5173`, click **Create Index**, then search.

---

## Docker

```bash
# 1. Configure
cp .env.example .env

# 2. Add volume mounts to docker-compose.yml for each directory in SEARCH_DIRS
# See the comment block in docker-compose.yml

# 3. Start
docker compose up -d
```

Open `http://localhost:3000`, click **Create Index**, then search.
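
If `SEARCH_DIRS` contains several directories, writing the volume entries by hand gets tedious. The following sketch prints one entry per directory in Compose short syntax. It assumes that each directory is mounted read-only at the same path inside the container; the comment block in `docker-compose.yml` is authoritative and may ask for a different layout. The helper `volumeLines` and the example paths are made up for illustration.

```javascript
// Hypothetical helper: build one read-only bind mount per SEARCH_DIRS entry.
// Assumption: each host directory is mounted at the same path in the container.
function volumeLines(searchDirs) {
  return searchDirs
    .split(',')
    .map((dir) => dir.trim())
    .filter(Boolean)
    .map((dir) => `      - ${dir}:${dir}:ro`);
}

// Example paths only; replace them with the value of SEARCH_DIRS from .env.
console.log(volumeLines('/home/me/docs, /mnt/scans').join('\n'));
```

The printed lines can be pasted under the `volumes:` key of the service, adjusting the indentation to match the file.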

---

## Configuration

All options are set via environment variables (or `.env` at the project root):

| Variable                | Default              | Description                                                                          |
| ----------------------- | -------------------- | ------------------------------------------------------------------------------------ |
| `SEARCH_DIRS`           | _(required)_         | Comma-separated list of directories to index                                         |
| `DB_PATH`               | `./db/nanosearch.db` | Path to the SQLite database file                                                     |
| `OCR_BACKEND`           | `js`                 | `js` (Tesseract.js) or `cli` (Tesseract CLI)                                         |
| `PORT`                  | `3000`               | Backend port                                                                         |
| `LOG_LEVEL`             | `info`               | Pino log level                                                                       |
| `RESCAN_INTERVAL_HOURS` | `0`                  | Automatically re-scan and re-index new/changed files every X hours. `0` disables it. |
| `EXTENSIONS`            | _(see .env.example)_ | Complete list of extensions to index. Nothing is indexed unless listed here.         |
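
As a rough illustration of how these variables fit together, the sketch below turns an environment object into a config object using the defaults from the table. The function `loadConfig`, the property names, and the assumption that `EXTENSIONS` is comma-separated (like `SEARCH_DIRS`) are not taken from the nanosearch source; check `.env.example` for the real format.

```javascript
// Hypothetical helper: turn environment variables into a config object.
// Defaults mirror the table above; the parsing rules are assumptions.
function loadConfig(env) {
  // Split a comma-separated value into trimmed, non-empty entries.
  const list = (value) =>
    (value ?? '')
      .split(',')
      .map((s) => s.trim())
      .filter(Boolean);

  const searchDirs = list(env.SEARCH_DIRS);
  if (searchDirs.length === 0) {
    throw new Error('SEARCH_DIRS is required');
  }

  const rescanHours = Number(env.RESCAN_INTERVAL_HOURS ?? 0);

  return {
    searchDirs,
    dbPath: env.DB_PATH ?? './db/nanosearch.db',
    ocrBackend: env.OCR_BACKEND ?? 'js',
    port: Number(env.PORT ?? 3000),
    logLevel: env.LOG_LEVEL ?? 'info',
    // 0 disables periodic re-scans, so expose null instead of an interval.
    rescanIntervalMs: rescanHours > 0 ? rescanHours * 60 * 60 * 1000 : null,
    // Assumed format: comma-separated, with or without a leading dot.
    extensions: list(env.EXTENSIONS).map((e) => e.toLowerCase().replace(/^\./, '')),
  };
}
```

Calling `loadConfig(process.env)` at startup would then fail fast when `SEARCH_DIRS` is missing, which matches its _(required)_ status in the table.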

---

## OCR Backends

| Backend | How                                      | When to use              |
| ------- | ---------------------------------------- | ------------------------ |
| `js`    | Tesseract.js (pure Node, no system deps) | Local development        |
| `cli`   | System `tesseract` binary                | Docker (default), faster |

Both backends recognise German and English simultaneously (`deu+eng`).
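
The sketch below shows one way the `OCR_BACKEND` value could be mapped to an invocation. The helper `ocrInvocation` and the shape of its return value are hypothetical. The `cli` branch uses the standard `tesseract <image> stdout -l <langs>` call, which prints the recognised text to stdout; the `js` branch only records which languages a Tesseract.js worker would need and does not call the library.

```javascript
// Both backends use the same language string, per the README.
const OCR_LANGS = 'deu+eng';

// Hypothetical helper: describe how to invoke the chosen backend.
function ocrInvocation(backend, imagePath) {
  if (backend === 'cli') {
    // `stdout` as the output base makes tesseract print the text to stdout.
    return { command: 'tesseract', args: [imagePath, 'stdout', '-l', OCR_LANGS] };
  }
  if (backend === 'js') {
    // Languages a Tesseract.js worker would have to load.
    return { library: 'tesseract.js', langs: OCR_LANGS.split('+') };
  }
  throw new Error(`Unknown OCR_BACKEND: ${backend}`);
}
```

For the `cli` backend, `command` and `args` could be handed to `execFile` from `node:child_process`, which avoids shell quoting problems with unusual file names.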
