https://github.com/bokuweb/ellisii-toolkit
https://github.com/bokuweb/ellisii-toolkit
Last synced: 6 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/bokuweb/ellisii-toolkit
- Owner: bokuweb
- License: other
- Created: 2026-05-15T06:34:40.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-30T07:59:04.000Z (about 1 month ago)
- Last Synced: 2026-05-30T08:15:38.155Z (about 1 month ago)
- Language: Rust
- Size: 1.23 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# ellisii-toolkit
[](https://github.com/bokuweb/ellisii-toolkit/actions/workflows/ci.yml)
[](./LICENSE)
Reusable Rust crates for building local-first RAG / NotebookLM-style
applications. Extracted from [ellisii](https://github.com/bokuweb/ellisii)
so the lower-level building blocks can be consumed from other projects
without bringing in the Tauri app or the notebook domain layer.
## Status
Pre-1.0. APIs may change. Not published to crates.io — consume via
`git` dependency:
```toml
[dependencies]
ellisii-rag = { git = "https://github.com/bokuweb/ellisii-toolkit", rev = "" }
```
## What's inside
- **Parsers** — PDF / DOCX / XLSX / PPTX / Markdown / text / audio
(`parsers`, `parser-*`, `parsers-core`)
- **OCR** — wrapper around `ndlocr-lite-rs`
- **Chunking** (`chunker`)
- **Embedders** — trait + Japanese static embedding implementation
(`embed-core`, `embed-static-jp`, `embed-dummy`)
- **Vector stores** — trait + in-memory and SQLite (`sqlite-vec` + FTS5)
backends (`store-core`, `store-memory`, `store-sqlite`)
- **LLM backends** — trait + stub and `llama.cpp` implementations
(`llm-core`, `llm-stub`, `llm-llamacpp`, `llm-prompt`)
- **RAG pipeline** — retrieval, reranking, prompting, streaming, and
recall / answer evaluation harnesses (`rag`, `rag-answer-eval`,
`rag-eval-cli`)
- **Japanese tokenizers** (`jp-tokenizer-*`)
- **Provence reranker** (`provence-*`)
- **Query rewriter** (`query-rewriter-*`)
- **Ingest pipeline** — parse → chunk → embed → store orchestration
(`ingest`)
- **SDK** — facade crate that re-exports the common surface (`sdk`)
## License
[PolyForm Noncommercial 1.0.0](./LICENSE). Free for personal, research,
educational, and noncommercial use. Commercial use requires a separate
license — contact the author.
Third-party dependency licenses: see
[`THIRD_PARTY_LICENSES.html`](./THIRD_PARTY_LICENSES.html) (regenerate
with `cargo about generate about.hbs --all-features`).