https://github.com/kiran94/bookworm

LLM-powered bookmark search engine
bookmark-manager bookmarks chatbot chatgpt genai rag

Last synced: about 2 months ago

Host: GitHub
URL: https://github.com/kiran94/bookworm
Owner: kiran94
License: mit
Created: 2024-08-03T18:40:52.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-04T09:26:11.000Z (11 months ago)
Last Synced: 2025-09-18T17:21:06.627Z (about 2 months ago)
Topics: bookmark-manager, bookmarks, chatbot, chatgpt, genai, rag
Language: Python
Homepage: https://pypi.org/project/bookworm_genai/
Size: 153 KB
Stars: 28
Watchers: 1
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-cli-apps-in-a-csv - bookworm - LLM-powered bookmark search engine. (Text search (alternatives to grep))

README

# bookworm 📖

[![main](https://github.com/kiran94/bookworm/actions/workflows/main.yml/badge.svg)](https://github.com/kiran94/bookworm/actions/workflows/main.yml) [![PyPI version](https://badge.fury.io/py/bookworm_genai.svg)](https://badge.fury.io/py/bookworm_genai)

> LLM-powered bookmark search engine

`bookworm` lets you search your local browser bookmarks using natural language. It is meant for times when you have a large collection of bookmarks and can't quite remember where you put the one website you need.

[![asciicast](https://asciinema.org/a/696722.svg)](https://asciinema.org/a/696722)

*In the example above, we search for the term “Japan.” While some results don’t explicitly mention the word, terms like “Osaka” appear because they are closely related to the search term based on OpenAI embeddings.*

## Install

```bash
python -m pip install bookworm_genai
```

> [!TIP]
> If you are using [`uvx`](https://docs.astral.sh/uv/guides/tools/) then you can also just run this:
> ```bash
> uvx --from bookworm_genai bookworm --help
> ```

## Usage

```bash
export OPENAI_API_KEY=

# Run once, and again whenever bookmarks in a supported browser change
bookworm sync

# Sync bookmarks only from a specific browser
bookworm sync --browser-filter chrome

# Ask questions against the bookmark database
bookworm ask

# Ask questions against the bookmark database, passing the query up front.
# If you omit -q, you will be prompted for a query while the tool is running.
bookworm ask -q pandas

# Ask questions against the bookmark database and limit the number of results returned
bookworm ask -n 1
```

The `sync` process currently supports the following configurations:

| Operating System   | Google Chrome   | Mozilla Firefox   | Brave   | Microsoft Edge   |
| ------------------ | --------------- | ----------------- | ------- | ---------------- |
| **Linux**          | ✅              | ✅                | ✅      | ❌               |
| **macOS**          | ✅              | ✅                | ✅      | ❌               |
| **Windows**        | ❌              | ❌                | ❌      | ❌               |

> [!TIP]
> ✨ Want to contribute? See the [adding an integration](#adding-an-integration) section.

## Processes

*`bookworm sync`*

Vectorize your bookmarks across all supported browsers.

```mermaid
graph LR

subgraph Bookmarks
    Chrome(Chrome Bookmarks)
    Brave(Brave Bookmarks)
    Firefox(Firefox Bookmarks)
end

Bookworm(bookworm sync)
EmbeddingsService(Embeddings Service e.g OpenAIEmbeddings)
VectorStore(Vector Store e.g DuckDB)

Chrome -->|load bookmarks|Bookworm
Brave -->|load bookmarks|Bookworm
Firefox -->|load bookmarks|Bookworm

Bookworm -->|vectorize bookmarks|EmbeddingsService-->|store embeddings|VectorStore
```

The vector database depicted above is stored locally on your machine. You can check its location by running the following after installing this project:

```python
from platformdirs import PlatformDirs

print(PlatformDirs('bookworm').user_data_dir)
```
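The "load bookmarks" step in the diagram above can be pictured with a small standard-library sketch. It assumes the Chrome-style bookmarks layout (a JSON file with a `roots` object whose folders hold `children`), and the sample data, `flatten`, and `load_bookmarks` are hypothetical names used only for illustration. bookworm itself relies on document loaders, so this is not the project's code.

```python
import json

# Hypothetical sample that mimics the layout of a Chrome "Bookmarks" JSON file.
SAMPLE = """
{
  "roots": {
    "bookmark_bar": {
      "type": "folder",
      "name": "Bookmarks bar",
      "children": [
        {"type": "url", "name": "pandas docs", "url": "https://pandas.pydata.org/docs/"},
        {
          "type": "folder",
          "name": "Travel",
          "children": [
            {"type": "url", "name": "Osaka guide", "url": "https://example.com/osaka"}
          ]
        }
      ]
    }
  }
}
"""


def flatten(node, folder=""):
    """Yield one record per bookmark, walking nested folders depth-first."""
    if node.get("type") == "url":
        yield {"name": node["name"], "url": node["url"], "folder": folder}
    for child in node.get("children", []):
        # Only folder nodes have children; extend the folder path for them.
        yield from flatten(child, f"{folder}/{node['name']}".strip("/"))


def load_bookmarks(raw):
    """Parse the JSON text and collect bookmarks from every root folder."""
    records = []
    for root in json.loads(raw)["roots"].values():
        if isinstance(root, dict):  # skip any non-folder metadata entries
            records.extend(flatten(root))
    return records


bookmarks = load_bookmarks(SAMPLE)
for bookmark in bookmarks:
    print(bookmark)
```

Each record would then be handed to the embeddings service and stored in the vector store, which this sketch leaves out.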

---

*`bookworm ask`*

Search your bookmarks with a natural-language query.

```mermaid
graph LR

query
Bookworm(bookworm ask)

subgraph _
    LLM(LLM e.g OpenAI)
    VectorStore(Vector Store e.g DuckDB)
end

query -->|user queries for information|Bookworm
Bookworm -->|similarity search|VectorStore -->|send similar docs + user query|LLM
LLM -->|send back response|Bookworm
```
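To make the "similarity search" step above more concrete, here is a minimal sketch that ranks documents by cosine similarity. The `embed` function is a toy bag-of-words stand-in and an assumption of this example: real embedding models (such as the OpenAI embeddings mentioned earlier) capture semantic relatedness that word counts cannot. The names `embed`, `cosine`, and `search` are hypothetical, and the `n` parameter only loosely mirrors the `-n` option.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, n=3):
    """Return the n documents most similar to the query, best first."""
    query_vector = embed(query)
    scored = [(cosine(query_vector, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


docs = [
    "pandas user guide",
    "osaka travel tips",
    "pandas merge cheat sheet",
]
results = search("pandas merge", docs, n=2)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

In the real flow, the top matches and the original query are then passed to the LLM, which composes the response.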

---

*`bookworm export`*

Export your bookmarks from all supported browsers into an output file (e.g. CSV).

```mermaid
graph LR

VectorStore
Bookworm(bookworm export)
CSV(bookmarks.csv)

VectorStore -->|extract all bookmarks|Bookworm
Bookworm -->|export into file|CSV
```
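The "export into file" step above amounts to writing one row per bookmark. The sketch below uses only the standard library; the column names `name` and `url`, the sample rows, and the `export_csv` helper are assumptions for illustration, not the columns or code bookworm actually uses.

```python
import csv
import io


def export_csv(bookmarks, handle):
    """Write bookmark records to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=["name", "url"])
    writer.writeheader()
    writer.writerows(bookmarks)


# Hypothetical records standing in for what would be read from the vector store.
rows = [
    {"name": "pandas docs", "url": "https://pandas.pydata.org/docs/"},
    {"name": "Osaka guide", "url": "https://example.com/osaka"},
]

buffer = io.StringIO()  # in-memory file, so the example leaves nothing on disk
export_csv(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a handle opened with `open("bookmarks.csv", "w", newline="")`; the `newline=""` argument is what the `csv` module documentation recommends for file handles.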

## Developer Setup

```bash
# LLMs
export OPENAI_API_KEY=

# Langchain (optional, but useful for debugging)
export LANGCHAIN_API_KEY=
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_PROJECT=bookworm

# Misc (optional)
export LOGGING_LEVEL=INFO
```

Recommendations:

- Install [`pyenv`](https://github.com/pyenv/pyenv?tab=readme-ov-file#installation) and ensure [build dependencies are installed](https://github.com/pyenv/pyenv?tab=readme-ov-file#install-python-build-dependencies) for your OS.

- Install [Poetry](https://python-poetry.org/docs/); we will be using its [environment management](https://python-poetry.org/docs/managing-environments/) below.

- VS Code Extensions recommendations can be found [here](./.vscode/extensions.json) and will be suggested upon first opening the project.

```bash
poetry env use 3.9 # or path to your 3.9 installation
poetry shell
poetry install

bookworm --help
```

Running Linux tests on macOS/Windows

If you are on a non-Linux machine, it may be helpful to use the provided [Dockerfile](./Dockerfile.linux) to verify that the project works in a Linux environment.

You can build this via:

```bash
make docker_linux
```

You will need to have Docker installed to run this.

## Adding an Integration

As you can see from [usage](#usage), bookworm supports several browser and platform combinations, but not all. If you find one that you want to support, a change is needed inside [integrations.py](./bookworm_genai/integrations.py).

You can see in that file there is a variable called `browsers` that follows this structure:

```python
browsers = {
    "BROWSER": {
        "PLATFORM": {
            ...
        }
    }
}
```

Say you wanted to add Chrome support on Windows: you would go under the Chrome key and add a `win32` key that holds all the details. You can refer to existing examples, but generally those details describe *where* to find the bookmarks on the user's system and how to *interpret* them.

You can also find a full list of the document loaders supported [here](https://python.langchain.com/docs/integrations/document_loaders/).
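As a rough illustration of how a browser-then-platform mapping like the one above can be queried, the sketch below looks up an entry by platform key. It assumes the platform keys match Python's `sys.platform` values (`linux`, `darwin`, `win32`), which the `win32` key suggests but this README does not state. The registry contents, the lowercase browser keys, and the helpers `supported_browsers` and `get_config` are all hypothetical placeholders, not the real contents of `integrations.py`.

```python
import sys

# Hypothetical registry; the inner values are placeholders, not real bookworm config.
browsers = {
    "chrome": {
        "linux": {"note": "details for Chrome on Linux"},
        "darwin": {"note": "details for Chrome on macOS"},
    },
    "firefox": {
        "linux": {"note": "details for Firefox on Linux"},
    },
}


def supported_browsers(platform=sys.platform):
    """List the browsers that have an entry for the given platform key."""
    return sorted(name for name, platforms in browsers.items() if platform in platforms)


def get_config(browser, platform=sys.platform):
    """Return the details for one browser/platform pair, or raise LookupError."""
    try:
        return browsers[browser][platform]
    except KeyError:
        raise LookupError(f"{browser!r} is not configured for platform {platform!r}") from None


print(supported_browsers("linux"))
print(supported_browsers("win32"))
```

Under these assumptions, adding Windows support for a browser would mean adding a `win32` entry next to the existing platform keys, after which a lookup like `get_config("chrome", "win32")` would stop raising.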
