{"id":44350981,"url":"https://github.com/radlab-dev-group/llm-router-plugins","last_synced_at":"2026-02-11T15:05:57.774Z","repository":{"id":326571904,"uuid":"1104166078","full_name":"radlab-dev-group/llm-router-plugins","owner":"radlab-dev-group","description":"A companion repository for llm-router containing a collection of pipeline-ready plugins. Features a masking interface for anonymizing sensitive data and a guardrail system for validating input/output safety against defined policy rules.","archived":false,"fork":false,"pushed_at":"2026-01-22T21:16:58.000Z","size":135,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-23T14:09:33.204Z","etag":null,"topics":["anonymization","genai","guardrail","llm","llm-gateway","llm-router","llm-router-plugins","masker","pii","plugins"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/radlab-dev-group.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-25T21:06:35.000Z","updated_at":"2026-01-22T21:16:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/radlab-dev-group/llm-router-plugins","commit_stats":null,"previous_names":["radlab-dev-group/llm-router-plugins"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/radlab-dev-group/llm-router-plugins","repository_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radlab-dev-group%2Fllm-router-plugins","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radlab-dev-group%2Fllm-router-plugins/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radlab-dev-group%2Fllm-router-plugins/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radlab-dev-group%2Fllm-router-plugins/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/radlab-dev-group","download_url":"https://codeload.github.com/radlab-dev-group/llm-router-plugins/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/radlab-dev-group%2Fllm-router-plugins/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29336093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-11T14:34:07.188Z","status":"ssl_error","status_checked_at":"2026-02-11T14:34:06.809Z","response_time":97,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anonymization","genai","guardrail","llm","llm-gateway","llm-router","llm-router-plugins","masker","pii","plugins"],"created_at":"2026-02-11T15:05:57.465Z","updated_at":"2026-02-11T15:05:57.765Z","avatar_url":"https://github.com/radlab-dev-group.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Overview\n\nThe **LLM‑Router** project ships with a modular plugin system that lets you plug **anonymizers** (also called\n*maskers*) and **guardrails** into request‑processing pipelines.  \nEach plugin implements a tiny, well‑defined interface (`apply`) and can be composed in an ordered list to form a\n**pipeline**. Pipelines are instantiated by the `MaskerPipeline` and `GuardrailPipeline` classes and are driven\nautomatically by the endpoint logic in `endpoint_i.py`.\n\n---  \n\n## 1. 
Anonymizers (Maskers)\n\n### 1.1 What they do\n\n* **Goal** – Remove or replace personally‑identifiable information (PII) from a payload before it reaches the LLM or an\n  external service.\n* **Typical strategy** – Run a pipeline of maskers that locate spans corresponding to IDs, emails, IPs, etc., and\n  replace each span with a placeholder such as `{{MASKED_ITEM}}`.\n\n### 1.2 Built‑in anonymizer plugins\n\n| Plugin                                         | Description                                                                                                                                       | Technical notes                                                                                                                                                            |\n|------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **FastMaskerPlugin** (`fast_masker_plugin.py`) | Thin wrapper around the `FastMasker` utility class. Receives a JSON‑compatible payload and returns the same payload with all detected PII masked. | Implements `PluginInterface`. The heavy lifting is delegated to `FastMasker.mask_payload(payload)`. No extra I/O; the `FastMasker` instance is created once in `__init__`. |\n\n### 1.3 How a masker is used\n\n1. The endpoint (e.g. `EndpointI._do_masking_if_needed`) checks the global flag `FORCE_MASKING`.\n2. If enabled, it creates a `MaskerPipeline` with the list of masker plugin identifiers (e.g. `[\"fast_masker\"]`).\n3. The pipeline calls each plugin’s `apply` method sequentially, feeding the output of one as the input of the next.\n4. 
The final payload – now stripped of PII – proceeds to the rest of the request flow (guardrails, model dispatch,\n   etc.).\n\n---  \n\n## 2. Guardrails\n\n### 2.1 What they do\n\n* **Goal** – Verify that a request (or its response) complies with policy rules (e.g. no hateful, illegal, or unsafe\n  content).\n* **Typical strategy** – Split the payload into manageable text chunks, run a pipeline of guardrails, aggregate\n  per‑chunk scores, and decide whether the overall request is safe.\n\n### 2.2 Built‑in guardrail plugins\n\n| Plugin                                             | Description                                                                                                                                                                                                                  | Technical notes                                                                                                                                                                                                                       |\n|----------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **NASKGuardPlugin** (`nask_guard_plugin.py`)       | HTTP‑based guardrail that forwards the payload to the external NASK guardrail service (`/nask_guard` endpoint) and returns a boolean *safe* flag together with the raw response.                                             | Inherits from `HttpPluginInterface`. The `apply` method calls `_request(payload)` (provided by the base class) and extracts `results[\"safe\"]`. 
Errors are caught and logged; on failure the plugin returns `(False, {})`.             |\n| **SojkaGuardPlugin** (`sojka_guard_plugin.py`)     | HTTP‑based guardrail that forwards the payload to the **Sójka** guardrail service (`/sojka_guard` endpoint) and returns a safety flag.                                                                                       | Mirrors the design of `NASKGuardPlugin`. The `endpoint_url` is built from the `LLM_ROUTER_GUARDRAIL_SOJKA_GUARD_HOST` environment variable. On success it returns `(True, response)`, otherwise `(False, {})`.                        |\n| **(Implicit) GuardrailProcessor** (`processor.py`) | Core logic used by the internal NASK guardrail Flask route (`nask_guardrail`). Tokenises the payload, creates overlapping chunks, runs a Hugging‑Face `text‑classification` pipeline, and produces a detailed safety report. | Handles model loading (`AutoTokenizer`, `pipeline(\"text‑classification\")`), chunking (`_chunk_text`), and scoring thresholds (`MIN_SCORE_FOR_SAFE`, `MIN_SCORE_FOR_NOT_SAFE`). Returns a dict: `{\"safe\": \u003cbool\u003e, \"detailed\": [...]}`. |\n\n### 2.3 How a guardrail is used\n\n1. The endpoint calls `_is_request_guardrail_safe(payload)` (or the analogous response guardrail).\n2. If `FORCE_GUARDRAIL_REQUEST` is true, a `GuardrailPipeline` is built from the configured plugin IDs (e.g.\n   `[\"nask_guard\", \"sojka_guard\"]`).\n3. The pipeline iterates over each guardrail plugin; each `apply` returns `(is_safe, message)`.\n4. The first plugin that reports `is_safe=False` short‑circuits the pipeline and the request is rejected with a 400/500\n   error payload.\n\n---  \n\n## 3. 
Pipelines\n\nBoth masker and guardrail pipelines share the same design pattern:\n\n| Class                                                     | Purpose                                                                            |\n|-----------------------------------------------------------|------------------------------------------------------------------------------------|\n| **MaskerPipeline** (`pipeline.py` – masker version)       | Executes a list of masker plugins in order, transforming the payload step‑by‑step. |\n| **GuardrailPipeline** (`pipeline.py` – guardrail version) | Executes guardrail plugins sequentially, stopping on the first failure.            |\n\n### 3.1 Registration\n\n* Plugins are registered lazily via `MaskerRegistry.register(name, logger)` or\n  `GuardrailRegistry.register(name, logger)`.\n* The registry maps a string identifier (e.g. `\"fast_masker\"`) to a concrete plugin class, allowing pipelines to resolve\n  the classes at runtime.\n\n### 3.2 Configuration\n\nAll plugin identifiers are stored in environment variables or constants such as:\n\n```python\nMASKING_STRATEGY_PIPELINE = [\"fast_masker\"]\nGUARDRAIL_STRATEGY_PIPELINE_REQUEST = [\"nask_guard\", \"sojka_guard\"]\n```\n\nThese lists are consumed by the endpoint initialization (`EndpointI._prepare_masker_pipeline`,\n`EndpointI._prepare_guardrails_pipeline`).\n\n---  \n\n## 4. Adding a New Plugin\n\n1. **Create a subclass** of either `PluginInterface` (for maskers) or `HttpPluginInterface` / a custom guardrail base.\n2. **Define a `name` class attribute** – this is the identifier used in pipeline configuration.\n3. **Implement `apply(self, payload: Dict) -\u003e Dict`** (masker) **or `apply(self, payload: Dict) -\u003e Tuple[bool, Dict]`**\n   (guardrail).\n4. 
**Register the plugin** – either automatically via the registry’s `register` call in the pipeline constructor, or\n   manually by calling `MaskerRegistry.register(name=MyPlugin.name, logger=logger)`.\n\n*Example stub for a new masker:*\n\n```python\n# my_custom_masker.py\nfrom llm_router_plugins.maskers.plugin_interface import PluginInterface\nimport logging\nfrom typing import Dict, Optional\n\n\nclass MyCustomMasker(PluginInterface):\n    name = \"my_custom_masker\"\n\n    def __init__(self, logger: Optional[logging.Logger] = None):\n        super().__init__(logger=logger)\n        # Load any heavy resources here (e.g., a spaCy model)\n\n    def apply(self, payload: Dict) -\u003e Dict:\n        # Perform your masking logic and return the modified payload\n        return payload\n```\n\nAfter placing the file in `llm_router_plugins/maskers/plugins/`, enable it by adding `\"my_custom_masker\"` to\n`MASKING_STRATEGY_PIPELINE`.\n\n---  \n\n## 5. Retrieval‑Augmented Generation (RAG) Support\n\nThe project now includes a **LangChain‑based RAG plugin** that enables semantic search over user‑provided documents. 
The\nimplementation lives in `llm_router_plugins/utils/rag/langchain_plugin.py` and is driven by the helper CLI scripts\nlocated in `scripts/`.\n\n### 5.1 What the plugin does\n\n| Feature           | Description                                                                                                                                                                                                                             |\n|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| **Indexing**      | Reads a directory of text‑like files (`.txt`, `.md`, `.html`, `.js`, …), splits them into token‑based windows, embeds each chunk with a configurable transformer model, and stores the vectors in a FAISS (or compatible) vector store. |\n| **Searching**     | Given a user query, retrieves the most similar chunks and injects them into the payload (e.g., appends to the last user message) so that downstream LLM calls can use the retrieved context.                                            |\n| **Configuration** | All parameters (collection name, embedder model, device, chunk size, overlap, persistence directory) are driven by environment variables prefixed with `LLM_ROUTER_`. See the table below for the full list.                            |\n| **CLI helpers**   | Two ready‑to‑use scripts: `scripts/llm-router-rag-langchain-index.sh` (indexes a repository) and `scripts/llm-router-rag-langchain-search.sh` (runs a search or starts an interactive REPL).                                            
|\n\n### 5.2 Environment variables\n\n| Variable                                 | Default                                                                        | Meaning                                                     |\n|------------------------------------------|--------------------------------------------------------------------------------|-------------------------------------------------------------|\n| `LLM_ROUTER_LANGCHAIN_RAG_COLLECTION`    | *must be set*                                                                  | Name of the FAISS collection (e.g. `sample_collection`).    |\n| `LLM_ROUTER_LANGCHAIN_RAG_EMBEDDER`      | `/mnt/data2/llms/models/community/google/embeddinggemma-300m`                  | Path or Hugging‑Face identifier of the embedding model.     |\n| `LLM_ROUTER_LANGCHAIN_RAG_DEVICE`        | `cuda:2`                                                                       | Torch device (`cpu`, `cuda:0`, …).                          |\n| `LLM_ROUTER_LANGCHAIN_RAG_CHUNK_SIZE`    | `1024`                                                                         | Number of tokens per chunk.                                 |\n| `LLM_ROUTER_LANGCHAIN_RAG_CHUNK_OVERLAP` | `100`                                                                          | Number of overlapping tokens between consecutive chunks.    |\n| `LLM_ROUTER_LANGCHAIN_RAG_PERSIST_DIR`   | `./workdir/plugins/utils/rag/langchain/${LLM_ROUTER_LANGCHAIN_RAG_COLLECTION}` | Directory where the FAISS index and docstore are persisted. 
|\n\n#### Example export block (add to your shell profile or a `.env` file)\n\n```shell script\nexport LLM_ROUTER_LANGCHAIN_RAG_COLLECTION=\"${LLM_ROUTER_LANGCHAIN_RAG_COLLECTION:-sample_collection}\"\nexport LLM_ROUTER_LANGCHAIN_RAG_EMBEDDER=\"${LLM_ROUTER_LANGCHAIN_RAG_EMBEDDER:-/mnt/data2/llms/models/community/google/embeddinggemma-300m}\"\nexport LLM_ROUTER_LANGCHAIN_RAG_DEVICE=\"${LLM_ROUTER_LANGCHAIN_RAG_DEVICE:-cuda:2}\"\nexport LLM_ROUTER_LANGCHAIN_RAG_CHUNK_SIZE=\"${LLM_ROUTER_LANGCHAIN_RAG_CHUNK_SIZE:-1024}\"\nexport LLM_ROUTER_LANGCHAIN_RAG_CHUNK_OVERLAP=\"${LLM_ROUTER_LANGCHAIN_RAG_CHUNK_OVERLAP:-100}\"\nexport LLM_ROUTER_LANGCHAIN_RAG_PERSIST_DIR=\"${LLM_ROUTER_LANGCHAIN_RAG_PERSIST_DIR:-./workdir/plugins/utils/rag/langchain/${LLM_ROUTER_LANGCHAIN_RAG_COLLECTION}}\"\n```\n\n### 5.3 Using the CLI scripts\n\n**Index a repository** (example for the documentation site):\n\n```shell script\nscripts/llm-router-rag-langchain-index.sh\n# Internally runs:\n# llm-router-rag-langchain index --path \"../.github/pages/llmrouter.cloud/\" --ext .html .js .md\n```\n\n**Search** (interactive REPL):\n\n```shell script\nscripts/llm-router-rag-langchain-search.sh\n# Internally runs:\n# llm-router-rag-langchain search\n# (you will be prompted for a query, type “exit” to quit)\n```\n\n**One‑shot search**:\n\n```shell script\nllm-router-rag-langchain search --query \"What is Retrieval‑Augmented Generation?\" --top_n 5\n```\n\nThe CLI returns the raw matching chunks together with similarity scores. 
The `LangchainRAGPlugin` automatically formats\nthe retrieved text and appends it to the user’s last message, prefixed with:\n\n```\nIf the context below will help answer the above question, use it.\nContext separated with double enter\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradlab-dev-group%2Fllm-router-plugins","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fradlab-dev-group%2Fllm-router-plugins","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fradlab-dev-group%2Fllm-router-plugins/lists"}