{"id":27118090,"url":"https://github.com/devtobi/aigelb","last_synced_at":"2026-02-01T16:03:46.693Z","repository":{"id":285422975,"uuid":"939589986","full_name":"devtobi/aigelb","owner":"devtobi","description":"Evaluation of LLMs and browser implementation for browsing the Web in Easy Language (\"Leichte Sprache\")","archived":false,"fork":false,"pushed_at":"2026-01-26T18:13:03.000Z","size":2629,"stargazers_count":0,"open_issues_count":12,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-26T20:44:20.747Z","etag":null,"topics":["browser-extension","easylanguage","evaluation","leichte-sprache","llm","web"],"latest_commit_sha":null,"homepage":"","language":"Vue","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/devtobi.png","metadata":{"files":{"readme":"docs/README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-26T19:33:32.000Z","updated_at":"2026-01-26T18:11:20.000Z","dependencies_parsed_at":"2025-03-31T17:35:18.884Z","dependency_job_id":"896d2ede-220c-4f5d-a7af-5844ad01837c","html_url":"https://github.com/devtobi/aigelb","commit_stats":null,"previous_names":["devtobi/aigelb"],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/devtobi/aigelb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devtobi%2Faigelb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devtobi%2Faigelb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devt
obi%2Faigelb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devtobi%2Faigelb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/devtobi","download_url":"https://codeload.github.com/devtobi/aigelb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/devtobi%2Faigelb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28981893,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-01T15:35:50.179Z","status":"ssl_error","status_checked_at":"2026-02-01T15:35:38.075Z","response_time":56,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["browser-extension","easylanguage","evaluation","leichte-sprache","llm","web"],"created_at":"2025-04-07T06:50:09.387Z","updated_at":"2026-02-01T16:03:46.687Z","avatar_url":"https://github.com/devtobi.png","language":"Vue","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Last commit][commit-shield]][commit-url]\n[![License][license-shield]][license-url]\n\n\u003c!-- PROJECT LOGO --\u003e\n\u003cbr /\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/devtobi/aigelb\"\u003e\n    \u003cimg src=\"./assets/logo.png\" alt=\"AIGELB logo\" width=\"128\" height=\"129\"\u003e\n  \u003c/a\u003e\n\n  \u003ch3 
align=\"center\"\u003eAIGELB\u003c/h3\u003e\n\n  \u003cp align=\"center\"\u003e\n    \u003cb\u003eAI\u003c/b\u003e \u003cb\u003eG\u003c/b\u003eerman \u003cb\u003eE\u003c/b\u003easy \u003cb\u003eL\u003c/b\u003eanguage \u003cb\u003eB\u003c/b\u003erowsing\n  \u003c/p\u003e\n\u003c/p\u003e\n\n\u003c!-- TABLE OF CONTENTS --\u003e\n## Table of Contents\n\n* [About the Project](#about-the-project)\n  * [Built With](#built-with)\n* Browser Extension\n  * [Installation](#installation)\n  * [Usage](#usage)\n* Evaluation\n  * [Installation](#installation-1)\n  * [Usage](#usage-1)\n* [Authors](#authors)\n* [License](#license)\n* [Citation](#citation)\n\n\u003c!-- ABOUT THE PROJECT --\u003e\n## About The Project\n\nThis project was created as part of my [master thesis](#citation)\nin Computer Science at the [Munich University of Applied Sciences](https://hm.edu/en/).\nIt contains two different parts which are as follows:\n\n* `browser-extension`: Implementation of a browser extension using local LLMs\nto translate web content into German \"Easy Language\", also known as \"Leichte Sprache\".\n* `evaluation`: Python-based scripts to check the suitability\nof different LLMs in regard to the use case \"Easy Language\" in German.\n\n### Built With\n\n#### Evaluation\n\n* Programming Language: [Python](https://www.python.org)\n* Package Management: [uv](https://docs.astral.sh/uv/)\n* Model Management: [HuggingFace Hub](https://huggingface.co)\n* LLM Inference: [llama-cpp-python](https://llama-cpp-python.readthedocs.io)\n* Evaluation Metrics:\n  * Machine Translation: [HuggingFace Evaluate](https://huggingface.co/docs/evaluate/index)\n  * Text Readability: [TextStat](https://textstat.org)\n  * Lexical Diversity: [LexicalRichness](https://lexicalrichness.readthedocs.io/en/latest/)\n\n#### Browser extension\n\n* Programming language: [TypeScript](https://www.typescriptlang.org)\n* Package Management: [Bun](https://bun.sh)\n* JavaScript Framework: [Vue](https://vuejs.org)\n* Component 
Framework: [Vuetify](https://vuetifyjs.com)\n* Web Extension Framework: [WXT](https://wxt.dev)\n* Web Extension Messaging: [webext-core/messaging](https://webext-core.aklinker1.io/messaging/installation)\n* Model Metadata: [HuggingFace Hub](https://huggingface.co/docs/huggingface.js/hub/README)\n* Model Management: [ollama-js](https://github.com/ollama/ollama-js)\n* LLM Inference: [AI SDK](https://ai-sdk.dev)\n* DOM Parsing: [cheerio](https://cheerio.js.org)\n\n## Browser Extension\n\n### Installation\n\nThe latest version of the extension can be downloaded on the releases page.\nThe extension is available for Chromium-based browsers, Firefox and Safari.\n\n#### Building manually\n\nIf you want to build the extension manually from source,\nyou need to have [Bun](https://bun.sh) installed on your system.\n\nTo build the extension, run the following commands in the `browser-extension` directory:\n1. Install dependencies: `bun install`\n2. Build for Chrome/Chromium: `bun run build:chrome` or\n3. 
Build for Firefox: `bun run zip:firefox`\n\nThe extension will be built in the `browser-extension/dist` directory.\n\n#### Loading\n\nThe extension is currently unsigned (and not distributed via dedicated extension stores).\nYou therefore need to load it manually, following the instructions for your specific browser:\n- In Chrome (or other Chromium-based browsers), turn on `Developer mode`, choose `Load unpacked extension...` and select the `chrome-mv3` folder.\n- In Firefox, open the URL `about:debugging`, choose `Load Temporary Add-on...` and select the `aigelb-browser-extension-*-firefox.zip` file.\n\nAfter installation the instructions page will be displayed.\n\n### Usage\n\nTo use the extension, you need to have [Ollama](https://ollama.com) installed on your system.\n\nFurther usage instructions can be found directly on the instructions page of the extension.\n\n## Evaluation\n\n### Installation\n\nRunning the Python scripts requires a modern version of\n\n* Python as programming language and\n* uv as dependency management tool\n\ninstalled on your system.\nPlease check out the [Python documentation](https://www.python.org/downloads)\nand [uv documentation](https://docs.astral.sh/uv/getting-started/installation/)\nfor installation instructions.\n\nThe exact compatible version of Python\ncan be found in the `pyproject.toml` file inside the `evaluation` directory.\n\nWhen the requirements above are met,\nyou only need to execute `uv sync` inside the `evaluation` directory\nto set up the virtual environment and download the required packages.\n\n**Note:** To make inference and hardware acceleration work on your machine, you might have to take additional steps to use the proper backend for your architecture and platform in `llama-cpp-python`. You can pass required environment variables like `CMAKE_ARGS` directly to `uv sync`. E.g. 
for installing on Apple silicon using Metal acceleration execute `CMAKE_ARGS=\"-DGGML_METAL=on\" uv sync`.\nSee [official documentation](https://llama-cpp-python.readthedocs.io/en/latest/#supported-backends)\nfor further information and up-to-date instructions.\n\n### Usage\n\nAll mentioned scripts can be run via uv\nusing the following command: `uv run \u003cscript-path\u003e.py`\n\nThe files inside the `config` directory\nallow further customization of the behaviour\nand will be further explained in the sections below.\n\n### 1. Downloading models\n\n#### Configuration\n\nYou can define which models you want to download\ninside the `models.csv` file in the `evaluation` directory.\nYou can only use [GGUF](https://huggingface.co/docs/hub/gguf)-based models from the [HuggingFace](https://huggingface.co) platform.\n\nThe file has the following columns:\n\n* `_repo_id`: repository name of the model (e.g. [bartowski/Llama-3.2-3B-Instruct-GGUF](https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF)).\n* `_gguf_filename`: filename to select the variant of the model for different quantizations (e.g. [Llama-3.2-3B-Instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/blob/main/Llama-3.2-3B-Instruct-Q5_K_M.gguf))\n* `_gated` (optional, default `False`): `True` or `False` whether the model is gated\n(e.g. 
when a license agreement consent on HuggingFace\nplatform is necessary for your account).\n* `_context_length` (optional): an `int` giving the individual context size for a specific model. If not set, the global variable `MAX_CONTEXT_LENGTH` will be used (see the configuration for [running inference](#3-running-inference)).\n\nRelevant environment variables for the `config.env` file are the following:\n\n* `HF_TOKEN` (optional): HuggingFace token for your account\nto fetch gated models you have access to on the platform.\nSee [HuggingFace documentation](https://huggingface.co/docs/hub/security-tokens)\nfor further information.\n* `HF_HOME` (optional): Custom directory to store cache files and models downloaded for evaluation. If not set, the default directory `~/.cache/huggingface` will be used.\n\n#### Execution\n\nTo download the models you selected for evaluation,\nyou need to run the download script\nusing `uv run src/01_download_models.py`\nwhen you are inside the `evaluation` directory.\n\nThe script will read the content of the `models.csv` file\nand ask you to confirm the download before starting.\n\nThe downloaded models will be stored in the configured cache directory for later use.\n\n**Tip:** If you interrupt the model downloads by quitting the script execution,\nthe script will automatically resume the downloads where they stopped.\n\n#### Clean up\n\nWhen you experiment with different models\nyour cache folder might fill up quickly\nand unused models unnecessarily take up storage space.\n\nYou can run the cleanup script using `uv run python src/cleanup.py`\nto get rid of all the models in your cache directory.\n\n**Warning:** If you did not set a custom cache directory, this will remove all the models you ever downloaded from the HuggingFace platform, even from other projects.\n\n### 2. 
Preparing data\n\n#### Configuration\n\nRelevant environment variables for the `config.env` file are the following:\n\n* `SOURCES_COLUMN_NAME` (optional): Name of the column in the `.csv` file to use as sources. If not set, will default to `source`.\n* `REFERENCES_COLUMN_NAME` (optional): Name of the column in the `.csv` file to use as references. If not set, will default to `reference`.\n* `COLUMN_SEPARATOR` (optional): Column separator character used in the data source `.csv` file. If not set, will expect the file to use `,` as a separator.\n* `DOWNLOAD_URL` (optional): Download URL for the `.csv` file to use as data source\n\n**Note**: If the variable `DOWNLOAD_URL` is not set, the script will try to load the data from an existing file at `data/data.csv`.\n\n#### Execution\n\nTo run the data preparation,\nyou need to run the prepare script\nusing `uv run src/02_prepare_data.py`\nwhen you are inside the `evaluation` directory.\n\nThe script will optionally download the configured `.csv` file and save it to `data/data.csv`. It then processes this file using the configured column names to extract the source and reference columns. The content of those columns will be saved to `data/sources.csv` and `data/references.csv`.\n\n**Warning**: No automatic data cleaning is performed, so the evaluation highly depends on the quality and correct sentence alignment of the data!\n\n### 3. Running inference\n\n#### Configuration\n\n##### Source data\n\nThe source data can be manually configured in the `data/sources.csv` file (if not automatically created via [Preparing data](#2-preparing-data)). Each row in that file is a sentence that is passed to the LLM in the configured user prompt.\n\n**Important**: The entries must be quoted using double quotes so that `,` inside the source sentences is not interpreted as a column separator.\n\n##### System prompt\n\nThe system prompt can be configured inside the `config/system_prompt.txt` file. 
Usually the role of the LLM as well as instructions are defined here. One can also include examples to guide the LLM using in-context learning.\n\n##### User prompt\n\nThe user prompt can be configured in the `config/user_prompt.txt` file. It contains the specific task at hand (e.g. translating a specific sentence into plain language).\n\n**Important**: The user prompt must contain `{source}` to insert the specific source sentence into the user prompt at LLM inference time.\n\n##### Inference\n\nRelevant environment variables for the `config.env` file are the following:\n\n* `USE_CPU` (optional): `True` or `False` whether CPU or GPU should be used for LLM inference. If not set, will use GPU.\n* `NUM_THREADS` (optional): Number of threads to use when running CPU inference. If not set, will be automatically inferred based on system capabilities.\n* `MAX_CONTEXT_LENGTH` (optional): Maximum context length to use for inference. Decreasing it can speed up inference, but it needs to be big enough for the prompt tokens to fit. If not set, will infer the context length from the `models.csv`. If not set there, will try to infer from the metadata of the given model.\n* `STRUCTURED_OUTPUT_KEY` (optional): Key of the JSON object to expect from LLM generation, used to improve LLM generation via Structured Output; not part of the final result. If not set, `result` will be used as key.\n* `TEMPERATURE` (optional): Temperature to use for model inference for controlling creativity. 
If not set `0.2` will be used.\n\n#### Execution\n\nTo run the LLM inference,\nyou need to run the inference script\nusing `uv run src/03_run_inference.py`\nwhen you are inside the `evaluation` directory.\n\nThe script will read the content of `sources.csv`, `system_prompt.txt`, `user_prompt.txt` and `models.csv` and ask for confirmation before starting inference.\n\nThe script will sequentially load the configured models and use each configured source sentence in an isolated inference execution.\nThe results are stored in the `predictions` folder inside a directory named by the timestamp of generation start. Inside will be a `.csv` file for each used model.\n\n**Tip**: Depending on the number of models, the number of configured sentences and the capabilities of the system, this task can take from a few minutes to a couple of days.\nThus a lockfile mechanism has been implemented that allows for interrupting and later on resuming the inference task. A lockfile named `timestamp.lock` will be placed in the `predictions` folder in this case.\n\n### 4. Calculating metrics\n\n#### Configuration\n\n##### Metrics\n\nYou can define which metrics you want to evaluate using the `metrics.csv` file.\nThe file has the following columns:\n\n* `_name`: name of the metric to calculate, can be any method of the integrated libraries ([HuggingFace Evaluate](https://huggingface.co/docs/evaluate/index), [TextStat](https://textstat.org) or [LexicalRichness](https://lexicalrichness.readthedocs.io/en/latest/))\n* `_kwargs` (optional): Passes additional arguments as a Python dictionary to the metric function (check the official docs of the specified metric for more information), must be in the form `\"{'parameter': value}\"`\n\n**Note**: A special argument in the dictionary is `target`. Because some metrics calculate the results as a dictionary, the `target` argument is required to specify which value of the dictionary to extract. 
Please check the library documentation for information about method outputs.\n\nExamples:\n* `wiener_sachtextformel,\"{'variant': 1}\"` calculates `wiener_sachtextformel` from TextStat using `variant: 1`\n* `ttr` calculates `ttr` from LexicalRichness without any additional configuration\n* `bertscore,\"{'lang': 'de', 'target': 'f1'}\"` calculates `bertscore` from HuggingFace Evaluate using `lang: 'de'` and extracting `f1` from the calculated output dictionary\n\n**Note**: Setting additional arguments is often required for specific metrics, as otherwise no calculation is possible. Check the documentation of the libraries.\n\n**Important**: When using metrics from the [HuggingFace Evaluate](https://huggingface.co/docs/evaluate/index) library, additional packages are often necessary, e.g. to use `bertscore` the package `bert-score` must be installed.\nThis can be done via `uv pip install \u003cpackage-name\u003e`.\n\n##### References\n\nMetrics from the Machine Translation field require a (gold standard) reference to compare to in order to be calculated. The references can manually be configured in the `references.csv` file (if not automatically created via [Preparing data](#2-preparing-data)). 
Each row in that file is a sentence that is compared to the generated sentence in the model-specific file of the `predictions` directory.\n\n**Important**: If a sentence contains special characters or commas, it needs to be double-quoted, as otherwise the commas will be interpreted as column separators.\n\n#### Execution\n\nTo calculate the metrics you selected,\nyou need to run the calculation script\nusing `uv run src/04_calculate_metrics.py`\nwhen you are inside the `evaluation` directory.\n\nThe script will read the content of the `models.csv` and `metrics.csv` files\nand ask you to confirm the configured models and metrics to use for calculation.\n\nThe predictions used for calculation will always be taken from the latest folder inside the `predictions` directory.\nThe results will be stored in the `results` directory, inside a folder named after the generation timestamp, as a `.csv` file named after the timestamp of metric calculation (`results/\u003ctimestamp-generation\u003e/\u003ctimestamp-calculation\u003e.csv`). The result file contains:\n1. Results based on reference-free metrics for the input data\n2. Results based on reference-free metrics for the reference data\n3. Results based on all metrics for the data generated by each model\n\n\u003c!-- AUTHORS --\u003e\n## Authors\n\n* **Tobias Stadler** - [devtobi](https://github.com/devtobi)\n\n\u003c!-- LICENSE --\u003e\n## License\n\nDistributed under the MIT License. 
See [LICENSE][license-url] for more information.\n\n## Citation\n\nIf you reuse my work please cite my thesis as follows:\n\n```bibtex\n```\n\nIf you are interested in reading the thesis you can find it at [ADD TITLE](https://github.com/devtobi).\n\n[license-shield]: https://img.shields.io/github/license/devtobi/aigelb.svg?style=for-the-badge\u0026logo=github\n[license-url]: https://github.com/devtobi/aigelb/blob/main/LICENSE\n\n[commit-shield]: https://img.shields.io/github/last-commit/devtobi/cv?style=for-the-badge\u0026logo=github\n[commit-url]: https://github.com/devtobi/cv/commit/main\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevtobi%2Faigelb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdevtobi%2Faigelb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdevtobi%2Faigelb/lists"}