{"id":28213477,"url":"https://github.com/farithadnan/kb-answerscorer","last_synced_at":"2026-05-01T04:36:35.310Z","repository":{"id":293608038,"uuid":"980490008","full_name":"farithadnan/KB-AnswerScorer","owner":"farithadnan","description":"A tool for evaluating LLM responses against a knowledge base of expert solutions.","archived":false,"fork":false,"pushed_at":"2025-05-16T06:56:50.000Z","size":37,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-16T07:38:53.615Z","etag":null,"topics":["bert-score","bleu-score","f1-score","openwebui","python","rag"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/farithadnan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-09T08:05:30.000Z","updated_at":"2025-05-16T06:56:54.000Z","dependencies_parsed_at":"2025-05-16T10:00:14.991Z","dependency_job_id":null,"html_url":"https://github.com/farithadnan/KB-AnswerScorer","commit_stats":null,"previous_names":["farithadnan/kb-answerscorer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/farithadnan/KB-AnswerScorer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/farithadnan%2FKB-AnswerScorer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/farithadnan%2FKB-AnswerScorer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/farithadnan%2FKB-AnswerScorer/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/farithadnan%2FKB-AnswerScorer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/farithadnan","download_url":"https://codeload.github.com/farithadnan/KB-AnswerScorer/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/farithadnan%2FKB-AnswerScorer/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259308157,"owners_count":22837974,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert-score","bleu-score","f1-score","openwebui","python","rag"],"created_at":"2025-05-17T20:10:43.551Z","updated_at":"2026-05-01T04:36:30.276Z","avatar_url":"https://github.com/farithadnan.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KB-AnswerScorer\n\nA tool for evaluating LLM responses against a knowledge base of expert solutions.\n\n## Overview\n\nKB-AnswerScorer is a utility designed to evaluate how well large language models (LLMs) answer customer support questions compared to established expert solutions. 
The tool processes questions, obtains model responses via OpenWebUI, compares these responses to reference solutions using semantic and lexical similarity metrics, and generates comprehensive reports.\n\nThe comparison is based on:\n\n- BERTScore (semantic similarity)\n- F1 Score (lexical similarity)\n- BLEU Score (n-gram overlap metric originally developed for machine translation)\n\n## Project Structure\n\n```yaml\nKB-AnswerScorer/\n├── main.py                  # Main script\n├── Dockerfile               # Dockerfile\n├── docker-compose.yml       # Docker compose\n├── .env                     # Environment variables\n├── data/                    # Input data directory\n│   ├── samples              # Sample input files\n│   ├── questions.xlsx       # Customer questions\n│   └── solutions.xlsx       # Expert solutions\n├── metrics/                 # Scoring modules\n│   └──  metric_evaluator.py # Combined scoring metrics and matcher\n├── opwebui/                 # OpenWebUI integration\n│   └── api_client.py        # Client for OpenWebUI API\n└── utils/                   # Utility modules\n    ├── data_extractor.py    # Parses Excel input files\n    ├── query_enhancer.py    # Enhances pre-process and post-process prompts\n    └── evaluation_utils.py  # Quality assessment and reporting\n```\n\n## Installation (With Docker)\n\nBefore building and running the container for the first time, make sure you have already **configured** the `.env` file and the **input** Excel files (see the **Configuration** section for details):\n\n```bash\ndocker-compose build\n```\n\nTo run the container:\n\n```bash\ndocker-compose run --rm kb-scorer\n```\n\nIf you make changes to the code, run the `update-docker.ps1` PowerShell script to update your container:\n\n```bash\n.\\update-docker.ps1\n```\n\n## Installation (Without Docker)\n\nClone this repository:\n\n```bash\ngit clone https://github.com/farithadnan/KB-AnswerScorer.git\ncd KB-AnswerScorer\n```\n\nCreate a virtual environment:\n\n```bash\npython -m venv 
venv\n```\n\nActivate the virtual environment:\n\n```bash\n# cmd\npath\\to\\venv\\Scripts\\activate\n\n# powershell\n.\\path\\to\\venv\\Scripts\\Activate\n\n# bash\nsource path/to/venv/bin/activate\n```\n\n## Configuration\n\nCreate a `.env` file in the project root with the following variables:\n\n```yaml\n# Directory and file locations\nDATA_DIR_PATH=./data\nQUESTION_EXCEL=questions.xlsx\nSOLUTION_EXCEL=solutions.xlsx\nQUESTION_SHEET_NAME=Questions\n\n# OpenWebUI Configuration\nOPENWEBUI_API_URL=http://your-openwebui-instance:5000/api/chat\nOPENWEBUI_JWT_TOKEN=your_jwt_token_here\n```\n\n### Getting the OpenWebUI JWT Token\n\n\u003e See the [official guide](https://docs.openwebui.com/getting-started/api-endpoints#authentication) for more details on authentication.\n\n1. Log in to your OpenWebUI instance\n2. Open the browser developer tools (F12)\n3. Go to the Application tab\n4. Look for \"Local Storage\" and find the entry for your OpenWebUI domain\n5. Copy the JWT token value (it starts with \"ey...\")\n\n\n### Input Files\n\nThe tool expects two Excel files:\n\n1. **Question file**: Contains customer queries/issues.\n    - **Required structure:**\n        - Excel file with customer questions (header starts at row 3)\n        - Main question text in column B\n        - Solutions used in column C (optional)\n        - AI Solutions used in column D (optional)\n        - Each row represents a unique customer issue\n    - **Parsed into:**\n        - `id`: Automatically assigned based on row number (starting from 1)\n        - `issue`: The customer question text from column B\n        - `solutions_used`: List of solution indices that can be manually set later\n        - `ai_solutions_used`: List of AI solution indices from column D\n2. 
**Solutions file**: Contains expert solutions.\n    - **Required structure:**\n        - Excel file with solutions (header in row 1)\n        - Column A: Solution text with title and steps\n        - Column B: Optional error messages\n    - **Parsed into:**\n        - `id`: Automatically assigned based on row number (starting from 1)\n        - `title`: Extracted from the first line of the solution text (e.g., \"Solutions 1\")\n        - `steps`: Each line after the title becomes a step in the solution\n        - `error_message`: Any text in column B becomes the error message\n\nSee `data/samples` for sample versions of both files.\n\n## Usage\n\nRun the script with command-line options to control its behavior:\n\n```bash\npython main.py [options]\n```\n\nCommand-line options:\n\n| Option                       | Description                                          | Default           |\n|------------------------------|------------------------------------------------------|-------------------|\n| `--limit N`, `-l`            | Process only the first N questions                  | 0 (all questions) |\n| `--question-id ID`, `--id`   | Process only a specific question by ID              | None              |\n| `--verbose`, `-v`            | Display detailed logs and results                   | False             |\n| `--report-dir DIR`, `--rd`   | Directory to save reports                           | \"reports\"         |\n| `--wait-time SEC`, `--wt`    | Wait time between API calls in seconds              | 1.0               |\n| `--skip-report`, `--sr`      | Skip report generation                              | False             |\n| `--export-excel`, `--ee`     | Export evaluation report to Excel                   | False             |\n| `--pre-process`, `--pre`     | Enhance queries before sending to model             | False             |\n| `--post-process`, `--post`   | Generate improved prompts for low-quality responses | False            
 |\n| `--retry`, `-r`              | Retry with improved prompts when quality is low     | False             |\n| `--bert-threshold T`, `--bt` | BERT score threshold for quality assessment         | 0.5               |\n| `--f1-threshold T`, `--f1`   | F1 score threshold for quality assessment           | 0.3               |\n| `--bleu-threshold T`, `--bl` | BLEU score threshold for quality assessment         | 0.1               |\n| `--combined-threshold T`, `--ct` | Combined score threshold for quality assessment | 0.4               |\n\n\n### Example\n\nProcess all questions:\n\n```bash\npython main.py\n```\n\nProcess only the first 5 questions:\n\n```bash\npython main.py --limit 5\n```\n\nProcess only a specific question by ID:\n\n```bash\npython main.py --question-id 2\n```\n\nGenerate detailed output for each question:\n\n```bash\npython main.py --verbose\n```\n\nSave reports to a custom directory:\n\n```bash\npython main.py --report-dir my_reports\n```\n\nAdjust wait time between API calls:\n\n```bash\npython main.py --wait-time 2.0\n```\n\nWith Docker, the options are the same as in the previous examples; prefix the command with `docker-compose run --rm kb-scorer`:\n\n```bash\ndocker-compose run --rm kb-scorer python main.py --limit 1\n```\n\n## Troubleshooting\n\n### No response from OpenWebUI\n\n- Verify your JWT token is valid and not expired\n- Check that the OpenWebUI API URL is correct\n- Ensure OpenWebUI is running and accessible\n\n### Missing files error\n\n- Verify that the data directory and Excel files exist\n- Check the paths in your .env file\n\n### Low scores across all questions\n\n- The model may not be suitable for your domain\n- Consider adjusting the quality thresholds\n- Review your reference solutions for clarity\n\n### Environment Variable Issues\n\nIf you update your `.env` file and changes aren't detected:\n\n- Make sure to use `#` for comments\n- Restart your terminal/command prompt\n\n\n### Excel Format Issues\n\nIf you're getting parsing errors:\n\n- 
Check that the header row is set correctly (defaults to row 3)\n- Verify column mappings in the DataExtractor configuration\n- For solutions, ensure they follow the \"Solution X\" format with numbered steps\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](./LICENSE) file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffarithadnan%2Fkb-answerscorer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffarithadnan%2Fkb-answerscorer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffarithadnan%2Fkb-answerscorer/lists"}