{"id":28255180,"url":"https://github.com/analyticsinmotion/werx","last_synced_at":"2025-06-16T06:31:23.761Z","repository":{"id":291581395,"uuid":"978059108","full_name":"analyticsinmotion/werx","owner":"analyticsinmotion","description":"🐍📦 Easy-to-use Python package for lightning-fast Word Error Rate analysis","archived":false,"fork":false,"pushed_at":"2025-05-19T01:20:59.000Z","size":232,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-25T12:50:07.513Z","etag":null,"topics":["asr","automatic-speech-recognition","levenshtein-distance","metrics","speech-to-text","stt","wer","werx","word-error-rate","word-error-rate-calculator"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/analyticsinmotion.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-05T12:08:26.000Z","updated_at":"2025-05-19T01:21:02.000Z","dependencies_parsed_at":"2025-05-05T13:52:09.793Z","dependency_job_id":"f4b2eff0-33ac-4d32-b717-2ec81ca287d8","html_url":"https://github.com/analyticsinmotion/werx","commit_stats":null,"previous_names":["analyticsinmotion/werx"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/analyticsinmotion/werx","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/analyticsinmotion%2Fwerx","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/analyticsinmotion%2Fwerx/tags","releases_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/analyticsinmotion%2Fwerx/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/analyticsinmotion%2Fwerx/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/analyticsinmotion","download_url":"https://codeload.github.com/analyticsinmotion/werx/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/analyticsinmotion%2Fwerx/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260114268,"owners_count":22960865,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","automatic-speech-recognition","levenshtein-distance","metrics","speech-to-text","stt","wer","werx","word-error-rate","word-error-rate-calculator"],"created_at":"2025-05-19T21:13:05.953Z","updated_at":"2025-06-16T06:31:23.726Z","avatar_url":"https://github.com/analyticsinmotion.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![logo-werx](https://github.com/user-attachments/assets/26701780-4809-433d-9920-38c221bd016b)\n\n\u003ch1 align=\"center\"\u003e⚡Lightning fast Word Error Rate Calculations\u003c/h1\u003e\n\n\n\u003c!-- badges: start --\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003ctable\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e\u003cstrong\u003eMeta\u003c/strong\u003e\u003c/td\u003e\n      \u003ctd\u003e\n        \u003ca href=\"https://pypi.org/project/werx/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/werx?label=PyPI\u0026color=blue\"\u003e\u003c/a\u003e\u0026nbsp;\n        
\u003ca href=\"https://www.python.org/downloads/\"\u003e\u003cimg src=\"https://img.shields.io/badge/python-3.10%7C3.11%7C3.12%7C3.13-blue?logo=python\u0026logoColor=ffdd54\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://github.com/analyticsinmotion/werx/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-Apache_2.0-blue.svg\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://github.com/astral-sh/uv\"\u003e\u003cimg src=\"https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json\" alt=\"uv\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://github.com/astral-sh/ruff\"\u003e\u003cimg src=\"https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json\" alt=\"Ruff\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://www.rust-lang.org\"\u003e\u003cimg src=\"https://img.shields.io/badge/Powered%20by-Rust-black?logo=rust\u0026logoColor=white\" alt=\"Powered by Rust\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://github.com/analyticsinmotion\"\u003e\u003cimg src=\"https://raw.githubusercontent.com/analyticsinmotion/.github/main/assets/images/analytics-in-motion-github-badge-rounded.svg\" alt=\"Analytics in Motion\"\u003e\u003c/a\u003e\n        \u003c!-- \u0026nbsp;\n        \u003ca href=\"https://pypi.org/project/werx/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/dm/werx?label=PyPI%20downloads\"\u003e\u003c/a\u003e\u0026nbsp;\n        \u003ca href=\"https://pepy.tech/project/werx\"\u003e\u003cimg src=\"https://static.pepy.tech/badge/werx\"\u003e\u003c/a\u003e\n        --\u003e\n      \u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/table\u003e\n\u003c/div\u003e\n\n\u003c!-- badges: end --\u003e\n\n\n## What is WERx?\n\n**WERx** is a high-performance Python package for calculating Word Error Rate (WER), built with Rust for unmatched speed, memory efficiency, and stability. 
WERx delivers accurate results with exceptional performance, making it ideal for large-scale evaluation tasks.\n\n\u003cbr/\u003e\n\n## 🚀 Why Use WERx?\n\n⚡ **Blazing Fast:** Rust-powered core delivers outstanding performance, optimized for large datasets\u003cbr\u003e\n\n🧩 **Robust:** Designed to handle edge cases gracefully, including empty strings and mismatched sequences\u003cbr\u003e\n\n📐 **Insightful:** Provides rich word-level error breakdowns, including substitutions, insertions, deletions, and weighted error rates\u003cbr\u003e\n\n🛡️ **Production-Ready:** Minimal dependencies, memory-efficient, and engineered for stability\u003cbr\u003e \n\n\u003cbr/\u003e\n\n## ⚙️ Installation\n\nYou can install WERx either with 'uv' or 'pip'.\n\n### Using uv (recommended):\n```bash\nuv pip install werx\n```\n\n### Using pip:\n```bash\npip install werx\n```\n\n\u003cbr/\u003e\n\n## ✨ Usage\n**Import the WERx package**\n\n*Python Code:*\n```python\nimport werx\n```\n\n### Examples:\n\n### 1. Single sentence comparison\n\n*Python Code:*\n```python\nwer = werx.wer('i love cold pizza', 'i love pizza')\nprint(wer)\n```\n\n*Results Output:*\n```\n0.25\n```\n\n\u003cbr/\u003e\n\n### 2. Corpus level Word Error Rate Calculation\n\n*Python Code:*\n```python\nref = ['i love cold pizza','the sugar bear character was popular']\nhyp = ['i love pizza','the sugar bare character was popular']\nwer = werx.wer(ref, hyp)\nprint(wer)\n```\n\n*Results Output:*\n```\n0.2\n```\n\n\u003cbr/\u003e\n\n### 3. 
Weighted Word Error Rate Calculation\n\n*Python Code:*\n```python\nref = ['i love cold pizza', 'the sugar bear character was popular']\nhyp = ['i love pizza', 'the sugar bare character was popular']\n\n# Apply lower weight to insertions and deletions, standard weight for substitutions\nwer = werx.weighted_wer(\n    ref,\n    hyp,\n    insertion_weight=0.5,\n    deletion_weight=0.5,\n    substitution_weight=1.0\n)\nprint(wer)\n```\n\n*Results Output:*\n```\n0.15\n```\n\n\u003cbr/\u003e\n\n### 4. Complete Word Error Rate Breakdown\n\nThe `analysis()` function provides a complete breakdown of word error rates, supporting both standard WER and weighted WER calculations.\n\nIt delivers detailed, per-sentence metrics, including insertions, deletions, substitutions, and word-level error tracking, with the flexibility to customize error weights.\n\nResults are accessible through standard Python objects or can be converted into Pandas and Polars DataFrames for further analysis and reporting.\n\n\n#### 4a. Getting Started\n\n*Python Code:*\n```python\nref = [\"the quick brown fox\"]\nhyp = [\"the quick brown dog\"]\n\nresults = werx.analysis(ref, hyp)\n\nprint(\"Inserted:\", results[0].inserted_words)\nprint(\"Deleted:\", results[0].deleted_words)\nprint(\"Substituted:\", results[0].substituted_words)\n```\n\n*Results Output:*\n```\nInserted: []\nDeleted: []\nSubstituted: [('fox', 'dog')]\n```\n\n\u003cbr/\u003e\n\n#### 4b. 
Converting Analysis Results to a DataFrame\n\n*Note:* To use these helpers, you must have either `pandas` or `polars` (or both) installed.\n\n*Install Pandas / Polars for DataFrame Conversion*\n```bash\nuv pip install pandas\nuv pip install polars\n```\n\n*Python Code:*\n```python\nref = [\"i love cold pizza\", \"the sugar bear character was popular\"]\nhyp = [\"i love pizza\", \"the sugar bare character was popular\"]\nresults = werx.analysis(\n    ref, hyp,\n    insertion_weight=2,\n    deletion_weight=2,\n    substitution_weight=1\n)\n```\nWe’ve created a special utility to make working with DataFrames seamless.\nJust import the following helpers:\n\n```python\nimport werx\nfrom werx.utils import to_polars, to_pandas\n```\n\nYou can then convert the analysis results to a **Polars** DataFrame:\n```python\n# Convert to Polars DataFrame\ndf_polars = to_polars(results)\nprint(df_polars)\n```\n\nAlternatively, you can use **Pandas** if you prefer:\n```python\n# Convert to Pandas DataFrame\ndf_pandas = to_pandas(results)\nprint(df_pandas)\n```\n\n*Results Output:*\n\n| wer    | wwer   | ld  | n_ref | insertions | deletions | substitutions | inserted_words | deleted_words | substituted_words   |\n|--------|--------|-----|-------|------------|-----------|---------------|----------------|----------------|---------------------|\n| 0.25   | 0.50   | 1   | 4     | 0          | 1         | 0             | []             | ['cold']       | []                  |\n| 0.1667 | 0.1667 | 1   | 6     | 0          | 0         | 1             | []             | []             | [('bear', 'bare')]   |\n\n\n\u003cbr/\u003e\n\n## 📄 License\n\nThis project is licensed under the Apache License 
2.0.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanalyticsinmotion%2Fwerx","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanalyticsinmotion%2Fwerx","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanalyticsinmotion%2Fwerx/lists"}