https://github.com/analyticsinmotion/werx

🐍📦 Easy-to-use Python package for lightning-fast Word Error Rate analysis
asr automatic-speech-recognition levenshtein-distance metrics speech-to-text stt wer werx word-error-rate word-error-rate-calculator

Host: GitHub
URL: https://github.com/analyticsinmotion/werx
Owner: analyticsinmotion
License: apache-2.0
Created: 2025-05-05T12:08:26.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-05-19T01:20:59.000Z (5 months ago)
Last Synced: 2025-05-25T12:50:07.513Z (5 months ago)
Topics: asr, automatic-speech-recognition, levenshtein-distance, metrics, speech-to-text, stt, wer, werx, word-error-rate, word-error-rate-calculator
Language: Python
Homepage:
Size: 227 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

README

![logo-werx](https://github.com/user-attachments/assets/26701780-4809-433d-9920-38c221bd016b)

⚡Lightning fast Word Error Rate Calculations




  

    




## What is WERx?

**WERx** is a high-performance Python package for calculating Word Error Rate (WER), built with Rust for unmatched speed, memory efficiency, and stability. WERx delivers accurate results with exceptional performance, making it ideal for large-scale evaluation tasks.




## 🚀 Why Use WERx?

⚡ **Blazing Fast:** Rust-powered core delivers outstanding performance, optimized for large datasets


🧩 **Robust:** Designed to handle edge cases gracefully, including empty strings and mismatched sequences


📐 **Insightful:** Provides rich word-level error breakdowns, including substitutions, insertions, deletions, and weighted error rates


🛡️ **Production-Ready:** Minimal dependencies, memory-efficient, and engineered for stability
 




## ⚙️ Installation

You can install WERx with either `uv` or `pip`.

### Using uv (recommended):

```bash
uv pip install werx
```

### Using pip:

```bash
pip install werx
```




## ✨ Usage

**Import the WERx package**

*Python Code:*

```python
import werx
```

### Examples:

### 1. Single sentence comparison

*Python Code:*

```python
wer = werx.wer('i love cold pizza', 'i love pizza')
print(wer)
```

*Results Output:*

```
0.25
```
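
For readers who want to see where the `0.25` comes from, here is a minimal pure-Python sketch of the standard WER definition: the word-level Levenshtein distance divided by the number of reference words. It is a reference illustration only, not WERx's Rust implementation. The function name `word_error_rate` and the choice to raise on an empty reference are assumptions made for this sketch.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Reference sketch: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        # Assumption for this sketch only; WERx may handle this case differently.
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j - 1] + cost,  # match or substitution
                            prev[j] + 1,         # deletion
                            curr[j - 1] + 1))    # insertion
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate('i love cold pizza', 'i love pizza'))  # 0.25
```

One deleted word ("cold") out of four reference words gives 1 / 4 = 0.25, matching the output above.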




### 2. Corpus-level Word Error Rate Calculation

*Python Code:*

```python
ref = ['i love cold pizza', 'the sugar bear character was popular']
hyp = ['i love pizza', 'the sugar bare character was popular']
wer = werx.wer(ref, hyp)
print(wer)
```

*Results Output:*

```
0.2
```
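
The `0.2` result is consistent with pooling errors across the corpus (total edits divided by total reference words) rather than averaging the per-sentence rates. The sketch below shows the difference using the edit counts and reference lengths of the two sentence pairs above. The helper names are hypothetical, and the pooling behaviour is inferred from the output rather than taken from WERx's source.

```python
# (edit_count, reference_word_count) for each sentence pair in the example
pairs = [(1, 4), (1, 6)]


def pooled_wer(pairs):
    """Corpus-level WER: total edits over total reference words."""
    total_edits = sum(edits for edits, _ in pairs)
    total_words = sum(n_ref for _, n_ref in pairs)
    return total_edits / total_words


def mean_sentence_wer(pairs):
    """Unweighted mean of per-sentence WERs, shown for contrast."""
    return sum(edits / n_ref for edits, n_ref in pairs) / len(pairs)


print(pooled_wer(pairs))         # 0.2
print(mean_sentence_wer(pairs))  # about 0.2083
```

Pooling weights longer sentences more heavily, which is why the two figures differ.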




### 3. Weighted Word Error Rate Calculation

*Python Code:*

```python
ref = ['i love cold pizza', 'the sugar bear character was popular']
hyp = ['i love pizza', 'the sugar bare character was popular']

# Apply lower weight to insertions and deletions, standard weight for substitutions
wer = werx.weighted_wer(
    ref,
    hyp,
    insertion_weight=0.5,
    deletion_weight=0.5,
    substitution_weight=1.0
)
print(wer)
```

*Results Output:*

```
0.15
```
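
The arithmetic behind `0.15` can be checked by hand: the example contains one deletion ("cold") and one substitution ("bear" to "bare") across ten reference words. The sketch below assumes the weighted rate is the weighted sum of error counts divided by the reference word count, which agrees with the outputs in this README. The function `weighted_wer_from_counts` is a hypothetical helper, not part of the WERx API.

```python
def weighted_wer_from_counts(insertions, deletions, substitutions, n_ref,
                             insertion_weight=1.0, deletion_weight=1.0,
                             substitution_weight=1.0):
    """Weighted sum of error counts divided by the number of reference words."""
    weighted_errors = (insertion_weight * insertions
                       + deletion_weight * deletions
                       + substitution_weight * substitutions)
    return weighted_errors / n_ref


# 0 insertions, 1 deletion, 1 substitution over 4 + 6 = 10 reference words
print(weighted_wer_from_counts(0, 1, 1, 10,
                               insertion_weight=0.5,
                               deletion_weight=0.5,
                               substitution_weight=1.0))  # 0.15
```

With all weights left at 1.0, the same counts give the unweighted corpus rate of 0.2.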




### 4. Complete Word Error Rate Breakdown

The `analysis()` function provides a complete breakdown of word error rates, supporting both standard WER and weighted WER calculations.

It delivers detailed per-sentence metrics, including insertions, deletions, substitutions, and word-level error tracking, with the flexibility to customize error weights.

Results are easily accessible through standard Python objects or can be conveniently converted into Pandas and Polars DataFrames for further analysis and reporting.

#### 4a. Getting Started

*Python Code:*

```python
ref = ["the quick brown fox"]
hyp = ["the quick brown dog"]

results = werx.analysis(ref, hyp)

print("Inserted:", results[0].inserted_words)
print("Deleted:", results[0].deleted_words)
print("Substituted:", results[0].substituted_words)
```

*Results Output:*

```
Inserted: []
Deleted: []
Substituted: [('fox', 'dog')]
```
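
To illustrate how word-level errors like these can be recovered, the sketch below fills a standard edit-distance table and then walks it backwards to label each step as a match, substitution, deletion, or insertion. This is a generic alignment backtrace written for illustration. The function `align_words` is hypothetical, and WERx's actual Rust implementation and tie-breaking rules may differ.

```python
def align_words(reference: str, hypothesis: str):
    """Return (inserted, deleted, substituted) words from one optimal alignment."""
    r, h = reference.split(), hypothesis.split()

    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + cost,  # match or substitution
                           dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1)         # insertion

    # Walk back from the bottom-right corner, preferring diagonal moves.
    inserted, deleted, substituted = [], [], []
    i, j = len(r), len(h)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and r[i - 1] == h[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1                       # words match
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            substituted.append((r[i - 1], h[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            deleted.append(r[i - 1])
            i -= 1
        else:
            inserted.append(h[j - 1])
            j -= 1
    # The backtrace collects errors last-to-first, so reverse them.
    return inserted[::-1], deleted[::-1], substituted[::-1]


print(align_words("the quick brown fox", "the quick brown dog"))
# ([], [], [('fox', 'dog')])
```

When several alignments share the same minimum distance, the order of the branches decides which one is reported, so a different tie-breaking rule can produce a different but equally valid breakdown.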




#### 4b. Converting Analysis Results to a DataFrame

*Note:* To use this module, you must have either `pandas` or `polars` (or both) installed.

*Install Pandas / Polars for DataFrame Conversion*

```bash
uv pip install pandas
uv pip install polars
```

*Python Code:*

```python
ref = ["i love cold pizza", "the sugar bear character was popular"]
hyp = ["i love pizza", "the sugar bare character was popular"]

results = werx.analysis(
    ref, hyp,
    insertion_weight=2,
    deletion_weight=2,
    substitution_weight=1
)
```

We’ve created a utility module to make working with DataFrames seamless.

Import the following helpers:

```python
import werx
from werx.utils import to_polars, to_pandas
```

You can then convert the analysis results to a **Polars** DataFrame:

```python
# Convert to Polars DataFrame
df_polars = to_polars(results)
print(df_polars)
```

Alternatively, convert them to a **Pandas** DataFrame:

```python
# Convert to Pandas DataFrame
df_pandas = to_pandas(results)
print(df_pandas)
```

*Results Output:*

| wer    | wwer   | ld  | n_ref | insertions | deletions | substitutions | inserted_words | deleted_words | substituted_words  |
|--------|--------|-----|-------|------------|-----------|---------------|----------------|---------------|--------------------|
| 0.25   | 0.50   | 1   | 4     | 0          | 1         | 0             | []             | ['cold']      | []                 |
| 0.1667 | 0.1667 | 1   | 6     | 0          | 0         | 1             | []             | []            | [('bear', 'bare')] |




## 📄 License

This project is licensed under the Apache License 2.0.
