https://github.com/assemblyai-solutions/asr-benchmarking


Host: GitHub
URL: https://github.com/assemblyai-solutions/asr-benchmarking
Owner: AssemblyAI-Solutions
Created: 2024-05-28T18:41:30.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-06-13T23:40:57.000Z (about 2 years ago)
Last Synced: 2025-03-17T17:25:56.531Z (over 1 year ago)
Language: Python
Size: 38.1 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

## Open Source ASR Benchmarking

1) Run `pip install -r requirements.txt`

2) Add audio files to `/audios` and add your human transcript labels to `/truth`

3) Add the API keys for the vendors you want to use to a `.env` file for the respective transcriber scripts. See `.env_sample`

4) Run `run_benchmark_local.py` to execute benchmarks on local files

5) To also run benchmarks against a Hugging Face dataset (the default is LibriSpeech test-clean), run `run_benchmark_local.py`. The first time, you will likely need to download LibriSpeech and change line 88 in `utils.py` so that `base_dir` points to the directory where the dataset was extracted:
```
base_dir = "/Users/samflamini/.cache/huggingface/datasets/downloads/extracted" #note - you should replace this with your path to the hugging face directory in .cache
```
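
Rather than hardcoding a user-specific path, `base_dir` could be derived at runtime. The following is a minimal sketch, not code from this repository: `hf_extracted_dir` is a hypothetical helper, and it assumes the dataset cache lives either under the directory named by the `HF_DATASETS_CACHE` environment variable or under the default `~/.cache/huggingface/datasets`, with extracted archives in `downloads/extracted` as in the path above.

```python
import os
from pathlib import Path


def hf_extracted_dir() -> Path:
    """Guess where Hugging Face datasets extracts downloaded archives.

    Hypothetical helper (not part of this repository). Assumes the layout
    <datasets cache>/downloads/extracted shown in the README path.
    """
    cache_root = os.environ.get("HF_DATASETS_CACHE")
    if cache_root:
        # Use the user-configured datasets cache location if one is set.
        base = Path(cache_root).expanduser()
    else:
        # Otherwise fall back to the default location in the home directory.
        base = Path.home() / ".cache" / "huggingface" / "datasets"
    return base / "downloads" / "extracted"


# Candidate replacement for the hardcoded value in utils.py.
base_dir = str(hf_extracted_dir())
```

If the resulting directory does not exist on your machine, the dataset was probably cached elsewhere, and `base_dir` still needs to be set by hand.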
