https://gitlab.com/ifrz/asr-multi-lite

Testing of the main ASR frameworks with reduced models for low-resource languages speech recognition

distilhubert wav2vec2 whisper

Host: gitlab.com
URL: https://gitlab.com/ifrz/asr-multi-lite
Owner: ifrz
Created: 2022-11-29T18:58:55.578Z (about 2 years ago)
Default Branch: main
Last Synced: 2024-07-30T21:03:16.454Z (7 months ago)
Topics: distilhubert, wav2vec2, whisper
Stars: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# asr-multi-lite

Multi-language speech recognition models generated with three of the main ASR frameworks (wav2vec2, Whisper, DistilHuBERT).

Three different models, each fine-tuned on 20 h of labeled audio.

Dynamic quantization only.

Models: wav2vec2-base, whisper-tiny, and distilled (DistilHuBERT from wav2vec2-large).
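The repository's own quantization script is not reproduced here. As a rough sketch of what dynamic quantization looks like, the example below assumes PyTorch (the `qint8` suffix matches `torch.qint8`) and uses a hypothetical `TinyEncoder` as a stand-in for a real ASR model. Only the `nn.Linear` layers are converted, which is the usual target of dynamic quantization.

```python
import torch
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for an ASR encoder (not one of the repository's models)."""

    def __init__(self, dim: int = 256, vocab: int = 32) -> None:
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.out(torch.relu(self.proj(x)))


def quantize_linear_layers(model: nn.Module) -> nn.Module:
    """Return a copy of `model` whose nn.Linear weights are stored as int8."""
    model.eval()
    # Weights are quantized ahead of time; activations are quantized on the fly.
    return torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)


fp32_model = TinyEncoder()
int8_model = quantize_linear_layers(fp32_model)

with torch.no_grad():
    features = torch.randn(1, 50, 256)  # (batch, frames, feature dim)
    logits = int8_model(features)
```

Because no calibration data is needed, this kind of quantization can be applied to an already fine-tuned checkpoint without retraining.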

| Model | Parameters (M) | Size (MB) | WER (%) |
| --- | --- | --- | --- |
| wav2vec2-base | 94.4 | 361 | 23.2 |
| wav2vec2-base-qint8 | 94.4 | 117 | 25.4 |
| whisper-tiny | 37.8 | 144 | 21.0 |
| whisper-tiny-qint8 | 37.8 | 116 | 22.1 |
| distilled | 38.3 | 147 | 28.9 |
| distilled-qint8 | 38.3 | 73 | 31.0 |
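To make the size/accuracy trade-off in the table easier to read, the short sketch below derives the size reduction and the absolute WER increase for each quantized model. The numbers are copied from the table; the helper name `quantization_tradeoff` is only illustrative.

```python
# Values copied from the table above: (size in MB, WER in %).
RESULTS = {
    "wav2vec2-base": (361, 23.2),
    "wav2vec2-base-qint8": (117, 25.4),
    "whisper-tiny": (144, 21.0),
    "whisper-tiny-qint8": (116, 22.1),
    "distilled": (147, 28.9),
    "distilled-qint8": (73, 31.0),
}


def quantization_tradeoff(base: str) -> tuple[float, float]:
    """Return (size reduction in %, absolute WER increase in points) for one model."""
    size_fp32, wer_fp32 = RESULTS[base]
    size_int8, wer_int8 = RESULTS[f"{base}-qint8"]
    size_reduction = 100.0 * (size_fp32 - size_int8) / size_fp32
    return round(size_reduction, 1), round(wer_int8 - wer_fp32, 1)


for name in ("wav2vec2-base", "whisper-tiny", "distilled"):
    reduction, wer_delta = quantization_tradeoff(name)
    print(f"{name}: {reduction}% smaller, +{wer_delta} WER points")
```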

Inference on 5-second audio clips (short utterances, ideal for voice assistants)

Running on a Raspberry Pi 4 (CPU only)

Latency (audio inference only)

| Model | Avg time (s) |
| --- | --- |
| wav2vec2-base | 2.70 |
| wav2vec2-base-qint8 | 2.46 |
| whisper-tiny | 2.97 |
| whisper-tiny-qint8 | 2.51 |
| distilled | 2.04 |
| distilled-qint8 | 1.88 |
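The README does not say how these averages were collected. One plausible way to measure them, sketched below, is to time repeated calls with `time.perf_counter` after a few warm-up runs and then relate the result to the 5-second clip length as a real-time factor. The `fake_inference` function is a hypothetical placeholder for an actual model call.

```python
import statistics
import time
from typing import Callable


def average_latency(run_inference: Callable[[], object],
                    repeats: int = 10, warmup: int = 2) -> float:
    """Mean wall-clock seconds per call, ignoring a few warm-up runs."""
    for _ in range(warmup):
        run_inference()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_inference()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def real_time_factor(latency_s: float, audio_s: float = 5.0) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    return latency_s / audio_s


def fake_inference() -> None:
    """Hypothetical placeholder workload; swap in a real model call on a 5-second clip."""
    time.sleep(0.01)


latency = average_latency(fake_inference, repeats=5, warmup=1)
print(f"avg latency: {latency:.3f} s, RTF: {real_time_factor(latency):.3f}")
```

Applied to the table, every model has a real-time factor below 1.0 on the 5-second clips; for example, 1.88 s for `distilled-qint8` corresponds to 1.88 / 5 = 0.376.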