Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://gitlab.com/ifrz/asr-multi-lite
Testing of the main ASR frameworks with reduced models for low-resource languages speech recognition
distilhubert wav2vec2 whisper
Last synced: 17 days ago
- Host: gitlab.com
- URL: https://gitlab.com/ifrz/asr-multi-lite
- Owner: ifrz
- Created: 2022-11-29T18:58:55.578Z (almost 2 years ago)
- Default Branch: main
- Last Synced: 2024-07-30T21:03:16.454Z (3 months ago)
- Topics: distilhubert, wav2vec2, whisper
- Stars: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# asr-multi-lite
Multi-language speech recognition models generated with three of the main ASR frameworks (wav2vec2, whisper, distilHubert).

- 3 different models, each fine-tuned on 20 h of labeled audio
- Dynamic quantization only
- Models: wav2vec2-base, whisper-tiny, distilled (distilHubert from wav2vec2-large)

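The `-qint8` variants store weights as 8-bit integers instead of 32-bit floats. The sketch below is a simplified, pure-Python illustration of that idea (symmetric per-tensor quantization), not the code used in this repository; the function names and the sample weights are made up for the example. In PyTorch, dynamic quantization is exposed as `torch.quantization.quantize_dynamic`, typically applied to `torch.nn.Linear` layers with `dtype=torch.qint8`; whether this project uses exactly that call is an assumption based on the `qint8` suffix.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the numbers easy to follow.
weights = [0.82, -1.27, 0.05, 0.40, -0.66]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage: 4 bytes per float32 weight vs 1 byte per int8 weight (+ one float scale).
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights) + 4

print(quantized)
print(f"max error: {max_error:.6f}, bytes: {float32_bytes} -> {int8_bytes}")
```

Dynamic quantization usually converts only some layer types, so a whole model tends to shrink by less than the 4x that a single quantized tensor achieves, and the rounding error is one plausible source of a higher WER.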
Model | Parameters (M) | Size (MB) | WER (%) |
--- | --- | --- | --- |
wav2vec2-base | 94.4 | 361 | 23.2
wav2vec2-base-qint8| 94.4 | 117 | 25.4
whisper-tiny | 37.8 | 144 | 21.0
whisper-tiny-qint8 | 37.8 | 116 | 22.1
distilled | 38.3 | 147 | 28.9
distilled-qint8 | 38.3 | 73 | 31.0

- 5-second inference (short audios, ideal for voice assistants)
- Running on RPi 4 (CPU only)
- Latency (audio inference only)

Model | Avg time (s) |
--- | --- |
wav2vec2-base | 2.70 |
wav2vec2-base-qint8| 2.46 |
whisper-tiny | 2.97 |
whisper-tiny-qint8 | 2.51 |
distilled | 2.04 |
distilled-qint8 | 1.88 |
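
A minimal sketch of how average latencies like these could be measured and related to clip length. It is not the benchmark script from this repository: `average_latency`, `real_time_factor`, and `fake_transcribe` are hypothetical names, and the 5-second clip length is an assumption taken from the "5 second inference" note above. The real-time factor (RTF) is the inference time divided by the audio duration; values below 1.0 mean the model runs faster than real time.

```python
import statistics
import time


def average_latency(transcribe, audio, runs=10, warmup=2):
    """Average wall-clock time of transcribe(audio) over several timed runs."""
    for _ in range(warmup):  # warm-up calls are not timed
        transcribe(audio)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def real_time_factor(latency_s, audio_s):
    """Inference time divided by audio duration."""
    return latency_s / audio_s


def fake_transcribe(audio):
    """Stand-in for a real model call; replace with actual inference."""
    return len(audio)


measured = average_latency(fake_transcribe, audio=[0.0] * 16000, runs=3, warmup=1)

# Average times (s) copied from the table above.
AVG_TIME_S = {
    "wav2vec2-base": 2.70,
    "wav2vec2-base-qint8": 2.46,
    "whisper-tiny": 2.97,
    "whisper-tiny-qint8": 2.51,
    "distilled": 2.04,
    "distilled-qint8": 1.88,
}
CLIP_SECONDS = 5.0  # assumption: each test clip is 5 s long

rtf = {name: real_time_factor(t, CLIP_SECONDS) for name, t in AVG_TIME_S.items()}
for name, value in rtf.items():
    print(f"{name}: RTF {value:.3f}")
```

Under the 5-second assumption, every model in the table has an RTF below 1.0 on the RPi 4.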