https://github.com/ccoreilly/wav2vec2-service
- Host: GitHub
- URL: https://github.com/ccoreilly/wav2vec2-service
- Owner: ccoreilly
- License: mit
- Created: 2022-01-13T21:15:10.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-01-14T16:55:47.000Z (over 4 years ago)
- Last Synced: 2025-03-30T17:51:13.727Z (over 1 year ago)
- Language: Python
- Size: 392 KB
- Stars: 38
- Watchers: 2
- Forks: 8
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# Wav2Vec2 simple service
A simple HTTP service that mimics the Hugging Face Inference API for a speech recognition model.
## Usage
The following example runs a Docker image that bundles a Wav2Vec2 model for the Catalan language and sends it an audio file:
```sh
> docker run -p 8000:8000 -d ghcr.io/ccoreilly/wav2vec2-catala:0.1.0
> curl -X POST localhost:8000/recognize -F "file=@sample.wav"
{"text":"bon vesprà a totes i tots donem començament al ple ordinari convocat per avui trenta de setembre de dos mil vint-i-u a les vuit hores en el saló de plens d'ací de l'ajuntament de massanassa"}
```
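The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the service is listening on `localhost:8000` as in the `docker run` command above and that it accepts the same multipart form field (`file`) that `curl -F` sends. The helper names `build_multipart` and `recognize` are made up for illustration and are not part of this repository.

```python
import json
import urllib.request
import uuid


def build_multipart(field, filename, payload, boundary=None):
    """Encode one file as a multipart/form-data body, like `curl -F` does.

    Returns the raw body bytes and the matching Content-Type header value.
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def recognize(path, url="http://localhost:8000/recognize"):
    """POST an audio file to the service and return the transcribed text.

    Assumes the response is JSON with a "text" key, as in the curl example.
    """
    with open(path, "rb") as f:
        body, content_type = build_multipart("file", "sample.wav", f.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

With the container running, `print(recognize("sample.wav"))` should print the transcription string from the JSON response.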
## Converting Wav2Vec2 to ONNX format
Using the ONNX model format speeds up inference on a CPU. You can convert any `Wav2Vec2ForCTC` model from the Hugging Face model hub with the `convert_torch_to_onnx.py` script:
```sh
> python3 -m venv .venv
> source .venv/bin/activate
> pip install -r requirements.txt
> python convert_torch_to_onnx.py --model ccoreilly/wav2vec2-large-xlsr-catala
```
You can also quantize the model to reduce its size:
```sh
> python convert_torch_to_onnx.py --model ccoreilly/wav2vec2-large-xlsr-catala --quantize
```
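What `--quantize` does inside the script is not shown in this README. As a rough sketch of one common approach, ONNX Runtime's dynamic quantization can convert the weights of an already exported ONNX file to 8-bit integers. The file names and the helpers `quantized_path` and `quantize` below are assumptions for illustration only, and the snippet requires the `onnxruntime` package.

```python
from pathlib import Path


def quantized_path(onnx_path):
    """Derive an output name such as model.onnx -> model.quant.onnx (illustrative naming)."""
    return Path(onnx_path).with_suffix(".quant.onnx")


def quantize(onnx_path):
    """Dynamically quantize an exported ONNX model to 8-bit weights.

    Assumption: this may differ from what convert_torch_to_onnx.py actually does.
    """
    # Imported lazily so the module loads even when onnxruntime is not installed.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    output = quantized_path(onnx_path)
    quantize_dynamic(
        model_input=str(onnx_path),
        model_output=str(output),
        weight_type=QuantType.QUInt8,
    )
    return output
```

Calling `quantize("model.onnx")` on an existing export would write `model.quant.onnx` next to it. Dynamic quantization trades some accuracy for a smaller file, so it is worth comparing transcriptions before and after.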