https://github.com/abb128/april-asr
Speech-to-text library in C
- Host: GitHub
- URL: https://github.com/abb128/april-asr
- Owner: abb128
- License: gpl-3.0
- Created: 2022-11-13T03:43:35.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2025-02-27T21:13:51.000Z (about 2 months ago)
- Last Synced: 2025-04-01T08:37:54.345Z (14 days ago)
- Language: C
- Homepage:
- Size: 2.27 MB
- Stars: 182
- Watchers: 9
- Forks: 15
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: COPYING
Awesome Lists containing this project
- stars - abb128/april-asr - Speech-to-text library in C (C)
README
# april-asr
[aprilasr](https://github.com/abb128/april-asr) is a minimal library that provides an API for offline streaming speech-to-text applications
[Documentation](https://abb128.github.io/april-asr/concepts.html)
## Status
This library is currently undergoing major rewrites over 2025 to improve efficiency and to properly fulfill the API contract for multi-session support. The model format is going to change.

## Language support
The core library is written in C, and has a C API. [Python](https://abb128.github.io/april-asr/python.html) and [C#](https://abb128.github.io/april-asr/csharp.html) bindings are available.

## Example in Python
Install via `pip install april-asr`
```py
import april_asr as april
import librosa

# Change these values
model_path = "aprilv0_en-us.april"
audio_path = "audio.wav"model = april.Model(model_path)
# Define a callback to receive recognition results
def handler(result_type, tokens):
    s = ""
    for token in tokens:
        s = s + token.token

    if result_type == april.Result.FINAL_RECOGNITION:
        print("@"+s)
    elif result_type == april.Result.PARTIAL_RECOGNITION:
        print("-"+s)
    else:
        print("")

# Create a session
session = april.Session(model, handler)
data, sr = librosa.load(audio_path, sr=model.get_sample_rate(), mono=True)
data = (data * 32767).astype("short").astype("<u2").tobytes()

# Feed the audio data to the session
session.feed_pcm16(data)

# Flush to finish off any remaining audio
session.flush()
```

On Windows, a built application may fail to start with an error such as:

> The application was unable to start correctly (0xc000007b)

To fix this, you need to make onnxruntime.dll available. One way to do this is to copy onnxruntime.dll from lib/lib/onnxruntime.dll to build/Debug and build/Release. You may need to distribute the dll together with your application.
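The example above decodes a whole file in one call, but the API is intended for streaming use: audio can be fed to the session in small chunks as it arrives, and the handler is invoked with partial and final results along the way. Below is a minimal sketch of that pattern with the Python bindings; it reuses the model, handler, and `feed_pcm16`/`flush` usage from the example above, and the chunk size is an arbitrary choice, not something prescribed by the library.

```py
import april_asr as april
import librosa

# Same file names as in the example above
model_path = "aprilv0_en-us.april"
audio_path = "audio.wav"

model = april.Model(model_path)

def handler(result_type, tokens):
    text = "".join(token.token for token in tokens)
    if result_type == april.Result.FINAL_RECOGNITION:
        print("@" + text)
    elif result_type == april.Result.PARTIAL_RECOGNITION:
        print("-" + text)

session = april.Session(model, handler)

# Load the file once, then feed it in small PCM16 chunks to mimic a live stream
data, _ = librosa.load(audio_path, sr=model.get_sample_rate(), mono=True)
pcm16 = (data * 32767).astype("short").tobytes()

chunk_bytes = 3200  # 1600 samples, ~0.1 s at a 16 kHz model sample rate
for start in range(0, len(pcm16), chunk_bytes):
    session.feed_pcm16(pcm16[start:start + chunk_bytes])

session.flush()
```

In a real application the chunking loop would be replaced by whatever delivers captured audio (for example a microphone callback or a socket), with `flush()` called once the stream ends.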
## Applications
Currently I'm developing [Live Captions](https://github.com/abb128/LiveCaptions), a Linux desktop app that provides live captioning.

## Acknowledgements
Thanks to the [k2-fsa/icefall](https://github.com/k2-fsa/icefall) contributors for creating the speech recognition recipes and models.

This project makes use of a few libraries:
* pocketfft, authored by Martin Reinecke, Copyright (C) 2008-2018 Max-Planck-Society, licensed under BSD-3-Clause
* Sonic library, authored by Bill Cox, Copyright (C) 2010 Bill Cox, licensed under Apache 2.0 license
* tinycthread, authored by Marcus Geelnard and Evan Nemerson, licensed under zlib/libpng license

The bindings are based on the [Vosk API bindings](https://github.com/alphacep/vosk-api), which is another speech recognition library based on previous-generation Kaldi. Vosk is Copyright 2019 Alpha Cephei Inc. and licensed under the Apache 2.0 license.