An open API service indexing awesome lists of open source software.

https://github.com/abb128/april-asr

Speech-to-text library in C
https://github.com/abb128/april-asr

Last synced: 7 days ago
JSON representation

Speech-to-text library in C

Awesome Lists containing this project

README

        

# april-asr

[aprilasr](https://github.com/abb128/april-asr) is a minimal library that provides an API for offline streaming speech-to-text applications

[Documentation](https://abb128.github.io/april-asr/concepts.html)

## Status
This library is currently facing some major rewrites over 2025 to improve efficiency and properly fulfill the API contract of multi-session support. The model format is going to change.

## Language support
The core library is written in C, and has a C API. [Python](https://abb128.github.io/april-asr/python.html) and [C#](https://abb128.github.io/april-asr/csharp.html) bindings are available.

## Example in Python

Install via `pip install april-asr`

```py
import april_asr as april
import librosa

# Change these values
model_path = "aprilv0_en-us.april"
audio_path = "audio.wav"

model = april.Model(model_path)

def handler(result_type, tokens):
s = ""
for token in tokens:
s = s + token.token

if result_type == april.Result.FINAL_RECOGNITION:
print("@"+s)
elif result_type == april.Result.PARTIAL_RECOGNITION:
print("-"+s)
else:
print("")

session = april.Session(model, handler)

data, sr = librosa.load(audio_path, sr=model.get_sample_rate(), mono=True)
data = (data * 32767).astype("short").astype(" The application was unable to start correctly (0xc000007b)

To fix this, you need to make onnxruntime.dll available. One way to do this is to copy onnxruntime.dll from lib/lib/onnxruntime.dll to build/Debug and build/Release. You may need to distribute the dll together with your application.

## Applications
Currently I'm developing [Live Captions](https://github.com/abb128/LiveCaptions), a Linux desktop app that provides live captioning.

## Acknowledgements
Thanks to the [k2-fsa/icefall](https://github.com/k2-fsa/icefall) contributors for creating the speech recognition recipes and models.

This project makes use of a few libraries:
* pocketfft, authored by Martin Reinecke, Copyright (C) 2008-2018 Max-Planck-Society, licensed under BSD-3-Clause
* Sonic library, authored by Bill Cox, Copyright (C) 2010 Bill Cox, licensed under Apache 2.0 license
* tinycthread, authored by Marcus Geelnard and Evan Nemerson, licensed under zlib/libpng license

The bindings are based on the [Vosk API bindings](https://github.com/alphacep/vosk-api), which is another speech recognition library based on previous-generation Kaldi. Vosk is Copyright 2019 Alpha Cephei Inc. and licensed under the Apache 2.0 license.