https://github.com/software-mansion-labs/elixir-ibm-speech-to-text
Elixir client library for IBM Cloud Speech to Text service
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/software-mansion-labs/elixir-ibm-speech-to-text
- Owner: software-mansion-labs
- License: apache-2.0
- Created: 2019-05-15T09:35:21.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-08-27T08:08:14.000Z (almost 5 years ago)
- Last Synced: 2025-04-11T02:13:50.908Z (about 2 months ago)
- Language: Elixir
- Size: 694 KB
- Stars: 8
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# IBM Cloud Speech to Text
[Hex.pm](https://hex.pm/packages/ibm_speech_to_text)
[CircleCI](https://circleci.com/gh/software-mansion-labs/elixir-ibm-speech-to-text)

Elixir client for the [IBM Cloud Speech to Text service](https://cloud.ibm.com/docs/services/speech-to-text).
## Installation
The package can be installed by adding `:ibm_speech_to_text` to your list of dependencies in `mix.exs`:
```elixir
def deps do
  [
    {:ibm_speech_to_text, "~> 0.3"}
  ]
end
```

The docs can be found on [hexdocs.pm](https://hexdocs.pm/ibm_speech_to_text).
## Usage
1. Start the client process. Pass the API URL or a region (as an atom) and an API key obtained from the IBM Cloud console. `start_link` also accepts parameters for the endpoint; see the docs for more details.

```elixir
{:ok, pid} = IBMSpeechToText.Client.start_link(:frankfurt, "API_KEY", model: "en-GB_BroadbandModel")
```

2. Send the "start" message with the configuration for speech recognition:

```elixir
start_message = %IBMSpeechToText.Start{content_type: :flac}
IBMSpeechToText.Client.send_message(pid, start_message)
```

3. Start audio streaming:

```elixir
IBMSpeechToText.Client.send_data(pid, audio_data)
```

4. Stop streaming by sending the "stop" message:

```elixir
stop_message = %IBMSpeechToText.Stop{}
IBMSpeechToText.Client.send_message(pid, stop_message)
```

5. Results arrive as a message with an `IBMSpeechToText.Response` struct:

```elixir
%IBMSpeechToText.Response{
  result_index: 0,
  results: [
    %IBMSpeechToText.RecognitionResult{
      alternatives: [
        %IBMSpeechToText.RecognitionAlternative{
          confidence: 0.87,
          timestamps: nil,
          transcript: "to Sherlock Holmes she's always the woman ",
          word_confidence: nil
        }
      ],
      final: true,
      keywords_result: nil,
      word_alternatives: nil
    }, ...
  ],
  speaker_labels: nil,
  warnings: nil
}
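
# Sketch only, not part of this library: one possible way to pull the final
# transcripts out of a response shaped like the one above.
# `TranscriptSketch` and `final_transcripts/1` are hypothetical names.
# The function matches on map keys only, so it should accept the structs
# above as well as plain maps with the same fields.
defmodule TranscriptSketch do
  # Keeps results marked `final: true` and takes the first listed alternative.
  def final_transcripts(%{results: results}) when is_list(results) do
    for %{final: true, alternatives: [%{transcript: text} | _]} <- results do
      String.trim(text)
    end
  end

  # Responses without a result list yield no transcripts.
  def final_transcripts(_response), do: []
end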
```

## Testing

The test tagged `:external` is excluded by default since it contacts the real API and requires an API key provided via config. To provide the key, add a `config/test.secret.exs` file with the following content:

```elixir
use Mix.Config

config :ibm_speech_to_text, api_key: "YOUR_API_KEY"
```

## Fixture

The recording fragment in `test/fixtures` comes from the audiobook "The adventures of Sherlock Holmes (version 2)", available on [LibriVox](https://librivox.org/the-adventures-of-sherlock-holmes-by-sir-arthur-conan-doyle/).

## Status

A few things are not implemented in the current version:

- parsing "word_alternatives" and "keywords_result" in RecognitionResult
- a better way to pass endpoint options to the client

## Copyright and License

Copyright 2019, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=elixir-ibm-speech-to-text)
Licensed under the [Apache License, Version 2.0](LICENSE)