An open API service indexing awesome lists of open source software.

https://github.com/software-mansion-labs/elixir-ibm-speech-to-text

Elixir client library for IBM Cloud Speech to Text service
https://github.com/software-mansion-labs/elixir-ibm-speech-to-text

Last synced: about 2 months ago
JSON representation

Elixir client library for IBM Cloud Speech to Text service

Awesome Lists containing this project

README

        

# IBM Cloud Speech to Text

[![Hex.pm](https://img.shields.io/hexpm/v/ibm_speech_to_text.svg)](https://hex.pm/packages/ibm_speech_to_text)
[![CircleCI](https://circleci.com/gh/software-mansion-labs/elixir-ibm-speech-to-text.svg?style=svg)](https://circleci.com/gh/software-mansion-labs/elixir-ibm-speech-to-text)

Elixir client for [IBM Cloud Speech to Text service](https://cloud.ibm.com/docs/services/speech-to-text)

## Installation

The package can be installed by adding `:ibm_speech_to_text` to your list of dependencies in `mix.exs`:

```elixir
def deps do
[
{:ibm_speech_to_text, "~> 0.3"}
]
end
```

The docs can be found on [hexdocs.pm](https://hexdocs.pm/ibm_speech_to_text)

## Usage

1. Start the client process. For that you need to pass API URL or region as an atom,
API key obtained from IBM Cloud console. `start_link` also accepts parameters for the endpoint,
see the docs for more details.

```elixir
{:ok, pid} = IBMSpeechToText.Client.start_link(:frankfurt, "API_KEY", model: "en-GB_BroadbandModel")
```

2. Send "start" event with configuration for speech recognition

```elixir
start_message = %IBMSpeechToText.Start{content_type: :flac}
IBMSpeechToText.Client.send_message(pid, start_message)
```

3. Start audio streaming

```elixir
IBMSpeechToText.Client.send_data(pid, audio_data)
```

4. Stop streaming by sending "stop" message

```elixir
stop_message = %IBMSpeechToText.Stop{}
IBMSpeechToText.Client.send_message(pid, stop_message)
```

5. You will receive results via message with struct `IBMSpeechToText.Response`

```elixir
%IBMSpeechToText.Response{
result_index: 0,
results: [
%IBMSpeechToText.RecognitionResult{
alternatives: [
%IBMSpeechToText.RecognitionAlternative{
confidence: 0.87,
timestamps: nil,
transcript: "to Sherlock Holmes she's always the woman ",
word_confidence: nil
}
],
final: true,
keywords_result: nil,
word_alternatives: nil
}, ...
],
speaker_labels: nil,
warnings: nil
}
```

## Testing

Test tagged `:external` is excluded by default since it contacts the real API and requires
an API key provided via config.
This can be done by adding `config/test.secret.exs` file with the following content:

```elixir
use Mix.Config

config :ibm_speech_to_text, api_key: "YOUR_API_KEY"
```

## Fixture

A recording fragment in `test/fixtures` comes from an audiobook
"The adventures of Sherlock Holmes (version 2)" available on [LibriVox](https://librivox.org/the-adventures-of-sherlock-holmes-by-sir-arthur-conan-doyle/)

## Status

There are a few things that are not implemented in current version:

- parsing "word_alternatives" and "keywords_result" in RecognitionResult
- better way to pass endpoint options to client

## Copyright and License

Copyright 2019, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=elixir-ibm-speech-to-text)

[![Software Mansion](https://membraneframework.github.io/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=elixir-ibm-speech-to-text)

Licensed under the [Apache License, Version 2.0](LICENSE)