https://github.com/software-mansion-labs/elixir-ibm-speech-to-text

Elixir client library for IBM Cloud Speech to Text service

Host: GitHub
URL: https://github.com/software-mansion-labs/elixir-ibm-speech-to-text
Owner: software-mansion-labs
License: apache-2.0
Created: 2019-05-15T09:35:21.000Z (about 6 years ago)
Default Branch: master
Last Pushed: 2020-08-27T08:08:14.000Z (almost 5 years ago)
Last Synced: 2025-04-11T02:13:50.908Z (about 2 months ago)
Language: Elixir
Size: 694 KB
Stars: 8
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# IBM Cloud Speech to Text

[![Hex.pm](https://img.shields.io/hexpm/v/ibm_speech_to_text.svg)](https://hex.pm/packages/ibm_speech_to_text)

[![CircleCI](https://circleci.com/gh/software-mansion-labs/elixir-ibm-speech-to-text.svg?style=svg)](https://circleci.com/gh/software-mansion-labs/elixir-ibm-speech-to-text)

Elixir client for [IBM Cloud Speech to Text service](https://cloud.ibm.com/docs/services/speech-to-text)

## Installation

The package can be installed by adding `:ibm_speech_to_text` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:ibm_speech_to_text, "~> 0.3"}
  ]
end
```

The docs can be found on [hexdocs.pm](https://hexdocs.pm/ibm_speech_to_text).

## Usage

1. Start the client process. Pass the API URL or a region (as an atom) and the
   API key obtained from the IBM Cloud console. `start_link` also accepts parameters for the endpoint;
   see the docs for more details.

    ```elixir
    {:ok, pid} = IBMSpeechToText.Client.start_link(:frankfurt, "API_KEY", model: "en-GB_BroadbandModel")
    ```

2. Send the "start" message with the configuration for speech recognition:

    ```elixir
    start_message = %IBMSpeechToText.Start{content_type: :flac}
    IBMSpeechToText.Client.send_message(pid, start_message)
    ```

3. Start audio streaming:

    ```elixir
    IBMSpeechToText.Client.send_data(pid, audio_data)
    ```

4. Stop streaming by sending the "stop" message:

    ```elixir
    stop_message = %IBMSpeechToText.Stop{}
    IBMSpeechToText.Client.send_message(pid, stop_message)
    ```

5. Results arrive as a message containing an `IBMSpeechToText.Response` struct:

    ```elixir
    %IBMSpeechToText.Response{
      result_index: 0,
      results: [
        %IBMSpeechToText.RecognitionResult{
          alternatives: [
            %IBMSpeechToText.RecognitionAlternative{
              confidence: 0.87,
              timestamps: nil,
              transcript: "to Sherlock Holmes she's always the woman ",
              word_confidence: nil
            }
          ],
          final: true,
          keywords_result: nil,
          word_alternatives: nil
        }, ...
      ],
      speaker_labels: nil,
      warnings: nil
    }
    ```
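
Steps 3 and 5 above can be sketched with two small pure functions. This is only an illustration: the module `StreamingHelpers` and its functions are hypothetical and not part of the `ibm_speech_to_text` package. `chunk/2` splits an audio binary into pieces that could be streamed one at a time, and `final_transcripts/1` reads the fields shown in the response above, assuming the first alternative of each final result is the one of interest.

```elixir
defmodule StreamingHelpers do
  @moduledoc """
  Hypothetical helpers for illustration; not part of `ibm_speech_to_text`.
  """

  @doc "Splits `binary` into chunks of at most `size` bytes."
  def chunk(binary, size) when is_binary(binary) and is_integer(size) and size > 0 do
    do_chunk(binary, size, [])
  end

  # Nothing left to split.
  defp do_chunk(<<>>, _size, acc), do: Enum.reverse(acc)

  # The remainder fits into one chunk, so it becomes the last element.
  defp do_chunk(binary, size, acc) when byte_size(binary) <= size do
    Enum.reverse([binary | acc])
  end

  # Cut `size` bytes off the front and continue with the rest.
  defp do_chunk(binary, size, acc) do
    <<head::binary-size(size), rest::binary>> = binary
    do_chunk(rest, size, [head | acc])
  end

  @doc """
  Returns the trimmed transcript of the first alternative of every result
  marked `final: true`. Matches on plain map keys, so it accepts any map or
  struct with the fields shown in the response above.
  """
  def final_transcripts(%{results: results}) when is_list(results) do
    for %{final: true, alternatives: [%{transcript: text} | _]} when is_binary(text) <- results do
      String.trim(text)
    end
  end

  # Responses without a list of results yield no transcripts.
  def final_transcripts(_response), do: []
end
```

With a running client, each chunk could be passed to `IBMSpeechToText.Client.send_data/2`, for example `Enum.each(StreamingHelpers.chunk(audio_data, 4096), &IBMSpeechToText.Client.send_data(pid, &1))`. The chunk size of 4096 bytes is an arbitrary choice for this sketch.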

## Testing

The test tagged `:external` is excluded by default because it contacts the real API and requires
an API key provided via config.
To provide the key, add a `config/test.secret.exs` file with the following content:

```elixir
use Mix.Config

config :ibm_speech_to_text, api_key: "YOUR_API_KEY"
```
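
The README does not show how `config/test.secret.exs` gets loaded. One common pattern, given here as an assumption rather than the repository's actual setup, is to import the file from `config/test.exs` only when it exists, so the rest of the test suite still runs without a key:

```elixir
# config/test.exs (hypothetical layout; the repository's real config may differ)
# `use Mix.Config` matches the snippet above; newer Elixir versions use `import Config`.
use Mix.Config

secret = Path.expand("test.secret.exs", __DIR__)

# Import the secret file only when it is present, so a missing key
# does not break tests that never contact the real API.
if File.exists?(secret) do
  import_config "test.secret.exs"
end
```

Assuming the tag is excluded through the usual ExUnit configuration, the external test can then be run with `mix test --include external`.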

## Fixture

The recording fragment in `test/fixtures` comes from the audiobook
"The adventures of Sherlock Holmes (version 2)", available on [LibriVox](https://librivox.org/the-adventures-of-sherlock-holmes-by-sir-arthur-conan-doyle/).

## Status

A few things are not implemented in the current version:

- parsing "word_alternatives" and "keywords_result" in RecognitionResult
- a better way to pass endpoint options to the client

## Copyright and License

Copyright 2019, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=elixir-ibm-speech-to-text)

[![Software Mansion](https://membraneframework.github.io/static/logo/swm_logo_readme.png)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=elixir-ibm-speech-to-text)

Licensed under the [Apache License, Version 2.0](LICENSE)
