https://github.com/membraneframework/membrane_element_gcloud_speech_to_text

Membrane plugin providing speech recognition via Google Cloud Speech-to-Text API

Host: GitHub
URL: https://github.com/membraneframework/membrane_element_gcloud_speech_to_text
Owner: membraneframework
License: apache-2.0
Created: 2019-07-23T12:53:11.000Z (almost 6 years ago)
Default Branch: master
Last Pushed: 2023-11-17T14:55:54.000Z (over 1 year ago)
Last Synced: 2025-04-10T00:39:02.613Z (3 months ago)
Language: Elixir
Homepage:
Size: 269 KB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Membrane Multimedia Framework: GCloud Speech To Text

[![Hex.pm](https://img.shields.io/hexpm/v/membrane_element_gcloud_speech_to_text.svg)](https://hex.pm/packages/membrane_element_gcloud_speech_to_text)

[![CircleCI](https://circleci.com/gh/membraneframework/membrane_element_gcloud_speech_to_text.svg?style=svg)](https://circleci.com/gh/membraneframework/membrane_element_gcloud_speech_to_text)

This package provides a Sink wrapping [Google Cloud Speech To Text Streaming API client](https://hex.pm/packages/gcloud_speech_grpc).

It currently supports only audio streams in FLAC format.

The docs can be found at [HexDocs](https://hexdocs.pm/membrane_element_gcloud_speech_to_text).

## Installation

The package can be installed by adding `membrane_element_gcloud_speech_to_text` to your list of dependencies in `mix.exs`:

```elixir
def deps do
  [
    {:membrane_element_gcloud_speech_to_text, "~> 0.10.0"}
  ]
end
```

## Configuration

To use the element you need a `config/config.exs` file with Google credentials:

```elixir
# `import Config` replaces the deprecated `use Mix.Config`
import Config

config :goth, json: "a_path/to/google/credentials/creds.json" |> File.read!()
```

More info on how to configure credentials can be found in the [README of the Goth library](https://github.com/peburrows/goth#installation), which is used for authentication.
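
If the path to the credentials file should not be hard-coded, it can be read from the environment instead. The following is only a sketch: the `GCLOUD_CREDENTIALS_PATH` variable name is hypothetical, and placing the setting in `config/runtime.exs` (available since Elixir 1.11) is an assumption, not something this package requires.

```elixir
# config/runtime.exs (evaluated when the application boots, not at compile time)
import Config

# GCLOUD_CREDENTIALS_PATH is a hypothetical variable name chosen for this sketch.
# System.fetch_env!/1 raises if the variable is not set.
credentials_path = System.fetch_env!("GCLOUD_CREDENTIALS_PATH")

config :goth, json: File.read!(credentials_path)
```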

## Usage

The input stream for this element should be parsed, so most of the time it should be placed in the pipeline right after [FLACParser](https://github.com/membraneframework/membrane-element-flac-parser).

Here's an example of a pipeline streaming an audio file to the speech recognition API:

```elixir
defmodule SpeechRecognition do
  use Membrane.Pipeline

  alias Google.Cloud.Speech.V1.StreamingRecognizeResponse

  @impl true
  def handle_init(_ctx, _options) do
    # Read the FLAC file, parse it and pass the parsed stream to the sink
    spec =
      child(%Membrane.File.Source{location: "sample.flac"})
      |> child(Membrane.FLAC.Parser)
      |> child(%Membrane.Element.GCloud.SpeechToText{
        interim_results: false,
        language_code: "en-GB",
        word_time_offsets: true
      })

    {[spec: spec], %{}}
  end

  # Recognition results arrive as notifications from the sink
  @impl true
  def handle_child_notification(%StreamingRecognizeResponse{} = response, _element, _ctx, state) do
    IO.inspect(response)
    {[], state}
  end

  # Ignore any other notifications
  @impl true
  def handle_child_notification(_notification, _element, _ctx, state) do
    {[], state}
  end
end
```
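
The pipeline above prints each response struct in full. As a rough sketch of how the recognized text could be pulled out instead, the helper below assumes the response follows the `results` / `alternatives` / `transcript` / `is_final` field layout of the Google Cloud Speech V1 streaming response. `TranscriptExtractor` is a hypothetical module name that is not part of this package, and the helper matches on plain map keys so it can be tried without the gRPC structs.

```elixir
defmodule TranscriptExtractor do
  @moduledoc """
  Hypothetical helper, not part of this package.
  Assumes responses expose `results`, each with `is_final` and `alternatives`.
  """

  # Returns the transcript of the first alternative of every final result.
  # Results that are not final or have no alternatives are skipped, because
  # a non-matching pattern in a `for` generator filters the element out.
  def final_transcripts(%{results: results}) do
    for %{is_final: true, alternatives: [%{transcript: text} | _]} <- results do
      text
    end
  end
end
```

Inside the first `handle_child_notification/4` clause, `IO.inspect(response)` could then be replaced with something like `response |> TranscriptExtractor.final_transcripts() |> Enum.each(&IO.puts/1)`.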

The pipeline also requires [a config file](#configuration) and the following dependencies:

```elixir
[
  {:membrane_core, "~> 1.0"},
  {:membrane_file_plugin, "~> 0.16.0"},
  {:membrane_flac_plugin, "~> 0.11.0"},
  {:membrane_element_gcloud_speech_to_text, "~> 0.10.0"}
]
```

## Testing

Tests tagged `:external` are excluded by default since they contact the real API and require configuration of credentials. See [Configuration](#configuration).
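
For readers unfamiliar with tag-based exclusion in ExUnit, this kind of setup is commonly done in `test/test_helper.exs`. The fragment below is an illustrative sketch of that mechanism and is assumed, not copied from this repository.

```elixir
# test/test_helper.exs (illustrative sketch, assumed rather than taken from the repo)
# Tests tagged with `@tag :external` or `@moduletag :external` are skipped by default.
ExUnit.start(exclude: [:external])

# With credentials configured, the excluded tests can be enabled from the shell:
#   mix test --include external
```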

## Fixture

A recording fragment in `test/fixtures` comes from the audiobook "The adventures of Sherlock Holmes (version 2)" available on [LibriVox](https://librivox.org/the-adventures-of-sherlock-holmes-by-sir-arthur-conan-doyle/).

## Copyright and License

Copyright 2019, [Software Mansion](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane-element-gcloud-speech-to-text)

[![Software Mansion](https://logo.swmansion.com/logo?color=white&variant=desktop&width=200&tag=membrane-github)](https://swmansion.com/?utm_source=git&utm_medium=readme&utm_campaign=membrane-element-gcloud-speech-to-text)

Licensed under the [Apache License, Version 2.0](LICENSE)
