# azure_stt

[![Gem Version](https://badge.fury.io/rb/azure_stt.svg)](https://badge.fury.io/rb/azure_stt)
[![CI](https://github.com/PerfectMemory/azure_stt/actions/workflows/ci.yml/badge.svg)](https://github.com/PerfectMemory/azure_stt/actions/workflows/ci.yml)
[![Coverage Status](https://coveralls.io/repos/github/PerfectMemory/azure_stt/badge.svg)](https://coveralls.io/github/PerfectMemory/azure_stt)
[![Maintainability](https://api.codeclimate.com/v1/badges/375190d3122da56a9fe1/maintainability)](https://codeclimate.com/github/PerfectMemory/azure_stt/maintainability)

API Wrapper for the [Microsoft Azure Speech Services Speech-to-text REST API 3.1](https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text) (Cognitive Services).

## Installation

Add this line to your application's Gemfile:

```ruby
gem 'azure_stt'
```

And then execute:

```bash
bundle
```

Or install it yourself as:

```bash
gem install azure_stt
```

## Azure Speech-to-text Subscription key

To be able to use the gem, you must have a subscription key.
You can generate one on your Azure account.

* If you don't have an Azure account, you can create one for free on [this page](https://azure.microsoft.com/en-us/free/).
* Once logged in to your [Azure portal](https://portal.azure.com/), subscribe to Speech in Microsoft Cognitive Services.
* You will find two subscription keys available in 'RESOURCE MANAGEMENT > Keys' ('KEY 1' and 'KEY 2').

## Usage

### Configuration

Two environment variables are used:

- `REGION`: the region of your subscription
- `SUBSCRIPTION_KEY`: the API key you can generate on your Azure account

You can look at the file `env.sample` and change the values.
If you do not want to use environment variables, you can configure the values like so:

```ruby
AzureSTT.configure do |config|
  config.region = 'your_region'
  config.subscription_key = 'your_key'
end
```

Finally, the class `AzureSTT::Session` uses the values from the configuration by default, but you can initialize the session with custom values:

```ruby
session = AzureSTT::Session.new(region: 'your_region', subscription_key: 'your_key')
```
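
The lookup order implied above (an explicit argument first, then a configured value, then the environment) can be illustrated with a small standalone sketch. It does not use the gem: `resolve_setting` is a hypothetical helper, and the exact precedence inside `AzureSTT` is an assumption.

```ruby
# Hypothetical helper, not part of azure_stt: return the first non-empty value
# among an explicit argument, a configured value, and an environment variable.
def resolve_setting(name, explicit: nil, configured: nil, env: ENV)
  value = [explicit, configured, env[name]].find { |v| v && !v.to_s.empty? }
  raise ArgumentError, "#{name} is not set" if value.nil?

  value
end

# Simulated environment so the example does not depend on your shell.
fake_env = { 'REGION' => 'westeurope', 'SUBSCRIPTION_KEY' => 'key-from-env' }

# No explicit value: falls back to the environment.
region = resolve_setting('REGION', env: fake_env)

# An explicit value wins over the environment.
key = resolve_setting('SUBSCRIPTION_KEY', explicit: 'key-from-argument', env: fake_env)

puts region
puts key
```

Failing early with an `ArgumentError` when nothing is set makes a missing key visible before any request is sent.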

### Start a transcription

```ruby
require 'azure_stt'

properties = {
  "diarizationEnabled" => false,
  "wordLevelTimestampsEnabled" => false,
  "punctuationMode" => "DictatedAndAutomatic",
  "profanityFilterMode" => "Masked"
}

content_urls = ['https://path.com/audio.ogg', 'https://path.com/audio1.ogg']

session = AzureSTT::Session.new

transcription = session.create_transcription(
  content_urls: content_urls,
  properties: properties,
  locale: 'en-US',
  display_name: 'The name of the transcription'
)

# You can then retrieve the results of your transcription with the id
puts transcription.id
# Outputs 'your_transcription_id'
```
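
For orientation, the call above roughly corresponds to a POST request against the v3.1 transcriptions endpoint. The sketch below only builds the URL, headers and JSON body, and sends nothing. The camelCase field names follow the public REST API; the `region` value and the way the gem assembles the request internally are assumptions.

```ruby
require 'json'
require 'uri'

# Assumed value, for illustration only.
region = 'uscentral'

# Endpoint used to create a batch transcription in v3.1 of the REST API.
endpoint = URI("https://#{region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions")

# The REST API expects camelCase keys, unlike the snake_case Ruby arguments.
body = {
  'contentUrls' => ['https://path.com/audio.ogg', 'https://path.com/audio1.ogg'],
  'properties' => {
    'diarizationEnabled' => false,
    'wordLevelTimestampsEnabled' => false,
    'punctuationMode' => 'DictatedAndAutomatic',
    'profanityFilterMode' => 'Masked'
  },
  'locale' => 'en-US',
  'displayName' => 'The name of the transcription'
}

# The subscription key travels in this header.
headers = {
  'Ocp-Apim-Subscription-Key' => 'your_key',
  'Content-Type' => 'application/json'
}

puts endpoint
puts JSON.pretty_generate(body)
```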

### Get a transcription

```ruby
require 'azure_stt'

session = AzureSTT::Session.new

transcription = session.get_transcription('your_transcription_id')

# Returns an object whose inspect output looks like this (abbreviated):
# #<... properties={...=>false,
#   "wordLevelTimestampsEnabled"=>false, "channels"=>[0, 1],
#   "punctuationMode"=>"DictatedAndAutomatic", "profanityFilterMode"=>"Masked",
#   "duration"=>"PT5M18S"}
#   links={"files"=>"https://uscentral.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions/d35a802d-70ae-4358-a35d-b5faa0c75457/files"}
#   last_action_date_time=#<...> created_date_time=#<...>
#   status="Succeeded" locale="en-US" display_name="Transcription name" files=[]>

if transcription.succeeded?
  # You can then access the text, for instance:
  result = transcription.results.first
  puts result.text
end
```

### Delete a transcription

```ruby
require 'azure_stt'

session = AzureSTT::Session.new

transcription = session.delete_transcription('your_transcription_id')
```

The API doesn't seem to send a 404 error when the id is unknown; it always sends a 204 response.
As a result, `Session#delete_transcription` returns `true` even when the transcription doesn't exist.

### Starting a transcription, fetching the results and deleting the transcription

```ruby
require 'azure_stt'

session = AzureSTT::Session.new

properties = {
  "diarizationEnabled" => false,
  "wordLevelTimestampsEnabled" => false,
  "punctuationMode" => "DictatedAndAutomatic",
  "profanityFilterMode" => "Masked"
}

content_urls = ['https://path.com/audio.ogg']

transcription = session.create_transcription(
  content_urls: content_urls,
  properties: properties,
  locale: 'en-US',
  display_name: 'The name of the transcription'
)

id = transcription.id

# Poll every 30 seconds until the transcription is finished
until transcription.finished?
  sleep(30)
  transcription = session.get_transcription(id)
end

puts transcription.results.first.text if transcription.succeeded?

# Remove the transcription once the results have been read
session.delete_transcription(id)
```
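
The loop above polls indefinitely. If you want an upper bound on the wait, one option is a small helper with a deadline. The sketch below is not part of the gem: `wait_for_transcription` and `TranscriptionTimeout` are hypothetical names, and `FakeSession` and `FakeTranscription` are stand-ins so that it runs without network access. It assumes only that the session responds to `get_transcription` and the transcription to `finished?`, as in the example above. Treating `Succeeded` and `Failed` as the finished states is also an assumption.

```ruby
# Hypothetical error class raised when the deadline passes.
class TranscriptionTimeout < StandardError; end

# Hypothetical helper: poll until the transcription is finished, or raise
# once another sleep would exceed the timeout.
def wait_for_transcription(session, id, interval: 30, timeout: 3600)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    transcription = session.get_transcription(id)
    return transcription if transcription.finished?

    if Process.clock_gettime(Process::CLOCK_MONOTONIC) + interval > deadline
      raise TranscriptionTimeout, "Transcription #{id} did not finish within #{timeout} seconds"
    end

    sleep(interval)
  end
end

# Stand-in for a transcription; the finished states are an assumption.
FakeTranscription = Struct.new(:status) do
  def finished?
    %w[Succeeded Failed].include?(status)
  end
end

# Stand-in for a session: returns the given statuses one call at a time,
# then keeps returning the last one.
class FakeSession
  def initialize(statuses)
    @statuses = statuses.dup
  end

  def get_transcription(_id)
    FakeTranscription.new(@statuses.size > 1 ? @statuses.shift : @statuses.first)
  end
end

fake_session = FakeSession.new(%w[NotStarted Running Succeeded])
result = wait_for_transcription(fake_session, 'your_transcription_id', interval: 0.01, timeout: 5)
puts result.status
```

With a real session you would keep the default 30-second interval and pick a timeout that suits the length of your audio.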

## Development

After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/PerfectMemory/azure_stt. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.

## Code of Conduct

Everyone interacting in the AzureStt project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/PerfectMemory/azure_stt/blob/master/CODE_OF_CONDUCT.md).