https://github.com/crispengari/speech-to-text-python-ibm_watson

This is a simple artificial intelligence application that converts audio to text using `ibm_watson`.

Host: GitHub
URL: https://github.com/crispengari/speech-to-text-python-ibm_watson
Owner: CrispenGari
Created: 2021-01-11T07:44:49.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2021-01-11T08:29:49.000Z (about 5 years ago)
Last Synced: 2025-04-03T18:52:03.453Z (12 months ago)
Topics: ai, ibm-cloud, ibm-watson, jupiter-notebook, machine-learning, python, python2, python3
Language: PowerShell
Homepage:
Size: 11.6 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Speech To Text (stt)

This is a simple artificial intelligence application that converts audio to text using `ibm_watson`.

#### Application capabilities

This app can:

* read an audio file from the `audios` folder

* convert the audio to text using `SpeechToTextV1` from `ibm_watson`

* write the transcript to an external file, `speech.txt`, in the `files` folder

### Getting started

##### Installation

###### First, install `ibm_watson`

````shell
$ pip install ibm_watson
````

###### Second, install `PyJWT==1.7.1`

```shell
$ pip install PyJWT==1.7.1
```

Then you are ready to go
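
As a quick sanity check after installing, the installed versions can be read with the standard library. This is a minimal sketch, assuming Python 3.8+ and that the distributions are registered under the names `ibm-watson` and `PyJWT`; the helper name `installed_version` is hypothetical and not part of this project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names below are assumptions; adjust them if pip reports different names.
for name in ("ibm-watson", "PyJWT"):
    print(name, installed_version(name) or "not installed")
```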

##### Getting an API key and service URL

To get the API key and service URL, go to [IBM Watson](https://cloud.ibm.com/catalog/services/):

* Create an account, or log in if you already have one

* Go to Services

* Go to AI

* Find Speech To Text and click it

* Create a new project

* Look for the API key and service URL in the docs

###### Importing packages

````python
from ibm_watson import SpeechToTextV1, ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import json
````

###### Key variables

```python
# Replace the placeholders with your own service credentials
api_key = "API_KEY"
url = "URL"
```
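
Hard-coded credentials are easy to commit by accident, so one option is to read them from environment variables instead. The following is a minimal sketch under stated assumptions: the variable names `STT_API_KEY` and `STT_URL` and the helper `load_credentials` are hypothetical and do not appear in the original project.

```python
import os


def load_credentials(env=os.environ):
    """Read the API key and service URL from a mapping (the process environment by default)."""
    # STT_API_KEY and STT_URL are made-up names for illustration.
    api_key = env.get("STT_API_KEY")
    url = env.get("STT_URL")
    if not api_key or not url:
        raise RuntimeError("Set STT_API_KEY and STT_URL before running the app")
    return api_key, url


# Demo with an explicit mapping so the example runs without touching the real environment.
api_key, url = load_credentials(
    {"STT_API_KEY": "demo-key", "STT_URL": "https://example.invalid"}
)
print(url)  # https://example.invalid
```

In the app itself, calling `load_credentials()` with no argument would pick the values up from the shell environment.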

###### Setting the authentication

```python
try:
    # Authenticate with the API key, then point the client at the service URL
    auth = IAMAuthenticator(api_key)
    stt = SpeechToTextV1(authenticator=auth)
    stt.set_service_url(url)
except ApiException as e:
    print(e)
```

###### Converting audio to text

````python
with open("audios/long.mp3", "rb") as audio:
    res = stt.recognize(audio=audio, content_type="audio/mp3", model="en-AU_NarrowbandModel", continuous=True).get_result()
````
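
The `content_type` argument has to match the audio file being sent. If the `audios` folder holds files in more than one format, the value could be derived from the file extension. This is only a sketch: the helper `content_type_for` and the mapping are hypothetical, and the list covers a small subset of formats, so check the Speech To Text docs for what the service actually accepts.

```python
from pathlib import Path

# Extension -> MIME type; a small, illustrative subset rather than an exhaustive list.
CONTENT_TYPES = {
    ".mp3": "audio/mp3",
    ".wav": "audio/wav",
    ".flac": "audio/flac",
    ".ogg": "audio/ogg",
}


def content_type_for(path):
    """Return the MIME type to pass as content_type for the given audio path."""
    suffix = Path(path).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported audio extension: {suffix!r}") from None


print(content_type_for("audios/long.mp3"))  # audio/mp3
```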

###### Write the transcript to a text file

```python
sentences = res["results"]
sentence_list = []
for sentence in sentences:
    # keep only sentences whose confidence is greater than 50%
    sentence_list.append(str(sentence["alternatives"][0]["transcript"]).strip() if sentence["alternatives"][0]["confidence"] > 0.5 else "")

# print(json.dumps(sentence_list, indent=2))
with open("files/speech.txt", "w") as writer:
    for line in sentence_list:
        # a result that is only a hesitation marker becomes a comma
        if line == "%HESITATION":
            writer.write(",")
        else:
            writer.write(line + " ")
print("DONE")
```
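
The filtering step can be tried without calling the service by running it on a hand-written response. The sketch below assumes the response has the same `results` / `alternatives` / `transcript` / `confidence` shape the code above reads. It also assumes `%HESITATION` can appear inside a transcript, not only as a whole result, so it replaces the marker word by word. The function name `build_transcript` and the sample data are hypothetical.

```python
def build_transcript(res, min_confidence=0.5, hesitation="%HESITATION"):
    """Join confident transcripts into one string, turning hesitation markers into commas."""
    parts = []
    for result in res.get("results", []):
        alternatives = result.get("alternatives") or []
        if not alternatives:
            continue
        best = alternatives[0]
        # A missing confidence is treated as 0.0, so such results are skipped.
        if best.get("confidence", 0.0) <= min_confidence:
            continue
        words = [
            "," if word == hesitation else word
            for word in best.get("transcript", "").split()
        ]
        parts.append(" ".join(words))
    return " ".join(parts)


# Made-up sample in the assumed response shape.
sample = {
    "results": [
        {"alternatives": [{"transcript": "hello %HESITATION world ", "confidence": 0.91}]},
        {"alternatives": [{"transcript": "mumbled words ", "confidence": 0.32}]},
    ]
}
print(build_transcript(sample))  # hello , world
```

Keeping this logic in a function separates it from the file writing, which makes the confidence threshold easy to adjust and test.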

###### All the code in one file `main.py`

````python
# importing packages
from ibm_watson import SpeechToTextV1, ApiException
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
import json

# service credentials (replace the placeholders with your own values)
api_key = "API_KEY"
url = "URL"

# setting the authentication
try:
    auth = IAMAuthenticator(api_key)
    stt = SpeechToTextV1(authenticator=auth)
    stt.set_service_url(url)
except ApiException as e:
    print(e)

# converting audio to text
with open("audios/long.mp3", "rb") as audio:
    res = stt.recognize(audio=audio, content_type="audio/mp3", model="en-AU_NarrowbandModel", continuous=True).get_result()

"""
* We get a Python list of results
* We loop through them and build sentences
"""
sentences = res["results"]
sentence_list = []
for sentence in sentences:
    # keep only sentences whose confidence is greater than 50%
    sentence_list.append(str(sentence["alternatives"][0]["transcript"]).strip() if sentence["alternatives"][0]["confidence"] > 0.5 else "")

# print(json.dumps(sentence_list, indent=2))
with open("files/speech.txt", "w") as writer:
    for line in sentence_list:
        # a result that is only a hesitation marker becomes a comma
        if line == "%HESITATION":
            writer.write(",")
        else:
            writer.write(line + " ")
print("DONE")
````

##### Changes 

Several models are available, and you can change the `model` argument in the code depending on what you want to achieve.

* Models URL [HERE](https://cloud.ibm.com/apidocs/speech-to-text?code=python)

* Speech To Text Docs [HERE](https://cloud.ibm.com/apidocs/speech-to-text)
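
To pick a replacement for `en-AU_NarrowbandModel`, it can help to filter the available models by language. The sketch below works on a hand-written list; the helper `models_for_language` and the sample entries are hypothetical. With the SDK, a similar list would presumably come from `stt.list_models().get_result()["models"]`, but treat that call and the field names as assumptions to verify against the docs linked above.

```python
def models_for_language(models, language):
    """Return the sorted names of models whose language tag equals `language`."""
    return sorted(m["name"] for m in models if m.get("language") == language)


# Hand-written sample in the assumed shape; live data would come from the service.
sample_models = [
    {"name": "en-AU_NarrowbandModel", "language": "en-AU", "rate": 8000},
    {"name": "en-AU_BroadbandModel", "language": "en-AU", "rate": 16000},
    {"name": "en-US_BroadbandModel", "language": "en-US", "rate": 16000},
]
print(models_for_language(sample_models, "en-AU"))
# ['en-AU_BroadbandModel', 'en-AU_NarrowbandModel']
```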

#### Why this simple Application.

This program was built for practical purposes

### Credits:

* [None](https//localhost)
