{"id":23093088,"url":"https://github.com/crispengari/speech-to-text-python-ibm_watson","last_synced_at":"2026-05-16T08:43:25.410Z","repository":{"id":141095026,"uuid":"328586118","full_name":"CrispenGari/speech-to-text-python-ibm_watson","owner":"CrispenGari","description":"This is a simple Artificial Intelligence Application that converts audios to speech using `ibm_watson`.","archived":false,"fork":false,"pushed_at":"2021-01-11T08:29:49.000Z","size":12115,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-03T18:52:03.453Z","etag":null,"topics":["ai","ibm-cloud","ibm-watson","jupiter-notebook","machine-learning","python","python2","python3"],"latest_commit_sha":null,"homepage":"","language":"PowerShell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CrispenGari.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-01-11T07:44:49.000Z","updated_at":"2021-01-11T08:31:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"628b3e3b-76b1-4e69-8e28-c03f537598f3","html_url":"https://github.com/CrispenGari/speech-to-text-python-ibm_watson","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/CrispenGari/speech-to-text-python-ibm_watson","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrispenGari%2Fspeech-to-text-python-ibm_watson","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrispenGari%2Fspeech-to-text-python-ibm_watson/tags","releases_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrispenGari%2Fspeech-to-text-python-ibm_watson/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrispenGari%2Fspeech-to-text-python-ibm_watson/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CrispenGari","download_url":"https://codeload.github.com/CrispenGari/speech-to-text-python-ibm_watson/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CrispenGari%2Fspeech-to-text-python-ibm_watson/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266041915,"owners_count":23867958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ibm-cloud","ibm-watson","jupiter-notebook","machine-learning","python","python2","python3"],"created_at":"2024-12-16T21:46:27.510Z","updated_at":"2026-05-16T08:43:20.383Z","avatar_url":"https://github.com/CrispenGari.png","language":"PowerShell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Speech To Text (stt)\n\nThis is a simple Artificial Intelligence application that converts audio to text using `ibm_watson`.\n\n\n#### Application capabilities\nThis app is capable of:\n* reading an audio file from the `audios` folder\n* converting the audio to text using `ibm_watson SpeechToTextV1()`\n* writing the transcript to an external file `speech.txt` in the `files` folder\n\n### Getting started\n##### Installation\n###### First you need to install `ibm_watson`\n````shell\n$ pip install ibm_watson\n````\n###### Second you need to 
install `PyJWT==1.7.1`\n```shell\n$ pip install PyJWT==1.7.1\n```\nThen you are ready to go.\n##### Getting an API key and service URL\nTo get the API key and service URL, go to [IBM WATSON](https://cloud.ibm.com/catalog/services/)\n* Create an account, or log in if you already have one\n* Go to services\n* Go to AI\n* Look for Speech To Text and click it\n* Create a new project\n* Look for the API key in the docs\n\n###### Importing packages\n\n````\nfrom ibm_watson import SpeechToTextV1, ApiException\nfrom ibm_cloud_sdk_core.authenticators import IAMAuthenticator\nimport json\n````\n###### Key variables\n```\nurl = \"URL\"\napi_key = \"API_KEY\"\n\n```\n\n###### Setting the authentication\n```\ntry:\n    auth = IAMAuthenticator(api_key)\n    stt = SpeechToTextV1(authenticator=auth)\n    stt.set_service_url(url)\nexcept ApiException as e:\n    print(e)\n```\n###### Converting audio to text\n```` \nwith open(\"audios/long.mp3\", \"rb\") as audio:\n    res = stt.recognize(audio=audio, content_type=\"audio/mp3\", model=\"en-AU_NarrowbandModel\", continuous=True).get_result()\n````\n\n###### Writing the transcript to a text file\n\n```\nsentences = res[\"results\"]\nsentence_list = []\nfor sentence in sentences:\n    # keep a sentence only if its confidence is greater than 50%\n    sentence_list.append(str(sentence[\"alternatives\"][0][\"transcript\"]).strip() if sentence[\"alternatives\"][0][\"confidence\"] \u003e 0.5 else \"\")\n\n# print(json.dumps(sentence_list, indent=2))\n\nwith open(\"files/speech.txt\", \"w\") as writer:\n    for line in sentence_list:\n        if line == \"%HESITATION\":\n            writer.write(\",\")\n        else:\n            writer.write(line+\" \")\n\nprint(\"DONE\")\n```\n\n###### All the code in one file `main.py`\n\n````\n\n# importing packages\nfrom ibm_watson import SpeechToTextV1, ApiException\nfrom ibm_cloud_sdk_core.authenticators import IAMAuthenticator\nimport json\n\n# service credentials\nurl = \"URL\"\napi_key = \"API_KEY\"\n\n# 
Setting the authentication\ntry:\n    auth = IAMAuthenticator(api_key)\n    stt = SpeechToTextV1(authenticator=auth)\n    stt.set_service_url(url)\nexcept ApiException as e:\n    print(e)\n\n# converting audio to text\nwith open(\"audios/long.mp3\", \"rb\") as audio:\n    res = stt.recognize(audio=audio, content_type=\"audio/mp3\", model=\"en-AU_NarrowbandModel\", continuous=True).get_result()\n\n\n\"\"\"\n* We are getting a Python list of results\n* We want to loop through them and create sentences\n\"\"\"\n\nsentences = res[\"results\"]\nsentence_list = []\nfor sentence in sentences:\n    # keep a sentence only if its confidence is greater than 50%\n    sentence_list.append(str(sentence[\"alternatives\"][0][\"transcript\"]).strip() if sentence[\"alternatives\"][0][\"confidence\"] \u003e 0.5 else \"\")\n\n# print(json.dumps(sentence_list, indent=2))\n\nwith open(\"files/speech.txt\", \"w\") as writer:\n    for line in sentence_list:\n        if line == \"%HESITATION\":\n            writer.write(\",\")\n        else:\n            writer.write(line+\" \")\n\nprint(\"DONE\")\n````\n##### Changes \nThere is a list of models, and you can change the code based on what you want to achieve\n* Models URL [HERE](https://cloud.ibm.com/apidocs/speech-to-text?code=python)\n* Speech To Text Docs [HERE](https://cloud.ibm.com/apidocs/speech-to-text)\n#### Why this simple Application.\n\nThis program was built for practical purposes\n\n### Credits:\n* [None](https//localhost)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrispengari%2Fspeech-to-text-python-ibm_watson","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrispengari%2Fspeech-to-text-python-ibm_watson","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrispengari%2Fspeech-to-text-python-ibm_watson/lists"}