{"id":28909214,"url":"https://github.com/tekyaygilfethi/googlespeechtotextpythonimplementation","last_synced_at":"2025-06-28T21:35:13.140Z","repository":{"id":50972135,"uuid":"491132907","full_name":"TekyaygilFethi/GoogleSpeechToTextPythonImplementation","owner":"TekyaygilFethi","description":"Google API Sppech To Text Python Implementation","archived":false,"fork":false,"pushed_at":"2022-05-11T15:35:35.000Z","size":10,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2023-03-06T05:42:44.145Z","etag":null,"topics":["google","python","speech-to-text"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TekyaygilFethi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-05-11T13:53:55.000Z","updated_at":"2023-02-14T03:33:10.000Z","dependencies_parsed_at":"2022-09-11T01:32:26.868Z","dependency_job_id":null,"html_url":"https://github.com/TekyaygilFethi/GoogleSpeechToTextPythonImplementation","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"purl":"pkg:github/TekyaygilFethi/GoogleSpeechToTextPythonImplementation","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TekyaygilFethi%2FGoogleSpeechToTextPythonImplementation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TekyaygilFethi%2FGoogleSpeechToTextPythonImplementation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TekyaygilFethi%2FGoogleSpeechToTextPythonImplementation/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/TekyaygilFethi%2FGoogleSpeechToTextPythonImplementation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TekyaygilFethi","download_url":"https://codeload.github.com/TekyaygilFethi/GoogleSpeechToTextPythonImplementation/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TekyaygilFethi%2FGoogleSpeechToTextPythonImplementation/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261162073,"owners_count":23118221,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["google","python","speech-to-text"],"created_at":"2025-06-21T17:08:12.492Z","updated_at":"2025-06-28T21:35:13.134Z","avatar_url":"https://github.com/TekyaygilFethi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Google Speech To Text Python Implementation\nThis module is an implementation of the Google Speech To Text API. Google Speech To Text supports multiple languages, such as English, Turkish, and German, when converting speech into text.\n\n# SETUP\n## Setting Up The Google Cloud Platform\nNOTE: If you already have the JSON credential file for your GCP project, skip to step 17!\n\n### GCP Access\n1. Log in or sign up to the Google Cloud Platform Console \u003ca href=\"https://cloud.google.com/\"\u003ehere\u003c/a\u003e.\n\n### Creating New Project\n2. 
Create a new project:\n\u003cimg width=\"1394\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167868246-7ed8f796-ba2e-4ea4-8ba9-df2ce2e3f95d.png\"\u003e\n\n### Enabling Speech To Text API\n3. Navigate to your newly created project:\n\u003cimg width=\"1392\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167869636-31d690d0-92d6-4e5e-b5e6-111e44bf95b5.png\"\u003e\n\n4. Navigate to APIs \u0026 Services:\n\u003cimg width=\"1392\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167869160-ec5d3ac2-ce22-484c-96f4-6fe529b18ba8.png\"\u003e\n\n5. Click the ENABLE APIS AND SERVICES button:\n\u003cimg width=\"1388\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167869866-c1ede81d-a635-4495-9cf6-e750dfa352c8.png\"\u003e\n\n6. Search for the Cloud Speech To Text API and select it:\n\u003cimg width=\"1391\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167870278-e6c783d1-0cd2-40cb-a2c8-22a0976d5ed3.png\"\u003e\n\n7. Click Enable (you need to add a payment method to enable this service, but don't worry: Google gives you free credits at the beginning):\n\u003cimg width=\"641\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167870513-07446259-1665-401d-a946-5d3596472376.png\"\u003e\n\n### Creating a New Service Account and Gathering The JSON file for Authorization\n8. Navigate to IAM \u0026 Admin -\u003e Service Accounts:\n\u003cimg width=\"1386\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167871032-cfbdaddd-b9a0-4621-a438-b8ff4d6b289e.png\"\u003e\n\n9. Click the Create Service Account button:\n\u003cimg width=\"947\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167871377-3cbcd63c-ace8-4f6f-b6a7-3bb3146b1fb3.png\"\u003e\n\n10. You can give the service account any name you want. 
For the 2nd step of service account creation, you may grant the \"Owner\" role as the image shows, but you can of course grant this service account any role you need:\n\u003cimg width=\"865\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167871814-5bd1cb3f-cb9c-4679-a079-5ff23981bba8.png\"\u003e\n\n11. For the 3rd step: after you entered your service account's name, Google suggested an email address for your newly created service account. Fill in both fields with this email address:\n\u003cimg width=\"566\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167872405-76404b00-2687-44d0-a082-7eef4320ab52.png\"\u003e\n\n12. You can now see that your service account has been created!\n\n13. Click the three dots on the right side of your service account and select the Manage Keys option:\n\u003cimg width=\"1119\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167872941-1cabd80f-8a07-42f8-ae8a-d3c79a8da6bc.png\"\u003e\n\n14. Click Add Key -\u003e Create New Key:\n\u003cimg width=\"918\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167873204-3a418705-4241-402b-8406-83d2f53bb48b.png\"\u003e\n\n15. Select JSON and click Create:\n\u003cimg width=\"1021\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167873471-706f71ee-b070-495b-8f2f-e51d828471d4.png\"\u003e\n\n16. Your JSON file should now be downloaded to your PC. Keep it safe! We will use this JSON file for Google API authentication.\n\n### Setting Up GCP Storage\n\n17. Search for Storage in the search box and select Cloud Storage:\n\u003cimg width=\"1394\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167875497-e229c349-8c17-445b-84c0-352bf910548a.png\"\u003e\n\n18. Create a new bucket:\n\u003cimg width=\"1091\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167875783-69e02628-881f-45ee-aa5b-8d24b3e8fea4.png\"\u003e\n\n19. 
Name your bucket (the name must be globally unique) and click Next at every step until the bucket is created. You are now in your bucket:\n\u003cimg width=\"1089\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167876156-41696e6f-baa1-485a-87e0-07728bc2d32a.png\"\u003e\nThis bucket will hold the audio files that will be transcribed into text. You can upload your files here.\n\n\n20. I'm uploading my file named 'uzuntrim.wav' here. My audio file has the .wav extension.\n\n\n# Demo\n21. Clone the repository:\n```bash\ngit clone https://github.com/UserVision/GoogleSpeechToTextPythonImplementation.git\n```\n\n22. Create a new virtual environment:\n```bash\npython3 -m venv myvenv\n```\n\n23-1. Activate your virtual environment (FOR MAC):\n```bash\nsource myvenv/bin/activate\n```\n\n23-2. Activate your virtual environment (FOR WINDOWS, in PowerShell):\n```powershell\n.\\myvenv\\Scripts\\Activate.ps1\n```\n\nNOTE: If you get an error like this:\n\u003cimg width=\"1352\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167883817-9d48042c-99cb-4c64-b885-88dfd0ca7c7e.png\"\u003e\ndo the following:\n\n- Start Windows PowerShell as Administrator\n- Type ```Set-ExecutionPolicy RemoteSigned``` and press Enter\n- When PowerShell waits for your input, type ```A``` and press Enter\n\nNow you should be able to run the venv activation script.\n\n24. Install all the requirements with the following command:\n```bash\npip3 install -r requirements.txt\n```\n25. Add the JSON file you downloaded in step 16 to the main directory of the project.\n26. Add a .env file to the main directory of the project. This file should contain the name of your JSON file under the key JSON_NAME.\n```\nJSON_NAME={name}.json\n```\n27. In the ```main.py``` file, change this line:\n```python\naudio = dict(uri=\"gs://diarizationuv/uzuntrim.wav\")\n```\nso that it uses the URI of your own audio file in GCP Storage. 
To get the URI, find your audio file in Cloud Storage and click the three dots at the right of its row. Then choose the ```Copy gsutil URI``` option.\n\u003cimg width=\"935\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167887027-4dfb0b26-a2ce-41f3-9aad-347fa89d17b5.png\"\u003e\n\n28. If your recording is longer than 1 minute, use the ```speech_to_text_long``` function; if it does not exceed 1 minute, you may use the ```speech_to_text``` function. Both are in the ```main.py``` file.\n\n29. Set the ```language_code``` parameter in the config dictionary in the ```main.py``` file according to your audio's language. For example, if your audio is in Turkish, set the language_code parameter to ```tr-TR```.\n\n30. To run the Python script, enter the following command:\n```bash\npython3 main.py\n```\n31. You should now be able to see the results:\n\u003cimg width=\"393\" alt=\"image\" src=\"https://user-images.githubusercontent.com/28951869/167888024-077ef51b-2986-4dc6-ab5a-919e656ada9c.png\"\u003e\n\n32. CONGRATULATIONS! You have now converted your speech audio into text :)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftekyaygilfethi%2Fgooglespeechtotextpythonimplementation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftekyaygilfethi%2Fgooglespeechtotextpythonimplementation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftekyaygilfethi%2Fgooglespeechtotextpythonimplementation/lists"}