{"id":16420729,"url":"https://github.com/andrew-chen-wang/your-speech-recognition","last_synced_at":"2025-10-05T17:12:03.255Z","repository":{"id":103627605,"uuid":"352502942","full_name":"Andrew-Chen-Wang/Your-Speech-Recognition","owner":"Andrew-Chen-Wang","description":"UI for Making DeepSpeech Voice Recognition Fine-Tuned to Your Voice Easier","archived":false,"fork":false,"pushed_at":"2021-03-29T16:54:05.000Z","size":15,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-10T21:49:18.765Z","etag":null,"topics":["deepspeech","fine-tuning","voice-assistant","voice-recognition"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Andrew-Chen-Wang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-29T03:18:24.000Z","updated_at":"2021-03-29T16:54:07.000Z","dependencies_parsed_at":"2023-04-28T12:32:42.300Z","dependency_job_id":null,"html_url":"https://github.com/Andrew-Chen-Wang/Your-Speech-Recognition","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Andrew-Chen-Wang/Your-Speech-Recognition","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2FYour-Speech-Recognition","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2FYour-Speech-Recognition/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2FYour
-Speech-Recognition/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2FYour-Speech-Recognition/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Andrew-Chen-Wang","download_url":"https://codeload.github.com/Andrew-Chen-Wang/Your-Speech-Recognition/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Andrew-Chen-Wang%2FYour-Speech-Recognition/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278486308,"owners_count":25994945,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-05T02:00:06.059Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deepspeech","fine-tuning","voice-assistant","voice-recognition"],"created_at":"2024-10-11T07:29:00.947Z","updated_at":"2025-10-05T17:12:03.235Z","avatar_url":"https://github.com/Andrew-Chen-Wang.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Your Speech Recognition\n\nEasy way to fine tune [DeepSpeech](https://github.com/mozilla/DeepSpeech)\nvoice recognition to your voice by making a website to do it\nall for you. Why? 
Because I couldn't swear, it got distracted by the\nstatic in my voice due to mic issues (unlike Siri),\nand I wanted an automated way to collect training data rather than constantly\nfinding something to say and then writing it down.\n\nBy: [Andrew-Chen-Wang](https://github.com/Andrew-Chen-Wang)\n\nDate Created: 28 March 2020\n\n---\n### Requirements\n\n- Python 3.5-3.7 (Mac has this installed already)\n    - 3.7 upper bound due to TensorFlow\n    - If TensorFlow fails to install, try creating your virtualenv using \n      `python3.7 -m venv venv` or whichever other Python 3.x version you have installed.\n\nFor the voice recognition:\n- PortAudio\n\n---\n### Usage\n\n1. Create and activate a virtual environment by running the following in your terminal or command prompt, depending on your system:\n    - Windows: `virtualenv venv \u0026\u0026 venv\\Scripts\\activate`\n    - Mac/Linux: `virtualenv venv \u0026\u0026 source venv/bin/activate`\n    - If you need to use a specific version of Python instead of your default (perhaps \n      due to the upper Python version bound), then run `python3.7 -m venv venv` or \n      whichever other Python 3.x version you have installed.\n2. First you need to prepare some data for training. Run: \n   `python app.py`\n3. Open your browser, head to the website http://localhost:5000/ and follow\n   the instructions on the website.\n4. Now it's time for the \"machine learning.\" Depending on your system:\n    - Mac/Linux: `sh train.sh`\n    - Note: You can specify some other parameters via the scripts or by\n      yourself via your terminal/command-prompt.\n5. Finally, you can run the voice recognition software. 
Depending on your system:\n\nWindows:\n```shell\npython recognizer.py -w media --model output_models\\deepspeech-0.9.3-models.pbmm --scorer models\\deepspeech-0.9.3-models.scorer\n```\n\nMac/Linux:\n```shell\npython recognizer.py -w media \\\n--model output_models/deepspeech-0.9.3-models.pbmm \\\n--scorer models/deepspeech-0.9.3-models.scorer\n```\n\nNow, just talk and see the output working!\n\n---\n### TODO\n\n- Need to write the website for collecting user speech\n- Allow the website to also do \"VAD\" by splitting the recording into audio files at\n  intervals of 1.75 seconds. Unfortunately,\n  I don't think we can do automatic detection, so\n  people may think poorly of the software, but at \n  least it somewhat works.\n- Actually write the NT script for training.\n- Compare the amount of training time required for this to be pretty good, i.e. \n  check whether the marginal benefit of 30 minutes of training, an hour, etc. \n  is worth the extra effort.\n\n---\n### License\n\n```text\nCopyright 2021 Andrew Chen Wang\n\n   Licensed under the Apache License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrew-chen-wang%2Fyour-speech-recognition","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrew-chen-wang%2Fyour-speech-recognition","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrew-chen-wang%2Fyour-speech-recognition/lists"}