{"id":13467292,"url":"https://github.com/synesthesiam/voice2json","last_synced_at":"2025-05-16T10:06:59.999Z","repository":{"id":41442856,"uuid":"209681878","full_name":"synesthesiam/voice2json","owner":"synesthesiam","description":"Command-line tools for speech and intent recognition on Linux","archived":false,"fork":false,"pushed_at":"2024-03-07T10:30:03.000Z","size":32521,"stargazers_count":1104,"open_issues_count":38,"forks_count":64,"subscribers_count":25,"default_branch":"master","last_synced_at":"2025-04-09T04:07:10.837Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/synesthesiam.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-20T01:45:44.000Z","updated_at":"2025-04-08T03:24:31.000Z","dependencies_parsed_at":"2024-01-16T10:36:46.845Z","dependency_job_id":"cfb9c5ac-9cbf-4e7a-8723-f1018ea544c4","html_url":"https://github.com/synesthesiam/voice2json","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synesthesiam%2Fvoice2json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synesthesiam%2Fvoice2json/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synesthesiam%2Fvoice2json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/synesthesiam%2Fvoice2json/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/owners/synesthesiam","download_url":"https://codeload.github.com/synesthesiam/voice2json/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254509476,"owners_count":22082891,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T15:00:54.816Z","updated_at":"2025-05-16T10:06:54.990Z","avatar_url":"https://github.com/synesthesiam.png","language":"Python","funding_links":[],"categories":["Python","[:robot: machine-learning]([robot-machine-learning)](\u003chttps://github.com/stars/ketsapiwiq/lists/robot-machine-learning\u003e))"],"sub_categories":[],"readme":"![voice2json logo](docs/img/voice2json.svg)\n\n`voice2json` is a collection of [command-line tools](https://voice2json.org/commands.html) for \u003cstrong\u003eoffline speech/intent recognition\u003c/strong\u003e on Linux. It is free, open source ([MIT](https://opensource.org/licenses/MIT)), and [supports 18 human languages](#supported-languages). 
\n\n* [Getting Started](https://voice2json.org/#getting-started)\n* [Commands](https://voice2json.org/commands.html)\n    * [Data Formats](https://voice2json.org/formats.html)\n* [Profiles](https://github.com/synesthesiam/voice2json-profiles)\n* [Recipes](https://voice2json.org/recipes.html)\n* [Node-RED Plugin](https://github.com/johanneskropf/node-red-contrib-voice2json)\n* [About](https://voice2json.org/about.html)\n    * [Whitepaper](https://voice2json.org/whitepaper.html)\n\nFrom the command-line:\n\n```bash\n$ voice2json -p en transcribe-wav \\\n      \u003c turn-on-the-light.wav | \\\n      voice2json -p en recognize-intent | \\\n      jq .\n```\n\nproduces a [JSON event](https://voice2json.org/formats.html) like:\n\n```json\n{\n    \"text\": \"turn on the light\",\n    \"intent\": {\n        \"name\": \"LightState\"\n    },\n    \"slots\": {\n        \"state\": \"on\"\n    }\n}\n```\n\nwhen trained with this [template](https://voice2json.org/sentences.html):\n\n```\n[LightState]\nstates = (on | off)\nturn (\u003cstates\u003e){state} [the] light\n```\n\n`voice2json` is \u003cstrong\u003eoptimized for\u003c/strong\u003e:\n\n* Sets of voice commands that are described well [by a grammar](https://voice2json.org/sentences.html)\n* Commands with [uncommon words or pronunciations](https://voice2json.org/commands.html#pronounce-word)\n* Commands or intents that [can vary at runtime](#unique-features)\n\nIt can be used to:\n\n* Add voice commands to [existing applications or Unix-style workflows](https://voice2json.org/recipes.html#create-an-mqtt-transcription-service)\n* Provide basic [voice assistant functionality](https://voice2json.org/recipes.html#set-and-run-timers) completely offline on modest hardware\n* Bootstrap more [sophisticated speech/intent recognition systems](https://voice2json.org/recipes.html#train-a-rasa-nlu-bot)\n\nSupported speech to text systems include:\n\n* CMU's [pocketsphinx](https://github.com/cmusphinx/pocketsphinx)\n* Dan Povey's 
[Kaldi](https://kaldi-asr.org)\n* Mozilla's [DeepSpeech](https://github.com/mozilla/DeepSpeech) 0.9\n* Kyoto University's [Julius](https://github.com/julius-speech/julius)\n\n---\n\n## Supported Languages\n\n* Catalan (`ca`)\n    * [`ca-es_pocketsphinx-cmu`](https://github.com/synesthesiam/ca-es_pocketsphinx-cmu)\n* Czech (`cs`)\n    * [`cs-cz_kaldi-rhasspy`](https://github.com/rhasspy/cs_kaldi-rhasspy)\n* German (`de`)\n    * [`de_deepspeech-aashishag`](https://github.com/synesthesiam/de_deepspeech-aashishag)\n    * [`de_deepspeech-jaco`](https://github.com/rhasspy/de_deepspeech-jaco)\n    * [`de_kaldi-zamia`](https://github.com/synesthesiam/de_kaldi-zamia) (default)\n    * [`de_pocketsphinx-cmu`](https://github.com/synesthesiam/de_pocketsphinx-cmu)\n* Greek (`el`)\n    * [`el-gr_pocketsphinx-cmu`](https://github.com/synesthesiam/el-gr_pocketsphinx-cmu)\n* English (`en`)\n    * [`en-in_pocketsphinx-cmu`](https://github.com/synesthesiam/en-in_pocketsphinx-cmu)\n    * [`en-us_deepspeech-mozilla`](https://github.com/synesthesiam/en-us_deepspeech-mozilla)\n    * [`en-us_kaldi-rhasspy`](https://github.com/rhasspy/en-us_kaldi-rhasspy)\n    * [`en-us_kaldi-zamia`](https://github.com/synesthesiam/en-us_kaldi-zamia) (default)\n    * [`en-us_pocketsphinx-cmu`](https://github.com/synesthesiam/en-us_pocketsphinx-cmu)\n* Spanish (`es`)\n    * [`es_deepspeech-jaco`](https://github.com/rhasspy/es_deepspeech-jaco)\n    * [`es_kaldi-rhasspy`](https://github.com/rhasspy/es_kaldi-rhasspy) (default)\n    * [`es-mexican_pocketsphinx-cmu`](https://github.com/synesthesiam/es-mexican_pocketsphinx-cmu)\n    * [`es_pocketsphinx-cmu`](https://github.com/synesthesiam/es_pocketsphinx-cmu)\n* French (`fr`)\n    * [`fr_deepspeech-jaco`](https://github.com/rhasspy/fr_deepspeech-jaco)\n    * [`fr_kaldi-guyot`](https://github.com/synesthesiam/fr_kaldi-guyot) (default)\n    * [`fr_kaldi-rhasspy`](https://github.com/rhasspy/fr_kaldi-rhasspy)\n    * 
[`fr_pocketsphinx-cmu`](https://github.com/synesthesiam/fr_pocketsphinx-cmu)\n* Hindi (`hi`)\n    * [`hi_pocketsphinx-cmu`](https://github.com/synesthesiam/hi_pocketsphinx-cmu)\n* Italian (`it`)\n    * [`it_deepspeech-jaco`](https://github.com/rhasspy/it_deepspeech-jaco)\n    * [`it_deepspeech-mozillaitalia`](https://github.com/rhasspy/it_deepspeech-mozillaitalia) (default)\n    * [`it_kaldi-rhasspy`](https://github.com/rhasspy/it_kaldi-rhasspy)\n    * [`it_pocketsphinx-cmu`](https://github.com/synesthesiam/it_pocketsphinx-cmu)\n* Korean (`ko`)\n    * [`ko-kr_kaldi-montreal`](https://github.com/synesthesiam/ko-kr_kaldi-montreal)\n* Kazakh (`kz`)\n    * [`kz_pocketsphinx-cmu`](https://github.com/synesthesiam/kz_pocketsphinx-cmu)\n* Dutch (`nl`)\n    * [`nl_kaldi-cgn`](https://github.com/synesthesiam/nl_kaldi-cgn) (default)\n    * [`nl_kaldi-rhasspy`](https://github.com/rhasspy/nl_kaldi-rhasspy)\n    * [`nl_pocketsphinx-cmu`](https://github.com/synesthesiam/nl_pocketsphinx-cmu)\n* Polish (`pl`)\n    * [`pl_deepspeech-jaco`](https://github.com/rhasspy/pl_deepspeech-jaco) (default)\n    * [`pl_julius-github`](https://github.com/synesthesiam/pl_julius-github)\n* Portuguese (`pt`)\n    * [`pt-br_pocketsphinx-cmu`](https://github.com/synesthesiam/pt-br_pocketsphinx-cmu)\n* Russian (`ru`)\n    * [`ru_kaldi-rhasspy`](https://github.com/rhasspy/ru_kaldi-rhasspy) (default)\n    * [`ru_pocketsphinx-cmu`](https://github.com/synesthesiam/ru_pocketsphinx-cmu)\n* Swedish (`sv`)\n    * [`sv_kaldi-montreal`](https://github.com/synesthesiam/sv_kaldi-montreal)\n    * [`sv_kaldi-rhasspy`](https://github.com/rhasspy/sv_kaldi-rhasspy) (default)\n* Vietnamese (`vi`)\n    * [`vi_kaldi-montreal`](https://github.com/synesthesiam/vi_kaldi-montreal)\n* Mandarin (`zh`)\n    * [`zh-cn_pocketsphinx-cmu`](https://github.com/synesthesiam/zh-cn_pocketsphinx-cmu)\n\n---\n\n## Unique Features\n\n`voice2json` is more than just a wrapper around open source speech to text systems!\n\n* Training produces 
**both** a speech and intent recognizer. By describing your voice commands with `voice2json`'s [templating language](https://voice2json.org/sentences.html), you get [more than just transcriptions](https://voice2json.org/formats.html#intents) for free.\n* Re-training is **fast enough** to be done at runtime (usually \u003c 5s), even up to [millions of possible voice commands](https://voice2json.org/recipes.html#set-and-run-timers). This means you can change [referenced slot](https://voice2json.org/sentences.html#slot-references) values or [add/remove intents](https://voice2json.org/commands.html#intent-whitelist) on the fly.\n* All of the [available commands](#commands) are designed to work well in Unix pipelines, typically consuming/emitting plaintext or [newline-delimited JSON](http://jsonlines.org). Audio input/output is [file-based](https://voice2json.org/commands.html#audio-sources), so you can receive audio from [any source](https://voice2json.org/recipes.html#stream-microphone-audio-over-a-network).\n\n## Commands\n\n* [download-profile](https://voice2json.org/commands.html#download-profile) - Download missing files for a profile\n* [train-profile](https://voice2json.org/commands.html#train-profile) - Generate speech/intent artifacts\n* [transcribe-wav](https://voice2json.org/commands.html#transcribe-wav) - Transcribe WAV file to text\n    * Add `--open` for unrestricted speech to text\n* [transcribe-stream](https://voice2json.org/commands.html#transcribe-stream) - Transcribe live audio stream to text\n    * Add `--open` for unrestricted speech to text\n* [recognize-intent](https://voice2json.org/commands.html#recognize-intent) - Recognize intent from JSON or text\n* [wait-wake](https://voice2json.org/commands.html#wait-wake) - Listen to live audio stream for wake word\n* [record-command](https://voice2json.org/commands.html#record-command) - Record voice command from live audio stream\n* [pronounce-word](https://voice2json.org/commands.html#pronounce-word) - 
Look up or guess how a word is pronounced\n* [generate-examples](https://voice2json.org/commands.html#generate-examples) - Generate random intents\n* [record-examples](https://voice2json.org/commands.html#record-examples) - Generate and record speech examples\n* [test-examples](https://voice2json.org/commands.html#test-examples) - Test recorded speech examples\n* [show-documentation](https://voice2json.org/commands.html#show-documentation) - Run HTTP server locally with documentation\n* [print-profile](https://voice2json.org/commands.html#print-profile) - Print profile settings\n* [print-downloads](https://voice2json.org/commands.html#print-downloads) - Print profile file download information\n* [print-files](https://voice2json.org/commands.html#print-files) - Print user profile files for backup\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynesthesiam%2Fvoice2json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsynesthesiam%2Fvoice2json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsynesthesiam%2Fvoice2json/lists"}