{"id":18728767,"url":"https://github.com/rubyonworld/pocketsphinx-server","last_synced_at":"2025-07-17T07:03:07.370Z","repository":{"id":174008032,"uuid":"542163821","full_name":"RubyOnWorld/pocketsphinx-server","owner":"RubyOnWorld","description":"Ruby-based web service for speech recognition, using the PocketSphinx gstreamer module.","archived":false,"fork":false,"pushed_at":"2022-09-28T00:51:04.000Z","size":268,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-05-19T20:32:47.132Z","etag":null,"topics":["gstreamer","module","pocket","pocketsphinx","ruby","server","sphinx"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RubyOnWorld.png","metadata":{"files":{"readme":"README.rdoc","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-27T15:42:09.000Z","updated_at":"2022-09-28T01:33:58.000Z","dependencies_parsed_at":null,"dependency_job_id":"1664e8cd-b933-4182-b53f-ef04fec4007a","html_url":"https://github.com/RubyOnWorld/pocketsphinx-server","commit_stats":null,"previous_names":["rubyonworld/pocketsphinx-server"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/RubyOnWorld/pocketsphinx-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubyOnWorld%2Fpocketsphinx-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubyOnWorld%2Fpocketsphinx-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubyOnWorld%2Fpocketsphinx
-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubyOnWorld%2Fpocketsphinx-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RubyOnWorld","download_url":"https://codeload.github.com/RubyOnWorld/pocketsphinx-server/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RubyOnWorld%2Fpocketsphinx-server/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262699208,"owners_count":23350256,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gstreamer","module","pocket","pocketsphinx","ruby","server","sphinx"],"created_at":"2024-11-07T14:24:17.527Z","updated_at":"2025-06-30T02:34:42.872Z","avatar_url":"https://github.com/RubyOnWorld.png","language":"Ruby","readme":"= Introduction\n\nRuby-based web service for speech recognition, using the PocketSphinx GStreamer module.\n\n= Requirements\n\n* Ruby 1.8\n* Sinatra\n* Rack\n* Unicorn\n* PocketSphinx (NOTE: some features of the server require a patched PocketSphinx, see below)\n* Some acoustic and language models for PocketSphinx\n\n\n= Installing\n\n== CMU Sphinx\n\n* Install sphinxbase from SVN (make, make install)\n\n=== Apply PocketSphinx patch\n\nIn the cmusphinx/pocketsphinx directory:\n\n  wget http://www.phon.ioc.ee/~tanela/ps_gst.patch\n  patch  -p0 -i ps_gst.patch\n\n\nMake sure you have the GStreamer development packages installed. 
On Debian Squeeze:\n\n  apt-get install libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev\n  \nThen run configure, make, make install as usual.\n\n== Install Ruby gems: Unicorn and Sinatra, UUID tools, JSON, locale\n\nThis assumes you have ruby and rubygems installed.\n\nYou might want to do this as root:\n\n  gem install unicorn\n  gem install sinatra\n  gem install uuidtools\n  gem install json\n  gem install locale\n  \nInstall the ruby-gstreamer package (the package name might vary depending on your distribution):\n  \n  apt-get install libgst-ruby1.8\n\n== Additional tools\n\nThe English GF-based recognizer also needs:\n\n* libtext-unidecode-perl\n* Phonetisaurus, Phonetisaurus prebuilt model for English (http://code.google.com/p/phonetisaurus/downloads/detail?name=g014b2b.tgz)\n* Python\n\n\n== Run ruby-pocketsphinx-server\n\nClone the git repository:\n\n  git clone git://github.com/alumae/ruby-pocketsphinx-server.git\n  \nBefore executing, add `/usr/local/lib` to the path where GStreamer plugins are looked for:\n\n  export GST_PLUGIN_PATH=/usr/local/lib\n\n= Running\n\n unicorn -c unicorn.conf.rb config.ru\n \nIf you installed Unicorn as a Ruby gem, you might need to execute:\n \n /var/lib/gems/1.8/bin/unicorn -c unicorn.conf.rb config.ru\n \nTest the default configuration (English WSJ language model with HUB4 acoustic models), using a raw audio file in the PocketSphinx test directory \n(replace `$(POCKETSPHINX_DIR)` with the PocketSphinx source directory):\n\n  curl -T $(POCKETSPHINX_DIR)/test/data/wsj/n800_440c0207.wav -H \"Content-Type: audio/x-wav\"  \"http://localhost:8080/recognize\"\n  \nResponse should be:\n\n  {\n    \"status\": 0,\n    \"hypotheses\": [\n      {\n        \"utterance\": \"the agency isn't likely to take any action until the union's rank and file votes on the contract into three weeks\"\n      },\n      {\n        \"utterance\": \"the agency isn't likely to take any action until the union's rank and file puts on the contract into three weeks\"\n      },\n      
{\n        \"utterance\": \"the agency isn't likely to take any action until the union's rank and file funds from the contract into three weeks\"\n      },\n      {\n        \"utterance\": \"the agency isn't likely to take any action until the union's rank and file for from the contract into three weeks\"\n      },\n      {\n        \"utterance\": \"the agency isn't likely to take any action until the union's rank and file parts of the contract into three weeks\"\n      }\n    ],\n    \"id\": \"8686a37b5674cbdc63deb13f73de81a5\"\n  }\n\n\n= Configuration\n\n== Web service\n\nUnicorn configuration is in the file unicorn.conf.rb. See http://unicorn.bogomips.org/examples/unicorn.conf.rb for\nmore info. \n\n== Recognizer\n\nSee conf.yaml\n\n= Using the web service\n\nSome of the more advanced examples below are specific to the Estonian configuration.\n\n== Example 1\n\nRecord a sentence to a wav file, in mono (hit Ctrl-C when done speaking):\n\n rec -c 1 sentence.wav\n \n \nSend it to the web service:\n\n curl   -X POST --data-binary @sentence.wav -H \"Content-Type: audio/x-wav\"  http://localhost:8080/recognize\n\nOutput (encoded as JSON; the example uses Estonian models):\n\n  {\n    \"status\": 0,\n    \"hypotheses\": [\n      {\n        \"utterance\": [\n          \"t\\u00e4na on v\\u00e4ljas \\u00fcsna ilus ilm\"\n        ]\n      }\n    ],\n    \"id\": \"e30f54561135d681599915562d77d240\"\n  }\n \n== Example 2\n\nRecord a raw file using arecord:\n\n arecord --format=S16_LE  --file-type raw  --channels 1 --rate 16000 \u003e sentence2.raw\n\nSend it to the web service:\n \n curl -X POST --data-binary @sentence2.raw -H \"Content-Type: audio/x-raw-int; rate=16000\"  http://localhost:8080/recognize\n \n== Example 3\n\nRecord 5 seconds of audio and pipe it to curl, which streams it directly to the web service using PUT (and gets an almost instant response):\n\n arecord --format=S16_LE --file-type raw --channels 1 --rate 16000 --duration 5 | curl -vv -T - -H \"Content-Type: 
audio/x-raw-int; rate=16000\"  http://localhost:8080/recognize\n \n \n= Support for JSGF grammars\n\nUsers can use their own grammars to recognize certain sentences. The grammars should be in JSGF format.\n\nExample JSGF (let's call it robot.jsgf):\n\n #JSGF V1.0;\n  \n grammar robot;\n   \n public \u003ccommand\u003e = (liigu | mine ) [ ( üks | kaks | kolm | neli | viis ) meetrit ] (edasi | tagasi);\n \nNB! Grammars should be in the same charset that the server uses for its dictionary, which is currently latin-1 (sorry about that). \n \nUpload the JSGF file somewhere the server can fetch it, say http://www.example.com/robot.jsgf\n \nNow, let the server download and compile it:\n\n curl -vv  http://localhost:8080/fetch-lm?url=http://www.example.com/robot.jsgf\n\nThis should result in HTTP/1.1 200 OK.\n\nNow you can use the grammar to recognize a sentence that is accepted by the grammar:\n\n arecord --format=S16_LE --file-type raw --channels 1 --rate 16000 --duration 5 | \\\n curl -vv -T - -H \"Content-Type: audio/x-raw-int; rate=16000\"  http://localhost:8080/recognize?lm=http://www.example.com/robot.jsgf\n\nResult:\n \n {\n   \"status\": 0,\n   \"hypotheses\": [\n     {\n       \"utterance\": \"mine viis meetrit tagasi\"\n     }\n   ],\n   \"id\": \"9e3895e9ee0b5138e73c6fca30f51a58\"\n }\n\nIf you update the grammar on the server, you need to make the /fetch-lm request again, as the server doesn't check for changes every time\na recognition request is made (for efficiency reasons).\n\n= Support for GF grammars\n\nGF (Grammatical Framework) grammars are supported. \n\nA GF grammar must be compiled into a .pgf file. To upload it to the server, use the fetch-lm API call, e.g.:\n  \n  curl \"http://bark.phon.ioc.ee/speech-api/v1/fetch-lm?url=http://kaljurand.github.com/Grammars/grammars/pgf/Calc.pgf\u0026lang=Est\"\n  \nThe 'lang' parameter (defaults to 'Est') specifies the input languages of the grammar. 
Multiple comma-separated languages can be specified, e.g. lang=Est,Est2\n\nTo recognize with a GF grammar, use a similar request as with JSGF, e.g.:\n\n  arecord --format=S16_LE --file-type raw --channels 1 --rate 16000 --duration 5 | curl -vv -T - -H \"Content-Type: audio/x-raw-int; rate=16000\"  \"http://localhost:8080/recognize?lm=http://kaljurand.github.com/Grammars/grammars/pgf/Calc.pgf\"\n  \nYou can also specify output language(s) that will be used to linearize the raw recognition result, e.g.:\n \n arecord --format=S16_LE --file-type raw --channels 1 --rate 16000 --duration 5 | curl -vv -T - -H \"Content-Type: audio/x-raw-int; rate=16000\"  \"http://localhost:8080/recognize?lm=http://kaljurand.github.com/Grammars/grammars/pgf/Calc.pgf\u0026output-lang=App\"\n \nOutput:\n\n {\n  \"status\": 0,\n  \"hypotheses\": [\n    {\n      \"utterance\": \"viis minutit sekundites\",\n      \"linearizations\": [\n        {\n          \"lang\": \"App\",\n          \"output\": \"5 ' IN \\\"\"\n        },\n        {\n          \"lang\": \"App\",\n          \"output\": \"5 min IN s\"\n        }\n      ]\n    }\n  ],\n  \"id\": \"83486feaca30995401ed4a66951a3f23\"\n }\n  \nMultiple output languages can be given as comma-separated values: \"..\u0026output-lang=App,App2\"\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frubyonworld%2Fpocketsphinx-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frubyonworld%2Fpocketsphinx-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frubyonworld%2Fpocketsphinx-server/lists"}