{"id":13741571,"url":"https://github.com/cmusphinx/sphinxtrain","last_synced_at":"2025-05-11T07:22:08.360Z","repository":{"id":15774524,"uuid":"18513555","full_name":"cmusphinx/sphinxtrain","owner":"cmusphinx","description":"Acoustic model trainer for CMU Sphinx","archived":false,"fork":false,"pushed_at":"2024-12-11T13:10:20.000Z","size":15884,"stargazers_count":185,"open_issues_count":17,"forks_count":113,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-05-08T21:36:24.555Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Roff","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cmusphinx.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2014-04-07T10:32:58.000Z","updated_at":"2025-04-27T03:03:06.000Z","dependencies_parsed_at":"2024-12-11T14:31:27.987Z","dependency_job_id":null,"html_url":"https://github.com/cmusphinx/sphinxtrain","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmusphinx%2Fsphinxtrain","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmusphinx%2Fsphinxtrain/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmusphinx%2Fsphinxtrain/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cmusphinx%2Fsphinxtrain/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cmusphinx","download_url":"https://codeload.github.com/cmusphinx/
sphinxtrain/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253530007,"owners_count":21922787,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T04:01:00.414Z","updated_at":"2025-05-11T07:22:08.340Z","avatar_url":"https://github.com/cmusphinx.png","language":"Roff","funding_links":[],"categories":["Software"],"sub_categories":["Utilities"],"readme":"SphinxTrain 5.0.0\n=================\n\nThis is SphinxTrain, Carnegie Mellon University's open source acoustic\nmodel trainer. This directory contains the scripts and instructions\nnecessary for building models for the CMU Sphinx Recognizer.\n\nThis distribution is free software, see LICENSE for licence.\n\nFor up-to-date information, please see the web site at\n\n   https://cmusphinx.github.io\n\nAmong the interesting resources there, you will find a link to\n\"Resources to build a recognition system\", with pointers to a\ndictionary, audio data, acoustic model etc.\n\nFor an introduction to training the acoustic model, see the tutorial\n\nhttps://cmusphinx.github.io/wiki/tutorialam\n\nInstallation Guide:\n-------------------\n\nThis section contains installation guides for various platforms.\n\nAll Platforms:\n--------------\n\nYou will unfortunately need both Perl and Python to use the scripts\nprovided. Linux usually comes with some version of Perl and Python. If\nyou do not have Perl installed, please check:\n\nhttp://www.perl.org\n\nwhere you can download it for free. 
For Windows, if you insist on not\nusing Windows Subsystem for Linux, a popular version, ActivePerl, is\navailable from ActiveState at:\n\nhttps://www.activestate.com/products/perl/\n\nPython for Windows can be obtained from:\n\nhttp://www.python.org/download/\n\nFor some advanced techniques (which are not enabled by default) you\nwill need NumPy and SciPy.  Packages for NumPy and SciPy can be\nobtained from:\n\nhttp://scipy.org/Download\n\nOr you can use Anaconda which makes all of this somewhat easier:\n\nhttps://www.anaconda.com/products/distribution\n\nIf you wish to use the grapheme-to-phoneme support, you will need\nrather specific versions of\n[OpenFST](https://www.openfst.org/twiki/bin/view/FST/WebHome) and\n[OpenGRM\nNGram](https://www.opengrm.org/twiki/bin/view/GRM/NGramLibrary).  It\nis known to work with OpenFST 1.6.3, and known *not* to work with\n1.8.2. There is probably nothing you want in the latest version\nanyway, and compiling it will consume several hours of your life and\nseveral gigabytes of your disk for no good reason, so best to just use\nwhat Ubuntu 20.04 LTS or 22.04 LTS will install for you with:\n\n    apt install libfst-dev libngram-dev\n\nSee the note about `-DBUILD_G2P=ON` below to enable G2P support.\n\nLinux/Unix Installation:\n------------------------\n\nThis distribution uses CMake to find out basic information about your\nsystem, and should compile on most Unix and Unix-like systems, and\ncertainly on Linux.  
On reasonable Linux distributions, a suitable\nversion of CMake (at least 3.14) can be installed with your package\nmanager, or may already be there if you have installed development\ntools.\n\nOn certain unreasonable distributions that are far too often installed\non \"enterprise\" or \"cloud\" or HPC systems, the version of CMake is\nincredibly ancient, and the package manager will not help you, so you\nwill have to install it manually, following the instructions at\nhttps://cmake.org/download/\n\nTo build, simply run:\n\n    cmake -S . -B build\n    cmake --build build\n\nThis should configure everything automatically. The code has been\ntested with gcc.\n\nTo enable G2P, you need to add a magic incantation to the first\ncommand above, namely:\n\n    cmake -S . -B build -DBUILD_G2P=ON\n    \nYou can also enable shared libraries with `-DBUILD_SHARED_LIBS=ON`,\nbut I suggest that you *not* do that unless you have a very good\nreason.\n\nYou do not need to install SphinxTrain to run it, simply run\n`scripts/sphinxtrain` from the source directory when initializing a\ntraining directory.  Note that you do need to build and install\nPocketSphinx for evaluation to work properly, however.\n\nYou can also install SphinxTrain system-wide if you so desire:\n\n    sudo cmake --build build --target install\n\nThis will put various files in `/usr/local/lib`,\n`/usr/local/libexec/sphinxbase` and `/usr/local/share/sphinxbase` and\ncreate `/usr/local/bin/sphinxbase`.\n\nAlso, check the section titled \"All Platforms\" above.\n\nWindows Installation:\n---------------------\n\nYou can build with Visual Studio Code using the C++ and CMake\nextensions.  This will create all the binaries in `build\\Debug` or\n`build\\Release` depending on the configuration you select.  
As above,\nyou can run `python ..\\sphinxtrain\\scripts\\sphinxtrain` (or whatever\nthe path is to `scripts\\sphinxtrain` in your source directory) to set\nup and run training.\n\nNote that you will need to have Perl on your path, among other things,\nand also, note that none of this has been tested, so we suggest you just\nuse [Windows Subsystem for\nLinux](https://learn.microsoft.com/en-us/windows/wsl/install), which\nis really a lot faster and easier to use than the native Windows\ncommand-line.\n\nIf you are using Windows Subsystem for Linux, the installation\nprocedure is identical to the Unix installation.\n\nAlso, check the section titled \"All Platforms\" above.\n\nAcknowledgments\n---------------\n\nThe development of this code has included support at different times\nby various United States Government agencies, under different programs,\nincluding the Defense Advanced Research Projects Agency (DARPA) and the\nNational Science Foundation (NSF). We are grateful for their support.\n\nThis work was built over a large number of years at CMU by most of the\npeople in the Sphinx Group. Some code goes back to 1986. The most\nrecent work in tidying this up for release includes the following,\nlisted alphabetically (at least these are the people who are most\nlikely able to help you).\n\n- Alan W Black (awb@cs.cmu.edu)\n- Arthur Chan (archan@cs.cmu.edu)\n- Evandro Gouvea (egouvea+@cs.cmu.edu)\n- Ricky Houghton (ricky.houghton@cs.cmu.edu)\n- David Huggins-Daines (dhdaines@gmail.com)\n- Kevin Lenzo (kevinlenzo@gmail.com)\n- Ravi Mosur\n- Long Qin (lqin@cs.cmu.edu)\n- Rita Singh (rsingh+@cs.cmu.edu)\n- Eric Thayer\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmusphinx%2Fsphinxtrain","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcmusphinx%2Fsphinxtrain","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcmusphinx%2Fsphinxtrain/lists"}