https://github.com/synesthesiam/pt-synesthesiam
CMU Sphinx acoustic model for Portuguese (pt-br)
pocketsphinx portuguese speech-recognition
- Host: GitHub
- URL: https://github.com/synesthesiam/pt-synesthesiam
- Owner: synesthesiam
- License: mit
- Created: 2019-09-08T16:34:06.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-09-08T16:41:29.000Z (about 6 years ago)
- Last Synced: 2025-02-09T13:11:31.978Z (8 months ago)
- Topics: pocketsphinx, portuguese, speech-recognition
- Language: Jupyter Notebook
- Size: 2.46 MB
- Stars: 2
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Portuguese Acoustic Model for CMU Sphinx
This is an attempt to train a CMU Sphinx acoustic model for Portuguese (pt-br) using the datasets found in [an ASR study for Brazilian Portuguese](https://github.com/igormq/asr-study). The resulting model is used in [Rhasspy](https://github.com/synesthesiam/rhasspy).
Following the [sphinxtrain tutorial](https://cmusphinx.github.io/wiki/tutorialam/), a continuous 200-senone acoustic model was trained on approximately 8 hours of speech data. No attempt was made to validate or clean the data, since I don't know Portuguese. The trained model performs poorly, as expected with so little data, but may be useful enough for a very constrained domain.
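
The sphinxtrain tutorial expects two control files per dataset: a `.fileids` list of audio paths without extensions, and a `.transcription` file whose lines look like `<s> words </s> (utterance_id)`. The sketch below shows one way such lines could be produced. It is not the code used for this model: the function name `make_control_lines`, the input pairs, and the lowercasing and whitespace normalization are all assumptions made for illustration.

```python
from pathlib import PurePosixPath


def make_control_lines(utterances):
    """Build sphinxtrain-style fileids and transcription lines.

    `utterances` is a hypothetical iterable of (relative_wav_path, transcript)
    pairs; the real datasets may be organized differently.
    """
    fileids, transcripts = [], []
    for wav_path, text in utterances:
        # fileids entries are relative paths with the extension removed
        rel = PurePosixPath(wav_path).with_suffix("")
        fileids.append(str(rel))
        # Assumption: lowercase and collapse whitespace so the words can be
        # matched against a lowercase pronunciation dictionary
        words = " ".join(text.lower().split())
        # The utterance id in parentheses is the file's base name
        transcripts.append(f"<s> {words} </s> ({rel.name})")
    return fileids, transcripts


# Made-up example input, not taken from the actual training data
fileids, transcripts = make_control_lines([
    ("speaker_1/file_1.wav", "Bom dia"),
    ("speaker_2/file_2.wav", "  boa   noite "),
])
print("\n".join(fileids))
print("\n".join(transcripts))
```

Each list would then be written to the matching `etc/<db>_train.fileids` or `etc/<db>_train.transcription` file described in the tutorial, keeping the two files in the same utterance order.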