Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cmusphinx/pocketsphinx-python
Python module installed with setup.py
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/cmusphinx/pocketsphinx-python
- Owner: cmusphinx
- License: other
- Archived: true
- Fork: true (bambocher/pocketsphinx-python)
- Created: 2014-12-16T18:01:33.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2022-06-29T15:22:48.000Z (over 2 years ago)
- Last Synced: 2024-08-03T04:08:11.716Z (6 months ago)
- Language: Python
- Homepage:
- Size: 24.5 MB
- Stars: 338
- Watchers: 25
- Forks: 81
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- low-resource-languages - pocketsphinx-python - Python module installed with setup.py. (Software / Utilities)
README
# Pocketsphinx Python
**This module is no longer relevant, and is being archived. Python bindings are included in the [pocketsphinx](https://github.com/cmusphinx/pocketsphinx) module. Alternatively, you may consider using [bambocher/pocketsphinx-python](https://github.com/bambocher/pocketsphinx-python)**
Pocketsphinx is a part of the [CMU Sphinx](http://cmusphinx.sourceforge.net) Open Source Toolkit For Speech Recognition.
This package provides a Python interface to the CMU [Sphinxbase](https://github.com/cmusphinx/sphinxbase) and [Pocketsphinx](https://github.com/cmusphinx/pocketsphinx) libraries, created with [SWIG](http://www.swig.org) and [Setuptools](https://setuptools.readthedocs.io).
## Supported platforms
* Windows
* Linux
* Mac OS X
## Installation
```
git clone --recursive https://github.com/cmusphinx/pocketsphinx-python/
cd pocketsphinx-python
python setup.py install
```
## Usage
### LiveSpeech
An iterator class for continuous recognition or keyword search from a microphone.
Note that this is not supported (yet) in macOS Big Sur.
```python
from pocketsphinx import LiveSpeech
for phrase in LiveSpeech(): print(phrase)
```
An example of a keyword search:
```python
from pocketsphinx import LiveSpeech

speech = LiveSpeech(lm=False, keyphrase='forward', kws_threshold=1e-20)
for phrase in speech:
    print(phrase.segments(detailed=True))
```
With your model and dictionary:
```python
import os
from pocketsphinx import LiveSpeech, get_model_path

model_path = get_model_path()

speech = LiveSpeech(
    verbose=False,
    sampling_rate=16000,
    buffer_size=2048,
    no_search=False,
    full_utt=False,
    hmm=os.path.join(model_path, 'en-us'),
    lm=os.path.join(model_path, 'en-us.lm.bin'),
    dic=os.path.join(model_path, 'cmudict-en-us.dict')
)

for phrase in speech:
    print(phrase)
```
### AudioFile
An iterator class for continuous recognition or keyword search from a file.
```python
from pocketsphinx import AudioFile
for phrase in AudioFile(): print(phrase) # => "go forward ten meters"
```
An example of a keyword search:
```python
from pocketsphinx import AudioFile

audio = AudioFile(lm=False, keyphrase='forward', kws_threshold=1e-20)
for phrase in audio:
    print(phrase.segments(detailed=True))  # => "[('forward', -617, 63, 121)]"
```
With your model and dictionary:
```python
import os
from pocketsphinx import AudioFile, get_model_path, get_data_path

model_path = get_model_path()
data_path = get_data_path()

config = {
    'verbose': False,
    'audio_file': os.path.join(data_path, 'goforward.raw'),
    'buffer_size': 2048,
    'no_search': False,
    'full_utt': False,
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}

audio = AudioFile(**config)
for phrase in audio:
    print(phrase)
```
Convert frame into time coordinates:
```python
from pocketsphinx import AudioFile

# Frames per second
fps = 100

for phrase in AudioFile(frate=fps):  # frate (default=100)
    print('-' * 28)
    print('| %5s |  %3s  |   %4s   |' % ('start', 'end', 'word'))
    print('-' * 28)
    for s in phrase.seg():
        print('| %4ss | %4ss | %8s |' % (s.start_frame / fps, s.end_frame / fps, s.word))
    print('-' * 28)

# ----------------------------
# | start |  end  |   word   |
# ----------------------------
# |  0.0s | 0.24s |      <s> |
# | 0.25s | 0.45s |    <sil> |
# | 0.46s | 0.63s |       go |
# | 0.64s | 1.16s |  forward |
# | 1.17s | 1.52s |      ten |
# | 1.53s | 2.11s |   meters |
# | 2.12s |  2.6s |     </s> |
# ----------------------------
```
### Pocketsphinx
A simple and flexible proxy class for `pocketsphinx.Decoder`.
```python
from pocketsphinx import Pocketsphinx
print(Pocketsphinx().decode()) # => "go forward ten meters"
```
A more comprehensive example:
```python
from __future__ import print_function
import os
from pocketsphinx import Pocketsphinx, get_model_path, get_data_path

model_path = get_model_path()
data_path = get_data_path()

config = {
    'hmm': os.path.join(model_path, 'en-us'),
    'lm': os.path.join(model_path, 'en-us.lm.bin'),
    'dict': os.path.join(model_path, 'cmudict-en-us.dict')
}

ps = Pocketsphinx(**config)
ps.decode(
    audio_file=os.path.join(data_path, 'goforward.raw'),
    buffer_size=2048,
    no_search=False,
    full_utt=False
)

print(ps.segments())  # => ['<s>', '<sil>', 'go', 'forward', 'ten', 'meters', '</s>']
print('Detailed segments:', *ps.segments(detailed=True), sep='\n')  # => [
#     word, prob, start_frame, end_frame
#     ('<s>', 0, 0, 24)
#     ('<sil>', -3778, 25, 45)
#     ('go', -27, 46, 63)
#     ('forward', -38, 64, 116)
#     ('ten', -14105, 117, 152)
#     ('meters', -2152, 153, 211)
#     ('</s>', 0, 212, 260)
# ]

print(ps.hypothesis())  # => go forward ten meters
print(ps.probability())  # => -32079
print(ps.score())  # => -7066
print(ps.confidence())  # => 0.04042641466841839

print(*ps.best(count=10), sep='\n')  # => [
#     ('go forward ten meters', -28034)
#     ('go for word ten meters', -28570)
#     ('go forward and majors', -28670)
#     ('go forward and meters', -28681)
#     ('go forward and readers', -28685)
#     ('go forward ten readers', -28688)
#     ('go forward ten leaders', -28695)
#     ('go forward can meters', -28695)
#     ('go forward and leaders', -28706)
#     ('go for work ten meters', -28722)
# ]
```
### Default config
If you create an instance of the `Pocketsphinx`, `AudioFile` or `LiveSpeech` class without passing any arguments, it uses the following default values:
```python
verbose = False
logfn = /dev/null or nul
audio_file = site-packages/pocketsphinx/data/goforward.raw
audio_device = None
sampling_rate = 16000
buffer_size = 2048
no_search = False
full_utt = False
hmm = site-packages/pocketsphinx/model/en-us
lm = site-packages/pocketsphinx/model/en-us.lm.bin
dict = site-packages/pocketsphinx/model/cmudict-en-us.dict
```
Any other option must be passed to the config under its usual name, without the leading `-`.
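As a rough illustration of this naming rule, the sketch below turns dashed option names (the form used with `DefaultConfig.set_string`, such as `-hmm`) into plain keyword names. The helper `strip_option_dashes` is hypothetical and not part of pocketsphinx, and the model path is a placeholder.

```python
def strip_option_dashes(options):
    """Return a copy of `options` with any leading '-' removed from each key.

    Hypothetical helper for illustration only; not part of pocketsphinx.
    """
    return {name.lstrip('-'): value for name, value in options.items()}


# Options written with the dashed names (placeholder values).
dashed = {
    '-hmm': 'model/en-us',        # placeholder path, adjust to your model
    '-keyphrase': 'forward',
    '-kws_threshold': 1e-20,
}

config = strip_option_dashes(dashed)
print(sorted(config))  # ['hmm', 'keyphrase', 'kws_threshold']
```

Under these assumptions, the resulting dictionary could be unpacked into a constructor call such as `Pocketsphinx(**config)`.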
To disable the default language model or dictionary, set the corresponding option to `False`:
```python
lm = False
dict = False
```
### Verbose
Send output to stdout:
```python
from pocketsphinx import Pocketsphinx

ps = Pocketsphinx(verbose=True)
ps.decode()

print(ps.hypothesis())
```
Send output to file:
```python
from pocketsphinx import Pocketsphinx

ps = Pocketsphinx(verbose=True, logfn='pocketsphinx.log')
ps.decode()

print(ps.hypothesis())
```
### Compatibility
Parent classes are still available:
```python
import os
from pocketsphinx import DefaultConfig, Decoder, get_model_path, get_data_path

model_path = get_model_path()
data_path = get_data_path()

# Create a decoder with a certain model
config = DefaultConfig()
config.set_string('-hmm', os.path.join(model_path, 'en-us'))
config.set_string('-lm', os.path.join(model_path, 'en-us.lm.bin'))
config.set_string('-dict', os.path.join(model_path, 'cmudict-en-us.dict'))
decoder = Decoder(config)

# Decode streaming data
buf = bytearray(1024)
with open(os.path.join(data_path, 'goforward.raw'), 'rb') as f:
    decoder.start_utt()
    while f.readinto(buf):
        decoder.process_raw(buf, False, False)
    decoder.end_utt()
print('Best hypothesis segments:', [seg.word for seg in decoder.seg()])
```
## Install development version
### Install requirements
Windows requirements:
* [Python](https://www.python.org/downloads)
* [Git](http://git-scm.com/downloads)
* [Swig](http://www.swig.org/download.html)
* [Visual Studio Community](https://www.visualstudio.com/ru-ru/downloads/download-visual-studio-vs.aspx)

Ubuntu requirements:
```shell
sudo apt-get install -qq python python-dev python-pip build-essential swig git libpulse-dev libasound2-dev
```
Mac OS X requirements:
```shell
brew reinstall swig python
```
### Install UPSTREAM version with pip
Note that this is NOT the same as the version in the cmusphinx GitHub repository.
```shell
pip install https://github.com/bambocher/pocketsphinx-python/archive/master.zip
```
### Install with distutils
```shell
git clone --recursive https://github.com/cmusphinx/pocketsphinx-python
cd pocketsphinx-python
python setup.py install
```
## Projects using pocketsphinx-python
* [SpeechRecognition](https://github.com/Uberi/speech_recognition) - Library for performing speech recognition, with support for several engines and APIs, online and offline.
## License
[The BSD License](https://github.com/bambocher/pocketsphinx-python/blob/master/LICENSE)