Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/connor-makowski/spych
Easily conduct voice recognition and use wake words to trigger events in python (using DeepSpeech)
https://github.com/connor-makowski/spych
python speech-recognition speech-to-text
Last synced: about 1 month ago
JSON representation
Easily conduct voice recognition and use wake words to trigger events in python (using DeepSpeech)
- Host: GitHub
- URL: https://github.com/connor-makowski/spych
- Owner: connor-makowski
- License: mit
- Created: 2022-01-28T22:01:17.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-30T14:55:09.000Z (over 2 years ago)
- Last Synced: 2024-11-18T20:08:20.487Z (about 2 months ago)
- Topics: python, speech-recognition, speech-to-text
- Language: Python
- Homepage:
- Size: 124 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
Spych
==========
Pronounced: SpeechPython wrapper for easily accessing the [DeepSpeech](https://github.com/mozilla/DeepSpeech/) python package via python (without the DeepSpeech CLI)
Documentation for Spych Functions
--------
Spych - https://connor-makowski.github.io/spych/core.htmlSpych Wake - https://connor-makowski.github.io/spych/wake.html
Key Features
--------- Simplified access to pretrained DeepSpeech models for offline and free speech transcription
Setup
----------Make sure you have Python 3.6.x (or higher) installed on your system. You can download it [here](https://www.python.org/downloads/).
### Installation
1. Install SoX
- On Debian/Ubuntu
```
sudo apt install sox
```
- On Mac (via homebrew)
```
brew install sox
```
- On windows (Recommend WSL)2. Install Spych
```
pip install spych
```3. Get DeepSpeech Model and Score files:
```sh
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
```# Examples
## Transcribe Existing Audio File
- [Spych Docs](https://connor-makowski.github.io/spych/core.html)
```py
from spych import spychspych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm', scorer_file='deepspeech-0.9.3-models.scorer')
# Convert the audio file to text
print('Transcription:')
print(spych_obj.stt(audio_file='test.wav'))
```
- Note: A `.wav` file at the same sample rate as your selected DeepSpeech models is processed the fastest## Record and Transcribe
- [Spych Docs](https://connor-makowski.github.io/spych/core.html)
```py
from spych import spychspych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm', scorer_file='deepspeech-0.9.3-models.scorer')
# Record using your default microphone for 3 seconds
print('Recording...')
my_audio_buffer=spych_obj.record(duration=3)
print('Recording Finished')# Convert the audio buffer to text
print('You said:')
print(spych_obj.stt(my_audio_buffer))
```## Process a Function After Hearing a Wake Word (Example Wake Word: `computer`)
- [Spych Wake Docs](https://connor-makowski.github.io/spych/wake.html)
- [Spych Docs](https://connor-makowski.github.io/spych/core.html)```py
from spych import spych, spych_wakemodel_file='deepspeech-0.9.3-models.pbmm'
scorer_file='deepspeech-0.9.3-models.scorer'spych_object=spych(model_file=model_file, scorer_file=scorer_file)
def my_function():
print("Listening...")
audio_buffer=spych_object.record(duration=3)M
print("You said:",spych_object.stt(audio_buffer=audio_buffer))listener=spych_wake(model_file=model_file, scorer_file=scorer_file, on_wake_fn=my_function, wake_word="computer")
listener.start()
```## Modifying the DeepSpeech Model
- Initialized `spych` objects contain a fully functional `DeepSpeech.Model` object inside of them
- You can modify this for each `spych` object any time after initialization
- `DeepSpeech.Model` options are documented [here](https://deepspeech.readthedocs.io/en/latest/Python-API.html#)Example:
```py
spych_obj=spych(model_file='deepspeech-0.9.3-models.pbmm')spych_obj.model.enableExternalScorer('deepspeech-0.9.3-models.scorer')
spych_obj.model.addHotWord('activate',10.0)
```# Rasberry Pi 4 Setup
1. Install system requirements
```
sudo apt install sox git python3-pip python3-scipy python3-numpy python3-pyaudio libatlas3-base
```2. Install python requirements
```
pip3 install spych
```3. Get the DeepSpeech model and score files (note Pi must use .tflite model file)
```sh
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.tflite
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
```4. Use the examples above substituting the original model file name for the `.tflite` one
- Depending on the memory available on you Pi, you may need to omit the scorer file.