Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alxpez/alts
100% free, local & offline voice assistant with speech recognition
- Host: GitHub
- URL: https://github.com/alxpez/alts
- Owner: alxpez
- Created: 2023-12-26T02:24:20.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-02-19T03:26:21.000Z (9 months ago)
- Last Synced: 2024-02-19T04:30:43.905Z (9 months ago)
- Topics: assistant, chatbot, llm, local, offline, ollama, speech-recognition, stt, tts, voice, voice-assistant, whisper
- Language: Python
- Homepage: https://github.com/alxpez/alts
- Size: 1.17 MB
- Stars: 30
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# alts
( 🎙️ listens | 💭 thinks | 🔊 speaks )
## 💬 about
100% free, local and offline assistant with speech recognition and talk-back functionalities.

## 🤖 default usage
ALTS runs in the background and waits for you to press `cmd+esc` (or `win+esc`).
- 🎙️ While holding the hotkey, your voice will be recorded _(saves in the project root)_.
- 💭 On release, the recording stops and a transcript is sent to the LLM _(the recording is deleted)_.
- 🔊 The LLM responses then get synthesized and played back to you _(also shown as desktop notifications)_.

You can modify the hotkey combination and other settings in your `config.yaml`.
> ALL processes are local and __NONE__ of your recordings or queries leave your environment; recordings are deleted as soon as they are used. It's __ALL PRIVATE__ by _default_.
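For intuition, here is a minimal sketch of the press-and-hold hotkey pattern using the [`keyboard`](https://github.com/boppreh/keyboard) package. This is not ALTS's actual code; the recording and transcription steps are hypothetical stubs:

```python
import keyboard  # pip install keyboard; needs admin rights on macOS/Linux

def on_hotkey():
    print("recording...")         # hypothetical: start capturing the mic

def on_release(event):
    if event.name == "esc":
        print("transcribing...")  # hypothetical: stop, transcribe, query the LLM

# Global hook: fires even when another window has focus
keyboard.add_hotkey("windows+esc", on_hotkey)
keyboard.on_release(on_release)
keyboard.wait()  # keep the assistant listening in the background
```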
## ⚙️ pre-requisites
- ### python
> tested with version \>=3.11 on macOS and version \>=3.8 on Windows

- ### llm
By default, the project is configured to work with [Ollama](https://ollama.ai/), running the [`stablelm2` model](https://ollama.ai/library/stablelm2) (a very tiny and quick model). This setup makes the whole system completely free to run locally and great for low-resource machines.

However, we use [LiteLLM](https://github.com/BerriAI/litellm) to stay provider-agnostic, so you have full freedom to pick and choose your own combinations.
Take a look at the supported [Models/Providers](https://docs.litellm.ai/docs/providers) for more details on LLM configuration.
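As a rough illustration of what provider-agnostic means here, LiteLLM lets you target the local Ollama model with a single model string. A sketch, not ALTS's actual invocation; the prompt is just an example:

```python
from litellm import completion  # pip install litellm

# Route the prompt to the local Ollama server (default: http://localhost:11434).
# "ollama/stablelm2" follows LiteLLM's <provider>/<model> naming convention.
response = completion(
    model="ollama/stablelm2",
    messages=[{"role": "user", "content": "What time is it?"}],
)
print(response.choices[0].message.content)
```

Swapping providers then only means changing the model string (and any required API keys), which is what lets ALTS keep the rest of its pipeline unchanged.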
> See `.env.template` and `config-template.yaml` for customizing your setup

- ### stt
We use openAI's `whisper` to transcribe your voice queries. It's a general-purpose speech recognition model.

You will need [`ffmpeg`](https://ffmpeg.org/) installed in your environment; you can [download](https://ffmpeg.org/download.html) it from the official site.
Make sure to check out their [setup](https://github.com/openai/whisper?tab=readme-ov-file#setup) docs for any other requirements.
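For reference, a minimal transcription sketch with `whisper` (the file name is illustrative, not part of ALTS):

```python
import whisper  # pip install openai-whisper; needs ffmpeg on PATH

# Load the general-purpose "base" checkpoint (downloads on first use)
model = whisper.load_model("base")

# Transcribe a recorded voice query from disk
result = model.transcribe("query.wav")
print(result["text"])
```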
> if you stumble into errors, one reason could be the model not downloading automatically. If that's the case, you can run a `whisper` example transcription in your terminal ([see examples](https://github.com/openai/whisper?tab=readme-ov-file#command-line-usage)) or manually download the model file and place it in the [correct folder](https://github.com/openai/whisper/discussions/63)

- ### tts
We use `coqui-TTS` for ALTS to talk back to you. It's a library for advanced Text-to-Speech generation.

You will need to install [`eSpeak-ng`](https://github.com/espeak-ng/espeak-ng) in your environment:
- macOS – `brew install espeak`
- linux – `sudo apt-get install espeak -y`
- windows – [download](https://github.com/espeak-ng/espeak-ng/releases) the executable from their repo
> on __windows__ you'll also need `Desktop development with C++` and `.NET desktop build tools`.
> Download the [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) and install these dependencies.

Make sure to check out their [setup](https://github.com/coqui-ai/TTS/tree/dev#installation) docs for any other requirements.
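The default model (`tts_models/en/vctk/vits`) can also be driven from Python via coqui's API. A minimal sketch, not ALTS's code, mirroring the CLI test below:

```python
from TTS.api import TTS  # pip install TTS (coqui-TTS)

# Load the default multi-speaker VITS model (downloads on first use)
tts = TTS(model_name="tts_models/en/vctk/vits")

# Synthesize with speaker p364, the same voice as the CLI setup test
tts.tts_to_file(
    text="this is a setup test",
    speaker="p364",
    file_path="test_output.wav",
)
```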
> if the configured model isn't already downloaded, it should download automatically during startup. However, if you run into problems, you can pre-download the default model by running the following:
> ```sh
> tts --text "this is a setup test" --out_path test_output.wav --model_name tts_models/en/vctk/vits --speaker_idx p364
> ```
> The default model has several "speakers" to choose from; running the following command will serve a demo site where you can test the different voices available:
> ```sh
> tts-server --model_name tts_models/en/vctk/vits
> ```

## ✅ get it running
clone the repo
```sh
git clone https://github.com/alxpez/alts.git
```

go to the main folder
```sh
cd alts/
```

install the project dependencies
```sh
pip install -r requirements.txt
```
> see the [pre-requisites](#%EF%B8%8F-pre-requisites) section to make sure your machine is ready to run ALTS

duplicate and rename the needed config files
```sh
cp config-template.yaml config.yaml
cp .env.template .env
```
> modify the default configuration to your needs

start up ALTS
```sh
sudo python alts.py
```
> the `keyboard` package requires admin privileges on macOS and Linux; this is not the case on Windows