https://github.com/volltin/xiaodou-bot

A simple voice-to-voice chatbot.

Host: GitHub
URL: https://github.com/volltin/xiaodou-bot
Owner: volltin
License: mit
Created: 2023-09-18T02:55:29.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-09-18T05:44:56.000Z (over 1 year ago)
Last Synced: 2024-11-06T04:19:17.423Z (3 months ago)
Topics: azure-speech-service, chatbot, chatgpt, openai, speech-to-text, text-to-speech
Language: Python
Homepage:
Size: 5.9 MB
Stars: 5
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Xiaodou: Voice-to-Voice Chatbot

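Xiaodou is a simple voice-to-voice chatbot. It waits for a spoken keyword, transcribes the user's speech with Azure Speech Service, sends the text to the OpenAI API, and speaks the reply aloud.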

## Installation

Install the required packages using one of the following methods:

- With `pip`:

```bash
pip install -r requirements.txt
```

- With `poetry` (recommended):

```bash
poetry install --without dev
```

## Configuration

### Audio Devices

To use a specific audio device, specify the input and output device names in `xiaodou/main.py`:

```python
# Input and output device names, if None, use default device
INPUT_DEVICE_NAME = None # Azure format
OUTPUT_DEVICE_NAME = None # pygame format
```

There are some scripts in `scripts/` to list the available audio devices for Azure SDK and pygame:

Example 1: on macOS, list the available audio devices:

```bash
cd scripts/macos_list_audio_devices
make run
```

Example 2: list the available audio devices using pygame:

```bash
cd scripts/pygame_list_audio_devices
make run
```
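Once one of the scripts above has printed the device names, the "`None` means default device" rule from `xiaodou/main.py` could be sketched as below. This is only an illustration: `resolve_device` and the sample device names are hypothetical and are not part of Xiaodou.

```python
from typing import Optional, Sequence


def resolve_device(name: Optional[str], available: Sequence[str]) -> Optional[str]:
    """Return the device name to open, or None to use the system default.

    Hypothetical helper: `available` is whatever list a device-listing
    script printed for the relevant backend (Azure SDK or pygame).
    """
    if name is None:
        # No explicit choice: let the backend pick its default device.
        return None
    if name in available:
        return name
    # Fail early with the valid choices instead of failing later at playback.
    raise ValueError(f"Unknown audio device {name!r}; available: {list(available)}")


if __name__ == "__main__":
    devices = ["Built-in Output", "USB Speaker"]  # made-up sample names
    print(resolve_device(None, devices))
    print(resolve_device("USB Speaker", devices))
```

Checking the name up front gives a clearer error than a failure deep inside the audio backend.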

### API Keys

Set the following environment variables in a `.env` file. See `.env.example` for a template:

```bash
OPENAI_API_TYPE="azure"
OPENAI_API_BASE="https://example.openai.azure.com/"
OPENAI_API_KEY="..."
OPENAI_API_VERSION="2023-03-15-preview"
AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4"
SPEECH_API_KEY="..."
SPEECH_SERVICE_REGION="..."
```
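Before starting the bot, it can help to confirm that every variable listed above is set and that no `"..."` placeholder remains. The following is a minimal standard-library sketch, not how Xiaodou itself loads its configuration; `parse_env_text` and `missing_keys` are hypothetical helper names.

```python
import os
from pathlib import Path
from typing import Dict, List

# The variable names come from the example above.
REQUIRED_KEYS = (
    "OPENAI_API_TYPE",
    "OPENAI_API_BASE",
    "OPENAI_API_KEY",
    "OPENAI_API_VERSION",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "SPEECH_API_KEY",
    "SPEECH_SERVICE_REGION",
)


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return required keys that are absent, empty, or still the "..." placeholder."""
    return [key for key in REQUIRED_KEYS if values.get(key, "") in ("", "...")]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env_text(env_path.read_text(encoding="utf-8")) if env_path.exists() else {}
    # Variables already exported in the shell take precedence over the file.
    values.update({k: os.environ[k] for k in REQUIRED_KEYS if k in os.environ})
    missing = missing_keys(values)
    print("Missing:", ", ".join(missing) if missing else "none")
```

The parser only handles the flat `KEY="value"` form shown above; multi-line values and variable expansion are out of scope for this sketch.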

### Keyword Model

The chatbot is activated upon hearing the keyword "小豆". The example keyword model is located in `xiaodou/models/`. For additional information, refer to [xiaodou/models/README.md](xiaodou/models/README.md).

## Usage

Start the chatbot with the following command:

```bash
python xiaodou/main.py
```

Once activated, you can begin conversing with the chatbot. The interaction flow is as follows:

```mermaid
sequenceDiagram
participant User
participant Bot
User->>Bot: Say the keyword
Bot->>User: Play notification sound
User->>Bot: Voice input (e.g. "Can you tell me a joke")
Bot->>Bot: Stop recording after user pause
Bot->>User: Play another notification sound
Bot->>Bot: Recognize voice with Azure Speech Service
Bot->>Bot: Send prompt to OpenAI API
Bot->>Bot: Receive response
Bot->>Bot: Synthesize response using Azure Speech Service
Bot->>User: Play synthesized voice (e.g. "Sure, here's a joke, ...")
User->>Bot: Repeat (starts with keyword)
```
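The same flow can be read as a single function that runs one turn. The sketch below is an assumption about structure, not the code in `xiaodou/main.py`: `run_turn` and all of its parameters are hypothetical, and each step is passed in as a callable so the ordering can be tried without a microphone or API keys.

```python
from typing import Callable, List


def run_turn(
    wait_for_keyword: Callable[[], None],
    play_sound: Callable[[str], None],
    record_until_pause: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    ask_model: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one keyword-to-reply turn in the order of the diagram; return the reply."""
    wait_for_keyword()            # user says the keyword
    play_sound("listening")       # first notification sound
    audio = record_until_pause()  # recording stops after the user pauses
    play_sound("processing")      # second notification sound
    prompt = transcribe(audio)    # speech-to-text (Azure Speech Service in the real bot)
    reply = ask_model(prompt)     # chat completion (OpenAI API in the real bot)
    speak(reply)                  # text-to-speech and playback
    return reply


if __name__ == "__main__":
    events: List[str] = []
    # Stand-in callables that only record what happened.
    reply = run_turn(
        wait_for_keyword=lambda: events.append("keyword"),
        play_sound=lambda name: events.append("sound:" + name),
        record_until_pause=lambda: b"fake-audio",
        transcribe=lambda audio: "Can you tell me a joke",
        ask_model=lambda prompt: "Sure, here's a joke, ...",
        speak=lambda text: events.append("speak:" + text),
    )
    print(events)
    print(reply)
```

Calling `run_turn` in a loop corresponds to the final "Repeat" step, since every turn starts by waiting for the keyword again.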

## Development

To contribute to Xiaodou, install the development dependencies and the pre-commit hooks:

```bash
poetry install
pre-commit install
```

## License

Xiaodou is released under the MIT License. See the [LICENSE](LICENSE) file for details.