https://github.com/assemblyai-solutions/async-chunk-py

Last synced: over 1 year ago
JSON representation

Host: GitHub
URL: https://github.com/assemblyai-solutions/async-chunk-py
Owner: AssemblyAI-Solutions
Created: 2024-09-19T17:25:10.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-11-14T20:23:52.000Z (over 1 year ago)
Last Synced: 2025-01-24T04:53:31.880Z (over 1 year ago)
Language: Python
Size: 33.2 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: readme.md

Awesome Lists containing this project

README

# AsyncChunkPy: Near-Realtime Python Speech-to-Text App

AsyncChunkPy is a Python application that provides near-realtime speech-to-text transcription using chunked audio processing and asynchronous transcription. It leverages the power of AssemblyAI's async transcription API to deliver high-quality transcriptions at near real-time speeds.

## Features

- Real-time audio recording and chunking
- Voice Activity Detection (VAD) for intelligent chunk processing
- Asynchronous transcription using AssemblyAI API
- Ordered transcript logging
- Configurable chunk size and silence threshold
- Support for multiple languages

## Key Benefits

- Access to AssemblyAI's powerful Universal-2 model for English and Universal-1 model for Spanish and German
- Support for all non-English languages available in AssemblyAI's async transcription service
- Higher accuracy compared to real-time transcription models
- More cost-effective than real-time transcription services
- Near real-time performance with the quality of async transcription

## Prerequisites

- [Python](https://www.python.org/) 3.7 or later
- [pip](https://pip.pypa.io/en/stable/installation/) (Python package installer)
- AssemblyAI API key (You can [sign up for an AssemblyAI account](https://www.assemblyai.com/app) and get your API key from your dashboard.)

## Installation

1. Clone the repository:
```
git clone https://github.com/AssemblyAI-Solutions/async-chunk-py.git
cd async-chunk-py
```

2. Install dependencies:
```
pip install -r requirements.txt
```

3. Create a `.env` file in the root directory and add your AssemblyAI API key:
```
ASSEMBLYAI_API_KEY=your_api_key_here
```

## Usage

1. Start the application:
```
python main.py
```

2. Speak into your microphone. The application will record and transcribe your speech in near-realtime.

3. Press Ctrl+C to stop the recording and see the final transcript.

## Configuration

You can modify the following parameters in `config.py`:

- `CHUNK_SIZE`: Size of each audio chunk in bytes
- `CHUNK_DURATION_MS`: Duration of each audio chunk in milliseconds (default: 5000ms)
- `SILENCE_THRESHOLD_MS`: Duration of silence required to trigger chunk processing (default: 600ms)

To change the language or enable language detection, modify the `transcription_worker.py` file:

- Set `language_code='en'` to the desired language code in the `transcribe` method, or
- Add `language_detection=True` to enable automatic language detection

## Voice Activity Detection (VAD)

This project uses the py-webrtcvad library for VAD. You can adjust VAD parameters by modifying the `Vad` configuration in `audio_recorder.py`. For more information on VAD parameters, visit the [py-webrtcvad GitHub repository](https://github.com/wiseman/py-webrtcvad).

## Project Structure

- `main.py`: Main application file handling coordination between audio recording and transcription.
- `audio_recorder.py`: Handles audio recording and Voice Activity Detection.
- `transcription_worker.py`: Worker for handling transcription tasks using AssemblyAI API.
- `config.py`: Configuration file for various parameters.

## Acknowledgments

- [py-webrtcvad](https://github.com/wiseman/py-webrtcvad) for the Voice Activity Detection functionality

## Troubleshooting

If you encounter any issues with audio recording or transcription, ensure that:

1. Your microphone is properly connected and selected as the input device.
2. Your AssemblyAI API key is correctly set in the `.env` file.
3. You have a stable internet connection for API communication.

For any other issues, please check the console output for error messages and refer to the documentation of the individual dependencies if needed.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/assemblyai-solutions/async-chunk-py

Awesome Lists containing this project

README