https://github.com/mxcaoalina/speech_recognition


Host: GitHub
URL: https://github.com/mxcaoalina/speech_recognition
Owner: mxcaoalina
Created: 2025-03-24T21:43:37.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-03-24T21:46:36.000Z (over 1 year ago)
Last Synced: 2025-06-18T15:49:18.135Z (about 1 year ago)
Language: Python
Size: 760 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Speech Recognition Project

This project is a set of speech recognition and audio processing examples built on the AssemblyAI and OpenAI APIs.

## Features

- Basic audio file processing (WAV/MP3)
- Real-time speech recognition
- Sentiment analysis of speech
- Podcast summarization
- Real-time transcription with OpenAI integration

## Prerequisites

- Python 3.8 or higher
- AssemblyAI API key
- OpenAI API key (only for the OpenAI-integrated features, such as `5.realtime-openai/`)

## Installation

1. Clone the repository:
```bash
git clone https://github.com/mxcaoalina/speech_recognition.git
cd speech_recognition
```

2. Install dependencies:
```bash
pip install -r requirements.txt
```

3. Set up environment variables:
Create a `.env` file in the root directory with the following content:
```
ASSEMBLYAI_API_KEY=your_assemblyai_api_key
OPENAI_API_KEY=your_openai_api_key
```
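To show how the keys in `.env` might reach the scripts, here is a minimal standard-library sketch. The helper names `load_env_file` and `require_key` are hypothetical and not taken from this repository; the project itself may well rely on a package such as `python-dotenv` instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file and export them.

    Hypothetical helper: values already present in the environment are
    left untouched, so a real shell export overrides the file.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    loaded = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def require_key(name):
    """Return the named variable or fail with a readable message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value
```

A script could call `load_env_file()` once at startup and then `require_key("ASSEMBLYAI_API_KEY")`, so a missing key fails early rather than in the middle of an API call.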

## Project Structure

- `1.basic framework/`: Basic audio file processing
- `2.simple recognition/`: Simple speech recognition
- `3.sentiment-analysis/`: Sentiment analysis of speech
- `4.podcast summarization/`: Podcast transcription and summarization
- `5.realtime-openai/`: Real-time transcription with OpenAI integration
- `shared/`: Shared modules and configuration

## Usage

### Basic Audio Processing
```bash
python "1.basic framework/load_mp3.py"
```
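For a sense of what basic audio file processing can look like, the sketch below writes a short test tone and reads back its properties with Python's built-in `wave` module. It is an illustration only, not the contents of `load_mp3.py`; the function names are hypothetical, and the standard library handles WAV only, so MP3 input would need a third-party decoder.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=0.5, sample_rate=16000, freq=440.0):
    """Write a mono 16-bit sine tone so there is a file to inspect."""
    n_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        frames = b"".join(
            struct.pack(
                "<h",
                int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / sample_rate)),
            )
            for i in range(n_frames)
        )
        wav.writeframes(frames)


def describe_wav(path):
    """Return the basic properties of a WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Checking channel count and sample rate before uploading a file is a cheap way to catch recordings that do not match what the later steps expect.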

### Simple Speech Recognition
```bash
python "2.simple recognition/main.py"
```

### Sentiment Analysis
```bash
python "3.sentiment-analysis/main.py"
```
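As a rough picture of what a sentiment analysis request involves, the sketch below builds (but does not send) a transcription request with sentiment analysis switched on. It assumes AssemblyAI's v2 REST endpoint and its `sentiment_analysis` request field; the function name is hypothetical, and the script in this repository may use the official SDK or different options.

```python
import json
import urllib.request

# Assumed endpoint for AssemblyAI's v2 transcription API.
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_sentiment_request(audio_url, api_key):
    """Build a POST request asking for a transcript with sentiment labels.

    Hypothetical helper: it only constructs the request object, so it can
    be inspected or tested without network access.
    """
    payload = {"audio_url": audio_url, "sentiment_analysis": True}
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would submit the job. Transcription is asynchronous, so a real script would then poll the returned transcript ID until the status is complete.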

### Podcast Summarization
```bash
python "4.podcast summarization/main.py"
```

### Real-time Transcription
```bash
python "5.realtime-openai/main.py"
```

## Configuration

The project uses a centralized configuration system in `shared/config.py`. You can modify audio and API settings there.
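A centralized configuration module of this kind might look roughly like the sketch below. Every class name, field name, and default value is an assumption made for illustration and is not copied from `shared/config.py`; only the two environment variable names come from the installation steps.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical audio defaults; adjust to match your input device."""
    sample_rate: int = 16000       # samples per second
    channels: int = 1              # mono
    frames_per_buffer: int = 3200  # frames read per chunk


@dataclass(frozen=True)
class ApiSettings:
    """API keys read from the environment."""
    assemblyai_api_key: str = ""
    openai_api_key: str = ""


def load_api_settings(environ=None):
    """Build ApiSettings from a mapping (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    return ApiSettings(
        assemblyai_api_key=env.get("ASSEMBLYAI_API_KEY", ""),
        openai_api_key=env.get("OPENAI_API_KEY", ""),
    )
```

Frozen dataclasses keep settings read-only once loaded, so each script sees the same values and cannot change them by accident.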

## Error Handling

The project handles errors in the following areas:
- API communication issues
- File operations
- Audio stream management
- Resource cleanup
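
One common way to guarantee resource cleanup around an audio stream is a context manager, sketched below. `FakeStream` is a stand-in invented so the example runs without audio hardware, and `managed_stream` is a hypothetical helper; the repository's own handling may be structured differently.

```python
from contextlib import contextmanager


class FakeStream:
    """Stand-in for an audio input stream (no hardware required)."""

    def __init__(self):
        self.is_open = True

    def read(self, n_bytes):
        if not self.is_open:
            raise OSError("stream is closed")
        return b"\x00" * n_bytes  # silence

    def close(self):
        self.is_open = False


@contextmanager
def managed_stream(stream):
    """Yield the stream and always close it, even if the body raises."""
    try:
        yield stream
    finally:
        stream.close()
```

With this pattern, `with managed_stream(FakeStream()) as s:` releases the stream whether the loop ends normally, an API call fails, or the user interrupts with Ctrl+C.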

## Contributing

1. Fork the repository
2. Create your feature branch
3. Commit your changes
4. Push to the branch
5. Create a new Pull Request

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Acknowledgments

- AssemblyAI for speech recognition capabilities
- OpenAI for language model integration
- PyAudio for audio processing
