An open API service indexing awesome lists of open source software.

https://github.com/200ok-ch/ok-audio-transcription

Voice transcription for Emacs using OpenAI's Whisper API.
https://github.com/200ok-ch/ok-audio-transcription

Last synced: 4 months ago
JSON representation

Voice transcription for Emacs using OpenAI's Whisper API.

Awesome Lists containing this project

README

          

Voice Transcription for Emacs using OpenAI's Realtime API

** Requirements

- Emacs 27.1 or later
- ffmpeg
- PulseAudio (currently supports only Linux systems with PulseAudio)
- OpenAI API Key

** Installation

*** MELPA
Coming soon.

*** Manual
Clone this repository and add its path to `load-path`:

```elisp
(add-to-list 'load-path "/path/to/ok-audio-transcription")
(require 'ok-audio-transcription)
```

** Configuration

Set your OpenAI API token:

```elisp
(setq ok-audio-transcription--openai-token "your-api-token")
```

** Usage

- `M-x ok-audio-transcription--record-dwim` :: Start/stop recording (press once to start, again to stop)
- `M-x ok-audio-transcription--transcribe-file` :: Transcribe an existing audio file

The transcription result will be inserted at point.

Define a shortcut:

```elisp
(define-key global-map (kbd "C-x R") 'ok-audio-transcription--record-dwim)
```

** Limitations

- Currently supports only Linux systems with PulseAudio
- Maximum recording length defaults to 180 seconds (modifiable via `ok-audio-transcription--max-recording-seconds`)
- Requires an active internet connection
- Utilizes OpenAI's API (incurs costs)

** License

AGPL-3.0. See [[file:LICENSE][LICENSE]].

** Credits

Inspired by [[https://simonsafar.com/2024/whisper/][Simon Safar's blog post]].