https://github.com/olekli/drdictaphone

Dictation app for the terminal and Neovim, using Whisper for transcription and ChatGPT for post-processing.

Host: GitHub
URL: https://github.com/olekli/drdictaphone
Owner: olekli
License: apache-2.0
Created: 2023-11-17T17:08:27.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-09-08T13:08:59.000Z (about 1 year ago)
Last Synced: 2025-06-11T06:08:22.955Z (4 months ago)
Topics: chatgpt, dictate, dictation, neovim, neovim-plugin, openai, openai-chatgpt, openai-whisper, speech-to-text, terminal, terminal-based, transcription
Language: Python
Homepage:
Size: 2.17 MB
Stars: 6
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# DrDictaphone

Dictation app for the terminal and Neovim, using Whisper for transcription and ChatGPT for post-processing.

### Installation

You can use the installation script:
```
curl https://raw.githubusercontent.com/olekli/DrDictaphone/main/script/install.sh | sh
```

Alternatively, create a virtual environment and run:
```
pip install drdictaphone
python -m drdictaphone.cli install ~/DrDictaphone
```

Place your OpenAI API key in `~/DrDictaphone/config/openai_api_key`.
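To confirm the key file is in place before starting the app, a small check like the following can help. This is a minimal sketch, not part of DrDictaphone: `read_api_key` is a hypothetical helper, and it assumes the file holds just the key as plain text (possibly followed by a newline).

```python
from pathlib import Path

# Location taken from the installation instructions; adjust if you
# installed to a directory other than ~/DrDictaphone.
DEFAULT_KEY_PATH = Path("~/DrDictaphone/config/openai_api_key")


def read_api_key(path: Path = DEFAULT_KEY_PATH) -> str:
    """Return the key stored at `path`, stripped of surrounding whitespace.

    Hypothetical helper for sanity-checking the setup. Whether
    DrDictaphone itself strips whitespace is an assumption.
    """
    path = Path(path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No API key file at {path}")
    key = path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"API key file {path} is empty")
    return key
```

Calling `read_api_key()` with no arguments checks the default location and raises a descriptive error if the file is missing or empty.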

### Running

To start the standalone app, run `./drdictaphone`.

To start only the server, run `./drdictaphone server`.

To shut down a running server, run `./drdictaphone shutdown`.

### Neovim Plugin

If you are not already using Python plugins in Neovim,
you need to create a virtual environment for Neovim to use.
Tell Neovim about it by adding to your `init.vim`:
```
let g:python3_host_prog = '~/.neovim-venv/bin/python'
```
(Or wherever your venv is located.)

Inside this virtual environment, install the Neovim plugin:
```
pip install drdictaphone-neovim-plugin
```

Now you need to add the plugin to your Neovim config directory:
```
ln -s ~/.neovim-venv/lib/python3.11/site-packages/drdictaphone_neovim/DrDictaphone.py ~/.config/nvim/rplugin/python3/.
```
(Your paths may vary.)

Then start the server. In Neovim, run `:UpdateRemotePlugins` once and restart Neovim. After that, the `DrDictaphoneSetProfile` and `DrDictaphoneToggle` commands are available.

### Controlling the Standalone App

- `s`: select profile
- `p`: start / stop and transcribe recording
- `d`: stop and discard recording
- `q`: exit

### Profiles

Profiles consist of:

- `topic` for transcribing and post-processing, a list of strings
- `language` to use for the transcriber, a string
- `output` directory, a string
- `output_command` to pipe output to
- `enable_vad` whether or not to enable VAD, a bool, defaults to `false`

Output will be written to a timestamped file in the output directory.

VAD (voice activity detection) filters each recording down to the parts that contain voice before it is processed.
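The profile fields listed above can be pictured as a small record. The following is an illustrative sketch only: the `Profile` class and `timestamped_output_path` function are hypothetical names, the on-disk profile format is not shown here, and the type of `output_command`, the timestamp format, and the `.txt` suffix are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path
from typing import Optional


@dataclass
class Profile:
    """Hypothetical in-memory view of a profile, mirroring the documented fields."""
    topic: list[str]                      # used for transcribing and post-processing
    language: str                         # language passed to the transcriber
    output: str                           # output directory
    output_command: Optional[str] = None  # assumed to be a command string
    enable_vad: bool = False              # documented default


def timestamped_output_path(profile: Profile, now: datetime) -> Path:
    """Build a timestamped file path inside the profile's output directory.

    The timestamp format and file suffix are assumptions for illustration.
    """
    name = now.strftime("%Y-%m-%d_%H-%M-%S") + ".txt"
    return Path(profile.output).expanduser() / name
```

For example, `timestamped_output_path(Profile(["medicine"], "en", "/tmp/notes"), datetime.now())` yields a path under `/tmp/notes` named after the current time.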

### Post-Processor

The Post-Processor specs consist of:

- `instructions` for the post-processor, either a filename to load from or a list of strings
- `gpt_model` to use for post-processing, either a filename to load from or an object
- `options` to use for post-processing, either a filename to load from or an object
- `tools` to use for the post-processor, either a filename to load from or an object
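Each entry above can be given either inline or as a filename to load from. One way to picture that rule is the sketch below. It is not DrDictaphone's actual loader: `resolve_spec_value` is a hypothetical name, and JSON is assumed as the file format purely to keep the example dependency-free.

```python
import json
from pathlib import Path
from typing import Any


def resolve_spec_value(value: Any, base_dir: Path) -> Any:
    """Return an inline value unchanged, or load it from a file.

    A string is treated as a filename relative to `base_dir`; a list or
    dict (object) is treated as the inline value itself. The JSON file
    format is an assumption made for this sketch.
    """
    if isinstance(value, str):
        with open(Path(base_dir) / value, encoding="utf-8") as f:
            return json.load(f)
    return value
```

With this shape, `instructions` can be written either as a list of strings directly in the specs or as the name of a file that contains that list.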

The context for the post-processor is built from the profile and the post-processor specs. Settings in the profile take precedence over settings in the specs.
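The precedence rule can be illustrated with a plain dictionary merge. This is a sketch under assumptions, not the project's implementation: `build_context` is a hypothetical name, both inputs are modeled as flat dictionaries, and treating `None` as "not set in the profile" is an added assumption.

```python
def build_context(profile: dict, specs: dict) -> dict:
    """Merge post-processor specs with a profile; profile values win.

    Keys the profile leaves unset (modeled here as None, an assumption)
    fall back to the value from the specs.
    """
    context = dict(specs)  # start from the specs as defaults
    for key, value in profile.items():
        if value is not None:
            context[key] = value  # the profile overrides the specs
    return context
```

For instance, if both the profile and the specs define `language`, the merged context carries the profile's value, while keys that appear only in the specs pass through unchanged.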
