https://github.com/cgbur/whisp

A lightweight and minimal desktop speech-to-text tool.

accessibility speech-to-text whisper

Last synced: about 2 months ago

Host: GitHub
URL: https://github.com/cgbur/whisp
Owner: cgbur
License: mit
Created: 2024-10-13T16:39:17.000Z (4 months ago)
Default Branch: main
Last Pushed: 2024-10-20T08:03:13.000Z (4 months ago)
Last Synced: 2024-11-30T22:46:10.588Z (2 months ago)
Topics: accessibility, speech-to-text, whisper
Language: Rust
Homepage:
Size: 124 KB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Whisp

A lightweight desktop speech-to-text tool powered by modern models like
[OpenAI's Whisper](https://github.com/openai/whisper). Whisp provides a simple
interface for converting speech to text with minimal resource overhead.

[![Crates.io][crates-badge]][crates-url]
[![MIT licensed][mit-badge]][mit-url]

[crates-badge]: https://img.shields.io/crates/v/whisp.svg
[crates-url]: https://crates.io/crates/whisp
[mit-badge]: https://img.shields.io/badge/license-MIT-blue.svg
[mit-url]: https://github.com/cgbur/whisp/blob/main/LICENSE

## Overview

Whisp offers an unobtrusive and customizable way to transcribe your voice into
text. It operates as a globally available desktop application. Activate it via a
hotkey, and it can automatically paste the transcribed text into any focused
input field.

Design principles:

- **Reliable**: Built to be stable and to handle errors gracefully, with retry
and recovery.

- **Lightweight**: Resource-efficient, minimal system impact, simple.
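
As an illustration of the "retry and recovery" principle, here is a minimal sketch of retrying a fallible operation with exponential backoff. It is not taken from the Whisp source: the function name `retry_with_backoff` and its parameters are hypothetical, and Whisp may handle retries differently.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper, not part of Whisp's API.
/// Calls `op` until it succeeds or `max_attempts` calls have been made,
/// doubling the delay after each failure. `op` always runs at least once.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```

A caller would wrap something like a network request in the closure and treat a returned `Err` as final, for example by showing a notification instead of crashing.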

## Installation

Currently, the only way to install Whisp is via Cargo:

```sh
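# Build and install the binary from crates.io
cargo install whisp
# Start the application
whisp
# If Cargo's bin directory is not on your PATH, run the binary directly
~/.cargo/bin/whisp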
```

## Configuration

Configuration is managed through a `whisp.toml` file located in your system's
configuration directory. The Whisp drop-down has an option to copy the
configuration file path to the clipboard.

```toml
hotkey = "shift+super+Semicolon"
openai_key = "your-api-key"
language = "en"
model = "whisper-1"
restore_clipboard = true
auto_paste = false
```
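
To show how a `hotkey` value such as `"shift+super+Semicolon"` breaks down, here is a minimal sketch that splits it into modifiers and a trigger key. The `Hotkey` type and `parse_hotkey` function are hypothetical and not Whisp's actual parser. The sketch assumes that tokens are separated by `+`, that the last token is the key, and that modifier names are case-insensitive.

```rust
/// Hypothetical representation: modifier names plus one trigger key.
#[derive(Debug, PartialEq)]
struct Hotkey {
    modifiers: Vec<String>,
    key: String,
}

/// Splits a string such as "shift+super+Semicolon" on '+'.
/// Assumption: the last token is the key; every token before it is a modifier.
fn parse_hotkey(spec: &str) -> Result<Hotkey, String> {
    let mut tokens: Vec<String> = spec
        .split('+')
        .map(|t| t.trim().to_string())
        .collect();
    // Reject inputs like "" or "shift++a" that contain an empty token.
    if tokens.iter().any(|t| t.is_empty()) {
        return Err(format!("empty token in hotkey {spec:?}"));
    }
    // `split` always yields at least one item, so `pop` cannot fail here.
    let key = tokens.pop().expect("split yields at least one token");
    // Normalize modifiers to lowercase; the key keeps its original casing.
    let modifiers = tokens.into_iter().map(|m| m.to_lowercase()).collect();
    Ok(Hotkey { modifiers, key })
}
```

Under these assumptions, the example value above parses to the modifiers `shift` and `super` with the key `Semicolon`.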

## Usage

To start using Whisp, define your preferred hotkey, configure the model, and run
the application. You can then trigger voice recording via the hotkey and receive
transcriptions automatically.
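
The flow described above can be pictured as a small state machine. This is only a sketch: it assumes the hotkey toggles recording (press once to start, press again to stop and transcribe), which the README does not state, and the `State`, `Event`, and `next_state` names are hypothetical.

```rust
/// States of a hypothetical hotkey-driven dictation loop.
#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Idle,
    Recording,
    Transcribing,
}

/// Events that move the loop between states.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event {
    HotkeyPressed,
    TranscriptReady,
    Failed,
}

/// Returns the next state for a given state and event.
fn next_state(state: State, event: Event) -> State {
    match (state, event) {
        // Assumption: the first press starts recording.
        (State::Idle, Event::HotkeyPressed) => State::Recording,
        // Assumption: the second press stops recording and starts transcription.
        (State::Recording, Event::HotkeyPressed) => State::Transcribing,
        // The text is delivered (pasted or copied) and the loop resets.
        (State::Transcribing, Event::TranscriptReady) => State::Idle,
        // Any failure returns to Idle so the next press starts cleanly.
        (_, Event::Failed) => State::Idle,
        // All other combinations are ignored.
        (current, _) => current,
    }
}
```

Returning to `Idle` on any failure keeps the tool usable after an error, in line with the design principles listed earlier.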

### Common Use Cases

- **Messaging**: Quickly respond to messages in chat applications like Discord
or Slack.

- **Document Writing**: Speak freely to draft large amounts of text quickly.
Then apply post-processing yourself or with the help of a language model to
refine the text.

- **Code Commenting**: Dictate comments directly into your editor. Note that
this tool does not write code well, though this may change once automatic
post-processing is added. Reach out if you are interested in contributing.

## License

Whisp is licensed under the [MIT
license](https://github.com/cgbur/whisp/blob/main/LICENSE).