https://github.com/dictate-button/dictate-button

Customizable Web Component that adds speech-to-text dictation capabilities to site text fields

button custom-element dictate dictate-button dictation speech-recognition speech-to-text transcribe transcribing transcription voice-recognition voice-to-text web-component whisper


Host: GitHub
URL: https://github.com/dictate-button/dictate-button
Owner: dictate-button
License: apache-2.0
Created: 2025-07-21T16:06:41.000Z (11 months ago)
Default Branch: main
Last Pushed: 2026-01-16T19:30:21.000Z (6 months ago)
Last Synced: 2026-01-17T07:22:27.486Z (5 months ago)
Topics: button, custom-element, dictate, dictate-button, dictation, speech-recognition, speech-to-text, transcribe, transcribing, transcription, voice-recognition, voice-to-text, web-component, whisper
Language: TypeScript
Homepage: https://dictate-button.io
Size: 264 KB
Stars: 16
Watchers: 0
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Dictate Button

[![NPM Version](https://img.shields.io/npm/v/dictate-button)](https://www.npmjs.com/package/dictate-button)

[![Tests](https://github.com/dictate-button/dictate-button/actions/workflows/test.yml/badge.svg)](https://github.com/dictate-button/dictate-button/actions/workflows/test.yml)

A customizable web component that adds speech-to-text dictation capabilities to any text input, textarea field, or contenteditable element on your website.

Developed for [dictate-button.io](https://dictate-button.io).

## Features

- Easy integration with any website

- Compatible with any framework (or no framework)

- Automatic injection into text fields with the `data-dictate-button-on` attribute (exclusive mode) or without the `data-dictate-button-off` attribute (inclusive mode)

- Simple speech-to-text functionality with clean UI

- Customizable size and API endpoint

- Dark and light theme support

- Event-based API for interaction with your application

- Built with SolidJS for optimal performance

- Accessibility is ensured with ARIA attributes, high-contrast mode support, and clear keyboard focus states

## Supported tags (by our inject scripts)

- textarea

- input[type="text"]

- input[type="search"]

- input (without a type; defaults to text)

- [contenteditable] elements
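
The supported tags above can be written as CSS selectors. The sketch below is only an illustration: `SUPPORTED_FIELDS` and `buildFieldSelector` are hypothetical names, and the real inject scripts may build their selectors differently.

```js
// Base selectors for the supported tags listed above.
const SUPPORTED_FIELDS = [
  'textarea',
  'input[type="text"]',
  'input[type="search"]',
  'input:not([type])', // input without a type attribute
  '[contenteditable]',
];

// Hypothetical helper: builds one selector list per auto-inject mode.
// 'exclusive' requires data-dictate-button-on; any other mode is treated
// as inclusive and only excludes fields with data-dictate-button-off.
function buildFieldSelector(mode) {
  const suffix =
    mode === 'exclusive'
      ? '[data-dictate-button-on]'
      : ':not([data-dictate-button-off])';
  return SUPPORTED_FIELDS.map((base) => base + suffix).join(', ');
}
```

A selector built this way could be passed to `document.querySelectorAll` to see which fields on a page would qualify in each mode.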

## Usage

### Auto-inject modes

Choose the auto-inject mode that best suits your needs:

| Mode | Description | Script |
|---|---|---|
| Exclusive | Enables dictation only for text fields that have the `data-dictate-button-on` attribute. | `inject-exclusive.js` |
| Inclusive | Enables dictation for all text fields that do not have the `data-dictate-button-off` attribute. | `inject-inclusive.js` |

Both auto-inject modes:

- Automatically run on DOMContentLoaded (or immediately if the DOM is already loaded).

- Watch for DOM changes to apply the dictate button to newly added elements.

- Set the button’s language from `document.documentElement.lang` (if present). Long codes like `en-GB` are normalized to `en`.

- Position the button in the top right-hand corner of the text field, respecting the field's padding, with a 4px fallback if the padding is not set (0).
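
The language normalization described above (`en-GB` becomes `en`) can be sketched as follows. `normalizeLanguage` is a hypothetical name used for illustration; the inject scripts may handle edge cases differently.

```js
// Hypothetical helper: reduce a language tag such as "en-GB" to its
// primary subtag. Returns undefined for missing or empty values.
function normalizeLanguage(tag) {
  if (typeof tag !== 'string') return undefined;
  const primary = tag.trim().split(/[-_]/)[0].toLowerCase();
  return primary || undefined;
}

normalizeLanguage('en-GB'); // "en"
normalizeLanguage('fr');    // "fr"
normalizeLanguage('');      // undefined
```

In a browser, the input would typically be `document.documentElement.lang`.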

### From CDN

#### Option 1: Using the exclusive auto-inject script

In your HTML page, add the following script tag:

```html

```

Add the `data-dictate-button-on` attribute to any `textarea`, `input[type="text"]`, `input[type="search"]`, `input` without a `type` attribute, or element with the `contenteditable` attribute:

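```html
<textarea data-dictate-button-on></textarea>
<input type="text" data-dictate-button-on />
<div contenteditable data-dictate-button-on></div>
```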

#### Option 2: Using the inclusive auto-inject script

In your HTML page, add the following script tag:

```html

```

All `textarea`, `input[type="text"]`, `input[type="search"]`, `input` elements without a `type` attribute, and elements with the `contenteditable` attribute are enhanced automatically unless they have the `data-dictate-button-off` attribute.

To disable that for a specific field, add the `data-dictate-button-off` attribute to it this way:

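```html
<textarea data-dictate-button-off></textarea>
<input type="text" data-dictate-button-off />
<div contenteditable data-dictate-button-off></div>
```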

#### Option 3: Manual integration

Import the component and use it directly in your code:

```html

```

### From NPM

Import once for your app:

```js
// For selected text fields (with data-dictate-button-on attribute):
import 'dictate-button/inject-exclusive'

// or for all text fields (except those with data-dictate-button-off attribute):
import 'dictate-button/inject-inclusive'
```

To choose between **exclusive** and **inclusive** auto-inject modes, see the [Auto-inject modes](#auto-inject-modes) section.

### Advanced usage with library functions

If you need more control over when and how the dictate buttons are injected, you can use the library functions directly:

Tip: You can also import from subpaths (e.g., `dictate-button/libs/injectDictateButton`) for smaller bundles, if your bundler resolves package subpath exports.

```js
import 'dictate-button' // Required when using library functions directly
import { injectDictateButton, injectDictateButtonOnLoad } from 'dictate-button/libs'

// Inject dictate buttons immediately to matching elements
injectDictateButton(
  'textarea.custom-selector', // CSS selector for target elements
  {
    buttonSize: 30,           // Button size in pixels (optional; default: 30)
    verbose: false,           // Log events to console (optional; default: false)
    apiEndpoint: 'wss://api.example.com/transcribe' // Optional custom API endpoint
  }
)

// Inject on DOM load with mutation observer to catch dynamically added elements
injectDictateButtonOnLoad(
  'input.custom-selector',    // CSS selector for target elements
  {
    buttonSize: 30,           // Button size in pixels (optional; default: 30)
    verbose: false,           // Log events to console (optional; default: false)
    apiEndpoint: 'wss://api.example.com/transcribe', // Optional custom API endpoint
    watchDomChanges: true     // Watch for DOM changes (optional; default: false)
  }
)
```

Note: the injector mirrors the target field’s display and margins onto the wrapper, sets the wrapper width to 100% for block-level fields, and adds padding so that the button does not overlap the text.

The wrapper also has the `dictate-button-wrapper` class for easy styling.

## Events

The dictate-button component emits the following events:

- `dictate-start`: Fired when transcription starts (after microphone access is granted and WebSocket connection is established).

- `dictate-text`: Fired during transcription when text is available. This includes both interim (partial) transcripts that may change and final transcripts. The event detail contains the current transcribed text.

- `dictate-end`: Fired when transcription ends. The event detail contains the final transcribed text.

- `dictate-error`: Fired when an error occurs (microphone access denied, WebSocket connection failure, server error, etc.). The event detail contains the error message.

The typical flow is:

> dictate-start -> dictate-text (multiple times) -> dictate-end

In case of an error, the `dictate-error` event is fired.

Example event handling:

```javascript
const dictateButton = document.querySelector('dictate-button');

dictateButton.addEventListener('dictate-start', () => {
  console.log('Transcription started');
});

dictateButton.addEventListener('dictate-text', (event) => {
  const currentText = event.detail;
  console.log('Current text:', currentText);
  // Update UI with interim/partial transcription
});

dictateButton.addEventListener('dictate-end', (event) => {
  const finalText = event.detail;
  console.log('Final transcribed text:', finalText);
  // Add the final text to your input field
  document.querySelector('#my-input').value += finalText;
});

dictateButton.addEventListener('dictate-error', (event) => {
  const error = event.detail;
  console.error('Transcription error:', error);
});
```
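
Because `dictate-text` carries interim text that may still change, a manual integration may want to show it as a preview and only make the text permanent on `dictate-end`. The following is a minimal sketch of that idea; `createTranscriptPreview` is a hypothetical helper, not part of the library, and joining with a single space is an assumption.

```js
// Hypothetical helper: remembers the text that was in the field before
// dictation and derives what to display for interim and final transcripts.
function createTranscriptPreview(initialValue) {
  let committed = initialValue;
  // Join with a single space unless one side is empty.
  const join = (base, text) => (base && text ? base + ' ' + text : base + text);
  return {
    // Call with event.detail from dictate-text: preview only, nothing is stored.
    preview: (interimText) => join(committed, interimText),
    // Call with event.detail from dictate-end: the final text becomes permanent.
    commit: (finalText) => {
      committed = join(committed, finalText);
      return committed;
    },
  };
}
```

In the handlers above, the field's `value` would be set to the result of `preview(event.detail)` on `dictate-text` and to the result of `commit(event.detail)` on `dictate-end`, instead of appending with `+=`.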

## Attributes

| Attribute   | Type   | Default                                   | Description |
|-------------|--------|-------------------------------------------|-------------|
| size        | number | 30                                        | Size of the button in pixels |
| apiEndpoint | string | wss://api.dictate-button.io/v2/transcribe | WebSocket API endpoint of the transcription service |
| language    | string | en                                        | Optional [language](https://github.com/dictate-button/dictate-button/wiki/Supported-Languages-and-Dialects) code (e.g., 'fr', 'de') |
| theme       | string | (inherits from page)                      | 'light' or 'dark' |
| class       | string |                                           | Custom CSS class |

## Styling

You can customize the appearance of the dictate button using CSS parts:

```css
/* Style the button container */
dictate-button::part(container) {
  /* Custom styles */
}

/* Style the button itself */
dictate-button::part(button) {
  /* Custom styles */
}

/* Style the button icons */
dictate-button::part(icon) {
  /* Custom styles */
}
```

## API Endpoint

By default, dictate-button uses the `wss://api.dictate-button.io/v2/transcribe` endpoint for real-time speech-to-text streaming.

You can specify your own endpoint by setting the `apiEndpoint` attribute.

The API uses WebSocket for real-time transcription:

- **Protocol**: WebSocket (wss://)

- **Connection**: Opens WebSocket connection with optional language query parameter (e.g., `?language=en`)

- **Audio Format**: PCM16 audio data at 16kHz sample rate, sent as binary chunks

- **Messages Sent**:

  - Binary audio data (Int16Array buffers) - Continuous stream of PCM16 audio chunks

  - `{ type: 'close' }` - JSON message to signal end of audio stream and trigger finalization

- **Messages Received**: JSON messages with the following types:

  - `{ type: 'session_opened', sessionId: string, expiresAt: number }` - Session started

  - `{ type: 'interim_transcript', text: string }` - Interim (partial) transcription result that may change as more audio is processed

  - `{ type: 'transcript', text: string, turn_order?: number }` - Final transcription result for the current turn

  - `{ type: 'session_closed', code: number, reason: string }` - Session ended

  - `{ type: 'error', error: string }` - Error occurred
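
For anyone implementing a compatible client or server, the protocol details above can be sketched as three small helpers. This is a sketch under stated assumptions, not the component's actual code: the function names and the shape of the returned objects are hypothetical, and only the message types and fields listed above are taken from this README.

```js
// Append the optional language query parameter to the endpoint URL.
function buildEndpointUrl(apiEndpoint, language) {
  const url = new URL(apiEndpoint);
  if (language) url.searchParams.set('language', language);
  return url.toString();
}

// Convert Web Audio float samples (range -1..1) to 16-bit signed PCM.
// Resampling to 16kHz is assumed to have happened before this step.
function floatToPcm16(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range input
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Turn a received JSON message into a small description object.
// The `kind` values are hypothetical labels chosen for this sketch.
function parseServerMessage(raw) {
  const msg = JSON.parse(raw);
  switch (msg.type) {
    case 'session_opened':
      return { kind: 'opened', sessionId: msg.sessionId, expiresAt: msg.expiresAt };
    case 'interim_transcript':
      return { kind: 'interim', text: msg.text };
    case 'transcript':
      return { kind: 'final', text: msg.text, turnOrder: msg.turn_order };
    case 'session_closed':
      return { kind: 'closed', code: msg.code, reason: msg.reason };
    case 'error':
      return { kind: 'error', message: msg.error };
    default:
      return { kind: 'unknown', type: msg.type };
  }
}

// The end-of-stream signal is a plain JSON text message.
const closeMessage = JSON.stringify({ type: 'close' });
```

With a browser `WebSocket`, the PCM chunks would be sent as binary (for example `socket.send(floatToPcm16(chunk).buffer)`), followed by `socket.send(closeMessage)` when recording stops.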

## Browser Compatibility

The dictate-button component requires the following browser features:

- Web Components

- MediaStream API (getUserMedia)

- Web Audio API (AudioContext, AudioWorklet)

- WebSocket API

Works in all modern browsers (Chrome, Firefox, Safari, Edge).
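
If you want to hide or disable dictation where a required feature is missing, a simple feature check along these lines may help. `missingDictateFeatures` is a hypothetical helper written for this example; the component itself may detect support differently.

```js
// Hypothetical helper: returns the names of required browser features
// that are not available in the given global object.
function missingDictateFeatures(env = globalThis) {
  const missing = [];
  if (!env.customElements) missing.push('Web Components');
  if (!env.navigator?.mediaDevices?.getUserMedia) {
    missing.push('MediaStream API (getUserMedia)');
  }
  if (!env.AudioContext || !env.AudioWorkletNode) {
    missing.push('Web Audio API (AudioContext, AudioWorklet)');
  }
  if (!env.WebSocket) missing.push('WebSocket API');
  return missing;
}
```

An empty array means all four features were found. Note that `getUserMedia` is only exposed in secure contexts (HTTPS or localhost), so the check can fail on plain HTTP pages even in a modern browser.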
