# react-dictate-button

[![npm version](https://badge.fury.io/js/react-dictate-button.svg)](https://badge.fury.io/js/react-dictate-button) [![Build Status](https://travis-ci.org/compulim/react-dictate-button.svg?branch=master)](https://travis-ci.org/compulim/react-dictate-button)

A button to start speech recognition using [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API), with an easy to understand event lifecycle.

# Breaking changes

## [2.0.0] - 2021-05-15

- Requires [`react@>=16.8.0`](https://npmjs.com/package/react) and [`core-js@3`](https://npmjs.com/package/core-js)
- Modifying props while recognition is ongoing will no longer abort recognition immediately; updated props take effect on the next recognition
- `SpeechGrammarList` is only constructed when the `grammar` prop is present
- If the `speechRecognition` prop is not present, capability detection is now done through `navigator.mediaDevices.getUserMedia`

# Demo

Try out this component at [compulim.github.io/react-dictate-button](https://compulim.github.io/react-dictate-button/).

# Background

Reasons why we need to build our own component, instead of using [existing packages](https://www.npmjs.com/search?q=react%20speech) on NPM:

- Most browsers require speech recognition (or WebRTC) to be triggered by a user event, such as a button click
- Bring your own engine for [Web Speech API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API/Using_the_Web_Speech_API)
- Enable speech recognition on unsupported browsers by bridging it with [cloud-based service](https://npmjs.com/package/web-speech-cognitive-services)
- Support grammar list thru [JSpeech Grammar Format](https://www.w3.org/TR/jsgf/)
- Ability to interrupt recognition
- Ability to [morph into other elements](#customization-thru-morphing)

# How to use

First, install the production version with `npm install react-dictate-button`, or the development version with `npm install react-dictate-button@master`.

```jsx
import { DictateButton } from 'react-dictate-button';

export default () => (
  <DictateButton
    className="my-dictate-button"
    lang="en-US"
    onDictate={({ result }) => console.log(result)}
  >
    Start/stop
  </DictateButton>
);
```
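
For a fuller example, here is a minimal sketch that displays the recognized transcript from the `onDictate` event. The `Dictation` component name and its state handling are illustrative, not part of the package:

```jsx
import { useCallback, useState } from 'react';
import { DictateButton } from 'react-dictate-button';

// Illustrative component: shows the recognized transcript next to the button.
export default function Dictation() {
  const [transcript, setTranscript] = useState('');

  // "result" may be undefined when nothing was recognized.
  const handleDictate = useCallback(
    ({ result }) => setTranscript(result ? result.transcript : ''),
    []
  );

  return (
    <div>
      <DictateButton lang="en-US" onDictate={handleDictate}>
        Start/stop
      </DictateButton>
      <output>{transcript}</output>
    </div>
  );
}
```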

## Props

| Name | Type | Default | Description |
| ------------------- | ------------------------ | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `className` | `string` | `undefined` | Class name to apply to the button |
| `continuous` | `boolean` | `false` | `true` to set the Web Speech API to continuous mode, which keeps recognizing until stopped, otherwise, `false` |
| `disabled` | `boolean` | `false` | `true` to abort ongoing recognition and disable the button, otherwise, `false` |
| `extra` | `{ [key: string]: any }` | `{}` | Additional properties to set to [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) before `start`, useful when bringing your own [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |
| `grammar` | `string` | `undefined` | Grammar list in [JSGF format](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList/addFromString) |
| `lang` | `string` | `undefined` | Language to recognize, for example, `'en-US'` or [`navigator.language`](https://developer.mozilla.org/en-US/docs/Web/API/NavigatorLanguage/language) |
| `speechGrammarList` | `any` | `window.SpeechGrammarList` (or vendor-prefixed) | Bring your own [`SpeechGrammarList`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechGrammarList) |
| `speechRecognition` | `any` | `window.SpeechRecognition` (or vendor-prefixed) | Bring your own [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |

> Note: changes to `extra`, `grammar`, `lang`, `speechGrammarList`, and `speechRecognition` will not take effect until the next speech recognition is started.
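
For example, the `extra` prop can set standard `SpeechRecognition` properties before `start()` is called. A minimal sketch, using the standard `maxAlternatives` property:

```jsx
import { DictateButton } from 'react-dictate-button';

// "extra" assigns these properties onto the SpeechRecognition instance
// before start() is called; maxAlternatives is a standard property.
export default () => (
  <DictateButton extra={{ maxAlternatives: 5 }} lang="en-US">
    Start/stop
  </DictateButton>
);
```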

## Events

| Name         | Signature                                                                                                    | Description                                                                                                                 |
| ------------ | ------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------- |
| `onClick`    | `(event: MouseEvent) => void`                                                                                | Emit when the user clicks on the button; calling `preventDefault` will stop recognition from starting                       |
| `onDictate`  | `({ result: { confidence: number, transcript: string }, type: 'dictate' }) => void`                         | Emit when recognition is completed                                                                                          |
| `onError`    | `(event: SpeechRecognitionErrorEvent) => void`                                                               | Emit when an error has occurred or recognition is interrupted, see below                                                    |
| `onProgress` | `({ abortable: boolean, results: [{ confidence: number, transcript: string }], type: 'progress' }) => void` | Emit for interim results; the array contains every segment of recognized text                                               |
| `onRawEvent` | `(event: SpeechRecognitionEvent) => void`                                                                    | Emit for handling raw events from [`SpeechRecognition`](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition) |
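Putting the events together, here is a minimal sketch that renders interim results from `onProgress` and finalizes on `onDictate`. The `InterimDictation` component name is illustrative:

```jsx
import { useState } from 'react';
import { DictateButton } from 'react-dictate-button';

// Illustrative component: streams interim transcripts while recognizing.
export default function InterimDictation() {
  const [segments, setSegments] = useState([]);

  return (
    <div>
      <DictateButton
        lang="en-US"
        onDictate={({ result }) => setSegments(result ? [result.transcript] : [])}
        onError={event => console.warn(event.error)}
        // "results" is absent on the very first progress event, hence the default.
        onProgress={({ results = [] }) => setSegments(results.map(({ transcript }) => transcript))}
      >
        Start/stop
      </DictateButton>
      <p>{segments.join(' ')}</p>
    </div>
  );
}
```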

## Hooks

> Although previous versions exported a React Context, it is recommended to use the hooks interface.

| Name | Signature | Description |
| --------------- | ----------- | --------------------------------------------------------------------------------------------------- |
| `useAbortable`  | `[boolean]` | `true` if the ongoing speech recognition can be aborted via its `abort()` function, otherwise, `false`  |
| `useReadyState` | `[number]` | Returns the current state of recognition, refer to [this section](#function-as-a-child) |
| `useSupported` | `[boolean]` | If speech recognition is supported, `true`, otherwise, `false` |
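
A short sketch of the hooks interface. This assumes the hooks are called from a component rendered as a descendant of `<DictateButton>`, so they can read the component's context:

```jsx
import { DictateButton, useReadyState, useSupported } from 'react-dictate-button';

// Assumption: these hooks read context provided by <DictateButton>,
// so this component must be rendered as its descendant.
function Status() {
  const [readyState] = useReadyState();
  const [supported] = useSupported();

  return supported ? <>Ready state: {readyState}</> : <>Not supported</>;
}

export default () => (
  <DictateButton lang="en-US">
    <Status />
  </DictateButton>
);
```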

### Checking if speech recognition is supported

To determine whether speech recognition is supported in the browser:

- If the `speechRecognition` prop is `undefined`:
  - If both [`window.navigator.mediaDevices`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices) and [`window.navigator.mediaDevices.getUserMedia`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia) are falsy, it is not supported
    - The browser is probably not on a secure HTTP connection
  - If both `window.SpeechRecognition` and its vendor-prefixed counterparts are falsy, it is not supported
  - If recognition failed once with the `not-allowed` error code, it is not supported
- Otherwise, it is supported

> Even if the browser is on an insecure HTTP connection, `window.SpeechRecognition` (or its vendor-prefixed counterpart) will continue to be truthy. Instead, `navigator.mediaDevices.getUserMedia` is used for capability detection.
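
The detection logic roughly translates to the following sketch. The `isSupported` helper is illustrative, not an export of this package, and the runtime `not-allowed` failure is tracked separately:

```jsx
// Illustrative helper, not exported by react-dictate-button.
function isSupported() {
  const { mediaDevices } = window.navigator;

  // Insecure pages expose SpeechRecognition but not getUserMedia,
  // so getUserMedia is the more reliable capability signal.
  if (!mediaDevices || !mediaDevices.getUserMedia) {
    return false;
  }

  return !!(window.SpeechRecognition || window.webkitSpeechRecognition);
}
```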

### Event lifecycle

One design goal is to make sure events are easy to understand and deterministic. The first rule of thumb is that `onProgress` always leads to either `onDictate` or `onError`. Here are some samples of event firing sequences (tested on Chrome 67):

- Happy path: speech is recognized
  1. `onStart`
  1. `onProgress({})` (just started, therefore, no `results`)
  1. `onProgress({ results: [] })`
  1. `onDictate({ result: ... })`
  1. `onEnd`
- Happy path: speech is recognized in continuous mode
  1. `onStart`
  1. `onProgress({})` (just started, therefore, no `results`)
  1. `onProgress({ results: [] })`
  1. `onDictate({ result: ... })`
  1. `onProgress({ results: [] })`
  1. `onDictate({ result: ... })`
  1. `onEnd`
- Heard some sound, but nothing can be recognized
  1. `onStart`
  1. `onProgress({})`
  1. `onDictate({})` (nothing is recognized, therefore, no `result`)
  1. `onEnd`
- Nothing is heard (audio device available but muted)
  1. `onStart`
  1. `onProgress({})`
  1. `onError({ error: 'no-speech' })`
  1. `onEnd`
- Recognition aborted
  1. `onStart`
  1. `onProgress({})`
  1. `onProgress({ results: [] })`
  1. While speech is being recognized, set `props.disabled` to `true` to abort recognition
  1. `onError({ error: 'aborted' })`
  1. `onEnd`
- Not authorized to use speech or no audio device is available
  1. `onStart`
  1. `onError({ error: 'not-allowed' })`
  1. `onEnd`
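
Because `onProgress` is guaranteed to end in either `onDictate` or `onError`, a busy indicator can be driven safely by these three events. A minimal sketch; the `BusyDictation` component is illustrative:

```jsx
import { useState } from 'react';
import { DictateButton } from 'react-dictate-button';

// Because onProgress always leads to onDictate or onError,
// a "busy" flag set on progress will always be cleared.
export default function BusyDictation() {
  const [busy, setBusy] = useState(false);

  return (
    <div>
      <DictateButton
        lang="en-US"
        onDictate={() => setBusy(false)}
        onError={() => setBusy(false)}
        onProgress={() => setBusy(true)}
      >
        Start/stop
      </DictateButton>
      {busy && <span>Listening...</span>}
    </div>
  );
}
```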

## Function as a child

Instead of passing child elements, you can pass a function to render different content based on ready state. This is called [function as a child](https://reactjs.org/docs/render-props.html#using-props-other-than-render).

| Ready state | Description |
| ----------- | -------------------------------------------------------------------------- |
| `0` | Not started |
| `1`         | Starting recognition engine, recognition is not ready until it turns to `2` |
| `2` | Recognizing |
| `3` | Stopping |

For example,

```jsx
<DictateButton>
  {({ readyState }) =>
    readyState === 0 ? 'Start' : readyState === 1 ? 'Starting...' : readyState === 2 ? 'Listening...' : 'Stopping...'
  }
</DictateButton>
```

# Customization thru morphing

You can build your own component by copying our layout code, without messing with the [logic code behind the scenes](packages/component/src/Composer.js). For details, please refer to [`DictateButton.js`](packages/component/src/DictateButton.js), [`DictateCheckbox.js`](packages/component/src/DictateCheckbox.js), and [`DictationTextBox.js`](packages/pages/src/DictationTextBox.js).

## Checkbox version

In addition to `<DictateButton>`, we also ship `<DictateCheckbox>` out of the box. The checkbox version is better suited for toggle-button scenarios and web accessibility. You can use the following code for the checkbox version.

```jsx
import { DictateCheckbox } from 'react-dictate-button';

export default () => (
  <DictateCheckbox
    className="my-dictate-checkbox"
    lang="en-US"
    onDictate={({ result }) => console.log(result)}
  >
    Start/stop
  </DictateCheckbox>
);
```

## Text box with dictate button

We also provide a "text box with dictate button" version. But instead of shipping a full-fledged control, we make it a minimally-styled control so you can start copying the code and customize it in your own project. The sample code can be found at [DictationTextBox.js](packages/pages/src/DictationTextBox.js).

# Design considerations

- Hide the complexity of Web Speech events because we only want to focus on the recognition experience
  - Complexity in lifecycle events: `onstart`, `onaudiostart`, `onsoundstart`, `onspeechstart`
  - `onresult` may not fire in some cases, and `onnomatch` is not fired in Chrome
- To reduce complexity, we want to make sure event firing is either:
  - Happy path: `onProgress`, then either `onDictate` or `onError`
  - Otherwise: `onError`
- "Web Speech" could mean speech synthesis, which is out of scope for this package
- "Speech Recognition" could imply exposing the Web Speech API as-is; instead, we want to hide the details and make it straightforward for the recognition scenario

# Roadmap

Please feel free to [file](https://github.com/compulim/react-dictate-button/issues) suggestions.

- While `readyState` is `1` or `3` (transitioning), the underlying speech engine cannot be started or stopped until the state transition is complete
  - Needs rework of the state management
  - Instead of putting all logic inside [`Composer.js`](packages/component/src/Composer.js), how about the following (see the sketch after this list):
    1. Write an adapter to convert `SpeechRecognition` into another object with a simpler event model and `readyState`
    2. Rewrite `Composer.js` to bridge the new `SimpleSpeechRecognition` model and React Context
    3. Expose `SimpleSpeechRecognition` so people not on React can still benefit from the simpler event model
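
A rough sketch of what such an adapter could look like. `SimpleSpeechRecognition` does not exist yet, and every name below is hypothetical:

```jsx
// Hypothetical sketch only: SimpleSpeechRecognition is not shipped by this package.
class SimpleSpeechRecognition {
  constructor(recognition) {
    // 0 = idle, 1 = starting, 2 = recognizing, 3 = stopping
    this.readyState = 0;
    this._recognition = recognition;

    recognition.onstart = () => {
      this.readyState = 2;
      this.onprogress && this.onprogress({ results: [] });
    };

    recognition.onresult = ({ results }) => {
      const alternative = results[0] && results[0][0];

      this.ondictate && this.ondictate({ result: alternative });
    };

    recognition.onerror = event => this.onerror && this.onerror(event);

    recognition.onend = () => {
      this.readyState = 0;
      this.onend && this.onend();
    };
  }

  start() {
    this.readyState = 1;
    this._recognition.start();
  }

  stop() {
    this.readyState = 3;
    this._recognition.stop();
  }
}
```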

# Contributions

Like us? [Star](https://github.com/compulim/react-dictate-button/stargazers) us.

Want to make it better? [File](https://github.com/compulim/react-dictate-button/issues) us an issue.

Don't like something you see? [Submit](https://github.com/compulim/react-dictate-button/pulls) a pull request.